threefoldtech / tfchain_graphql

Graphql for TFchain
Apache License 2.0

Bug: Processor exits with AssertionError #181

Closed sameh-farouk closed 3 months ago

sameh-farouk commented 4 months ago

Deploying the TFChain squid on the QANet snapshot creator machine failed today because the processor exited with an error. @hossnys and @coesensbert reported it.

The deployment on that machine has been rolled back pending a fix by the development team.

logs:

{"level":5,"time":1717688117149,"ns":"sqd:processor","err":{"generatedMessage":true,"code":"ERR_ASSERTION","actual":false,"expected":true,"operator":"==","stack":"AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:\n\n  (0, assert_1.default)(typeof value == \"boolean\")\n\n    at decodePrimitive (/squid/node_modules/@subsquid/scale-codec/lib/codec-json.js:164:34)\n    at JsonCodec.decode (/squid/node_modules/@subsquid/scale-codec/lib/codec-json.js:24:24)\n    at JsonCodec.decodeStruct (/squid/node_modules/@subsquid/scale-codec/lib/codec-json.js:92:35)\n    at JsonCodec.decode (/squid/node_modules/@subsquid/scale-codec/lib/codec-json.js:36:29)\n    at Chain.decodeTuple (/squid/node_modules/@subsquid/substrate-processor/lib/chain.js:150:35)\n    at Chain.decode (/squid/node_modules/@subsquid/substrate-processor/lib/chain.js:138:25)\n    at Chain.decodeEvent (/squid/node_modules/@subsquid/substrate-processor/lib/chain.js:128:21)\n    at get asV101 [as asV101] (/squid/lib/types/events.js:759:28)\n    at farmUpdated (/squid/lib/mappings/farms.js:146:51)\n    at handleEvents (/squid/lib/processor.js:100:73)"}}
error Command failed with exit code 1.

sameh-farouk commented 4 months ago

Update: I left a test instance syncing with QANet, so far so good.

{"level":2,"time":1717720300228,"ns":"sqd:processor","msg":"816499 / 2037359, rate: 3335 blocks/sec, mapping: 44762 blocks/sec, 1566 items/sec, ingest: 3869 blocks/sec, eta: 7m"}

I will see if I can reproduce the issue.

sameh-farouk commented 4 months ago

Update: A note unrelated to the original issue: the restcountries.com API is unresponsive and times out when the init script tries to fetch the countries. This leaves the squid data without countries and cities data. A restart won't fix it. The processor would need to resync from 0, since this data must exist from the beginning. I would change the init_countires script to exit with a non-zero error code instead of 0, so that the processor does not start when fetching the countries or cities data fails.
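A minimal sketch of the proposed behaviour, assuming the init step can be expressed as a set of async loaders. The names `runInit`, `withTimeout`, and `Loader` are hypothetical and are not taken from the repository's actual init script; the sketch only shows how a failed or empty fetch could be turned into a non-zero exit code.

```typescript
// Hypothetical sketch: names and structure are assumptions, not the repo's actual init script.
type Loader = () => Promise<unknown[]>;

// Runs every loader in turn and returns a process exit code:
// 0 only if all loaders succeed and return at least one record, 1 otherwise.
async function runInit(loaders: Record<string, Loader>): Promise<number> {
  for (const [name, load] of Object.entries(loaders)) {
    try {
      const rows = await load();
      if (rows.length === 0) {
        console.error(`init: ${name} returned no data`);
        return 1;
      }
      console.log(`init: loaded ${rows.length} ${name}`);
    } catch (err) {
      console.error(`init: fetching ${name} failed:`, err);
      return 1;
    }
  }
  return 0;
}

// Rejects if the wrapped work does not settle within `ms` milliseconds,
// so an unresponsive API fails fast instead of hanging the init step.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

The script's entry point could then pass the result to `process.exit`, so that a container or compose dependency waiting on the init step sees the failure and the processor never starts against an incomplete database.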

sameh-farouk commented 4 months ago

Update: restcountries.com API is back now.

sameh-farouk commented 4 months ago

@coesensbert @hossnys My test instance is still going well.

{"level":2,"time":1717756541668,"ns":"sqd:processor","msg":"2445019 / 2445039, rate: 150 blocks/sec, mapping: 895 blocks/sec, 9 items/sec, ingest: 115 blocks/sec, eta: 0s"}

Here are my stack versions:

Screenshot_20240607_133533

Can you please confirm the processor image and tag that shows the error?

coesensbert commented 4 months ago

https://github.com/threefoldtech/grid_deployment/commit/f11a9f7026fa739d5137fbd01042bc45b9e4588c

sameh-farouk commented 4 months ago

Update: I reviewed the code in light of the error the operations team sent me, and I fixed a potential bug with this pull request: PR. However, I am still unable to reproduce the error. I gave the operations team a test image containing this fix to see whether it helps with the reported error. I also advised them to make sure they use the latest image and the latest typesBundle.json file. Another error occurs, but I cannot reproduce that one either.

MarioBassem commented 3 months ago

As @AhmedHanafy725 asked, I deployed a machine on devnet and ran the indexer/processor stack for QANet. The processor has now processed all blocks on QANet and is currently on block 10410560. It stopped only once, probably because of a connection issue, and it continued without issues after a restart.

22:17:35 FATAL sqd:processor GraphqlError: GraphQL error: pool timed out while waiting for an open connection
                                 at ArchiveClient.graphqlRequest (/root/tfchain_graphql/node_modules/@subsquid/util-internal-http-client/lib/client.js:270:19)
                                 at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
                                 at async /root/tfchain_graphql/node_modules/@subsquid/substrate-processor/lib/ingest.js:52:32
                                 messages: [{"message":"pool timed out while waiting for an open connection","locations":[{"line":5,"column":5}],"path":["batch"]}]
                                 archiveQuery: query {
                                                   status {
                                                       head
                                                   }
                                                   batch(fromBlock: 4846080, toBlock: 4846139, includeAllBlocks: true, events: [{name: "TFTPriceModule.AveragePriceStored", data: {event: {args: true}}}, {name: "SmartContractModule.ContractUpdated", data: {event: {args: true}}}, {name: "TfgridModule.FarmCertificationSet", data: {event: {args: true}}}, {name: "SmartContractModule.ContractCreated", data: {event: {args: true}}}, {name: "TfgridModule.FarmPayoutV2AddressRegistered", data: {event: {args: true}}}, {name: "TfgridModule.PowerStateChanged", data: {event: {args: true}}}, {name: "SmartContractModule.NodeExtraFeeSet", data: {event: {args: true}}}, {name: "TfgridModule.PowerTargetChanged", data: {event: {args: true}}}, {name: "TfgridModule.FarmDeleted", data: {event: {args: true}}}, {name: "TfgridModule.NodeCertificationSet", data: {event: {args: true}}}, {name: "SmartContractModule.ServiceContractFeesSet", data: {event: {args: true}}}, {name: "TfgridModule.NodePublicConfigStored", data: {event: {args: true}}}, {name: "TfgridModule.FarmUpdated", data: {event: {args: true}}}, {name: "TfgridModule.NodeUptimeReported", data: {event: {args: true}}}, {name: "TfgridModule.PricingPolicyStored", data: {event: {args: true}}}, {name: "TfgridModule.NodeDeleted", data: {event: {args: true}}}, {name: "TfgridModule.FarmStored", data: {event: {args: true}}}, {name: "TfgridModule.NodeUpdated", data: {event: {args: true}}}, {name: "SmartContractModule.ContractGracePeriodStarted", data: {event: {args: true}}}, {name: "TfgridModule.NodeStored", data: {event: {args: true}}}, {name: "TFTPriceModule.PriceStored", data: {event: {args: true}}}, {name: "Balances.Transfer", data: {event: {args: true}}}, {name: "SmartContractModule.RentContractCanceled", data: {event: {args: true}}}, {name: "SmartContractModule.ServiceContractCanceled", data: {event: {args: true}}}, {name: "TfgridModule.FarmingPolicyUpdated", data: {event: {args: true}}}, {name: "SmartContractModule.ServiceContractCreated", data: {event: {args: true}}}, {name: "TfgridModule.TwinUpdated", data: {event: {args: true}}}, {name: "SmartContractModule.SolutionProviderCreated", data: {event: {args: true}}}, {name: "SmartContractModule.ContractBilled", data: {event: {args: true}}}, {name: "SmartContractModule.UpdatedUsedResources", data: {event: {args: true}}}, {name: "SmartContractModule.SolutionProviderApproved", data: {event: {args: true}}}, {name: "SmartContractModule.NodeContractCanceled", data: {event: {args: true}}}, {name: "SmartContractModule.ServiceContractMetadataSet", data: {event: {args: true}}}, {name: "SmartContractModule.ServiceContractApproved", data: {event: {args: true}}}, {name: "SmartContractModule.NameContractCanceled", data: {event: {args: true}}}, {name: "SmartContractModule.NruConsumptionReportReceived", data: {event: {args: true}}}, {name: "TfgridModule.TwinStored", data: {event: {args: true}}}, {name: "SmartContractModule.ServiceContractBilled", data: {event: {args: true}}}, {name: "TfgridModule.FarmingPolicyStored", data: {event: {args: true}}}, {name: "SmartContractModule.ContractGracePeriodEnded", data: {event: {args: true}}}, {name: "TfgridModule.TwinDeleted", data: {event: {args: true}}}]) {
                                                       header {
                                                           id
                                                           height
                                                           hash
                                                           parentHash
                                                           timestamp
                                                           specId
                                                           stateRoot
                                                           extrinsicsRoot
                                                           validator
                                                       }
                                                       events
                                                       calls
                                                       extrinsics
                                                   }
                                               }

                                 batchRange: {"from":4846080}
                                 archiveHeight: 4846139
error Command failed with exit code 1.

And this is a screenshot of the current logs: Screenshot from 2024-06-12 11-21-26

MarioBassem commented 3 months ago

Also, as @sameh-farouk mentioned, the init_countires script didn't work either.

sameh-farouk commented 3 months ago

I'm closing this issue. Ops was using an incorrect typesBundle file, and the indexer DB needed to be reset; together, these led to the reported error.
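For readers wondering how a wrong types file can surface as `typeof value == "boolean"` failing, here is a simplified illustration. It is not the actual @subsquid/scale-codec code, and the event payload and field names (`id`, `enabled`) are hypothetical. It only shows the general mechanism: a decoder that walks a type definition will assert as soon as the stored data and the definition disagree about a field's type.

```typescript
// Simplified illustration only: NOT the actual @subsquid/scale-codec implementation.
type TypeDef =
  | { kind: "bool" }
  | { kind: "u32" }
  | { kind: "struct"; fields: Record<string, TypeDef> };

function assertThat(condition: boolean, message: string): asserts condition {
  if (!condition) throw new Error(`AssertionError: ${message}`);
}

// Decodes a JSON value by walking the type definition, field by field.
function decode(def: TypeDef, value: unknown): unknown {
  switch (def.kind) {
    case "bool":
      assertThat(typeof value === "boolean", 'typeof value == "boolean"');
      return value;
    case "u32":
      assertThat(typeof value === "number", 'typeof value == "number"');
      return value;
    case "struct": {
      assertThat(typeof value === "object" && value !== null, "value is an object");
      const record = value as Record<string, unknown>;
      const out: Record<string, unknown> = {};
      for (const [name, fieldDef] of Object.entries(def.fields)) {
        out[name] = decode(fieldDef, record[name]);
      }
      return out;
    }
  }
}

// Hypothetical stored event and two versions of its type definition.
const storedEvent = { id: 7, enabled: 1 };
const matchingDef: TypeDef = {
  kind: "struct",
  fields: { id: { kind: "u32" }, enabled: { kind: "u32" } },
};
const mismatchedDef: TypeDef = {
  kind: "struct",
  fields: { id: { kind: "u32" }, enabled: { kind: "bool" } },
};

decode(matchingDef, storedEvent); // decodes without error
// decode(mismatchedDef, storedEvent) throws: the "enabled" field fails the boolean assertion
```

Under this reading, data indexed with one set of type definitions and decoded with another fails deep inside the struct decoder, which is consistent with both the stack trace above and the need to reset the indexer DB after correcting the typesBundle file.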