graphprotocol / graph-node

Graph Node indexes data from blockchains such as Ethereum and serves it over GraphQL
https://thegraph.com
Apache License 2.0

v0.29.0 issues deploying new subgraphs #4242

Open trader-payne opened 1 year ago

trader-payne commented 1 year ago

Do you want to request a feature or report a bug? A bug

What is the current behavior? The indexer agent sends JSON-RPC commands to create new deployments. All the IPFS calls are successful, but the index-node doesn't proceed to sync those subgraphs.

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

  • Use graph-node v0.29.0
  • Deploy subgraphs using the indexer agent (offchain sync)

What is the expected behavior? The deployed subgraphs should be assigned to an index-node and should start syncing

Logs attached here:

trader-payne commented 1 year ago

Managed to narrow down the issue to this:

The issue can be reproduced with any existing subgraphs, but it works just fine to deploy new subgraphs.

EDIT: Looks like manually assigning subgraphs via JSON-RPC commands still works, e.g.:

http post index-node-0:8020 jsonrpc="2.0" id="1" method="subgraph_reassign" params:='{"ipfs_hash": "Qmadj8x9km1YEyKmRnJ6EkC2zpJZFCfTyTZpuqC3j6e1QH", "node_id": "index_node_0"}'
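For anyone who wants to script this workaround instead of running it by hand, here is a minimal Python sketch of the same `subgraph_reassign` call. The method name and the `ipfs_hash` / `node_id` parameters come from the command above; the helper names `build_reassign_request` and `send_reassign`, the `http://` scheme, and the assumption that the endpoint replies with a JSON body are illustrative guesses, not something confirmed in this thread.

```python
import json
import urllib.request


def build_reassign_request(ipfs_hash: str, node_id: str, request_id: str = "1") -> dict:
    """Build the JSON-RPC 2.0 payload for a subgraph_reassign call."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "subgraph_reassign",
        "params": {"ipfs_hash": ipfs_hash, "node_id": node_id},
    }


def send_reassign(admin_url: str, ipfs_hash: str, node_id: str, timeout: float = 10.0) -> dict:
    """POST the payload to the admin endpoint and return the decoded reply.

    Assumption: the endpoint answers with a JSON document.
    """
    body = json.dumps(build_reassign_request(ipfs_hash, node_id)).encode("utf-8")
    request = urllib.request.Request(
        admin_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would look like `send_reassign("http://index-node-0:8020", "Qmadj8x9km1YEyKmRnJ6EkC2zpJZFCfTyTZpuqC3j6e1QH", "index_node_0")`, with the host and port adjusted to your own setup.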

leoyvens commented 1 year ago

Hey @trader-payne, to confirm, this does not reproduce with 0.28.2?

trader-payne commented 1 year ago

Hey @leoyvens I just tried now, and it looks like the issue is present on 0.28.2 as well.

neysofu commented 1 year ago

Thanks for verifying @trader-payne. Given that it's not a regression, we'll most likely not fix this before releasing v0.29.0 to mainnet. We think it may be a race condition in the store, so if you wait a while before restarting the indexer agent and re-adding the deployment to the offchain sync list, the issue might go away by itself.

Keeping this open though to track it in the future.

trader-payne commented 1 year ago

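So I've got into a worse issue now:

  • offchain sync a subgraph
  • stop it (remove it from offchain sync list)
  • try to allocate against it
  • the indexer-agent allocates
  • the subgraph is never moved to a valid node-id and it stays on "removed"
  • thus the subgraph never syncs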

And just to confirm, restarting the indexer-agent or the index-node doesn't do anything

I think this needs to be fixed before we push Gnosis to mainnet. cc @fordN @PedroMD

tilacog commented 1 year ago

I've been able to reproduce the issue.

Steps taken

Outcome

Graph Node accepted the [re]deployment of the subgraph but didn’t assign it to a valid node_id, so it stays paused.

dev=# select * from subgraphs.subgraph_deployment_assignment;
-[ RECORD 1 ]-----------
node_id | default_paused
id      | 1

I'd say that this is not ideal from a UX point of view because it hides the fact that the deployment won't sync.
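Since nothing surfaces the paused state, one way to catch it is to scan the assignment rows yourself. The sketch below is only an illustration: it assumes rows shaped like the query output above (an `id` and a `node_id`), and the set of valid node IDs is something you supply for your own setup. The function name `find_stuck_assignments` is made up for this example.

```python
from typing import Iterable, List, Mapping


def find_stuck_assignments(
    rows: Iterable[Mapping[str, object]],
    valid_node_ids: Iterable[str],
) -> List[object]:
    """Return the ids of deployments whose node_id is not a real index node.

    Placeholder values such as "default_paused" or "removed" count as stuck
    because they are not in valid_node_ids.
    """
    valid = set(valid_node_ids)
    return [row["id"] for row in rows if row["node_id"] not in valid]


# Rows as they might be fetched from subgraphs.subgraph_deployment_assignment.
rows = [
    {"id": 1, "node_id": "default_paused"},
    {"id": 2, "node_id": "index_node_0"},
    {"id": 3, "node_id": "removed"},
]
stuck = find_stuck_assignments(rows, valid_node_ids={"index_node_0"})
```

Here `stuck` holds the deployments 1 and 3, which would then need to be reassigned manually.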

blocksteady commented 1 year ago

So I've got into a worse issue now:

  • offchain sync a subgraph
  • stop it (remove it from offchain sync list)
  • try to allocate against it
  • the indexer-agent allocates
  • the subgraph is never moved to a valid node-id and it stays on "removed"
  • thus the subgraph never syncs

And just to confirm, restarting the indexer-agent or the index-node doesn't do anything

I think this needs to be fixed before we push Gnosis to mainnet. cc @fordN @PedroMD

I agree with payne on this one. After running into this issue when allocating to a bunch of subgraphs, I had to go through each deployment and check that it was indexing. I found that a few weren't and had to manually assign them to an index node.

github-actions[bot] commented 1 year ago

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.