v0.29.0 issues deploying new subgraphs

trader-payne commented 1 year ago

Do you want to request a feature or report a bug? A bug

What is the current behavior? The indexer agent sends JSON-RPC commands to create new deployments. All the IPFS calls are successful, but the index-node doesn't proceed to sync those subgraphs.

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem. Use graph-node v0.29.0. Deploy subgraphs using the indexer agent (offchain sync).

What is the expected behavior? The deployed subgraphs should be assigned to an index-node and should start syncing.

Logs attached here:

trader-payne commented 1 year ago

Managed to narrow down the issue to this:

use indexer agent sha-2ae5fb6
use graphnode 0.29.0
deploy a new subgraph via indexer-agent offchain sync env var
let the subgraph sync for a few mins, then remove the subgraph from the indexer-agent offchain sync env var
restart the agent, then try to add that subgraph back
notice that the graphnode doesn't reassign the subgraph

The issue can be reproduced with any existing subgraph, but deploying new subgraphs works just fine.

EDIT: Looks like manually assigning subgraphs via JSON-RPC commands still works, e.g.: http post index-node-0:8020 jsonrpc="2.0" id="1" method="subgraph_reassign" params:='{"ipfs_hash": "Qmadj8x9km1YEyKmRnJ6EkC2zpJZFCfTyTZpuqC3j6e1QH", "node_id": "index_node_0"}'
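For anyone who wants to script the manual workaround above, here is a minimal sketch using only the Python standard library. It assumes the endpoint is reachable at http://index-node-0:8020 as in the command above; the helper names `build_reassign_request` and `send_json_rpc` are made up for illustration and are not part of graph-node or the indexer agent.

```python
import json
import urllib.request


def build_reassign_request(ipfs_hash, node_id, request_id="1"):
    """Build the same JSON-RPC body as the httpie command above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "subgraph_reassign",
        "params": {"ipfs_hash": ipfs_hash, "node_id": node_id},
    }


def send_json_rpc(url, payload, timeout=10.0):
    """POST a JSON-RPC payload and return the decoded JSON response.

    Hypothetical helper: the URL is whatever your index-node exposes.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_reassign_request(
    "Qmadj8x9km1YEyKmRnJ6EkC2zpJZFCfTyTZpuqC3j6e1QH", "index_node_0"
)
# To actually send it (assumed URL, taken from the command above):
# send_json_rpc("http://index-node-0:8020", payload)
```

Building the payload separately from sending it makes it easy to print and inspect the request before it touches a live node.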

leoyvens commented 1 year ago

Hey @trader-payne, to confirm, this does not reproduce with 0.28.2?

trader-payne commented 1 year ago

Hey @leoyvens I just tried now, and it looks like the issue is present on 0.28.2 as well.

neysofu commented 1 year ago

Thanks for verifying @trader-payne. Given that it's not a regression, we'll most likely not fix this before releasing v0.29.0 to mainnet. We think it may be a race condition in the store, so if you wait a while before restarting the indexer agent and re-adding the deployment to the offchain sync list, the issue might go away by itself.

Keeping this open though, to track it in the future.

trader-payne commented 1 year ago

So I've run into a worse issue now:

offchain sync a subgraph
stop it (remove it from offchain sync list)
try to allocate against it
the indexer-agent allocates
the subgraph is never moved to a valid node-id and it stays on "removed"
thus the subgraph never syncs

And just to confirm, restarting the indexer-agent or the index-node doesn't change anything.

I think this needs to be fixed before we push Gnosis to mainnet. cc @fordN @PedroMD

tilacog commented 1 year ago

I've been able to reproduce the issue.

Steps taken

start graph-node connected to Ethereum Mainnet and IPFS.

deploy the Epoch Subgraph using curl through graph-node's JSON-RPC endpoint

// create the subgraph name
{
  "jsonrpc": "2.0",
  "method": "subgraph_create",
  "params": {
    "name": "foo"
  },
  "id": "1"
}

// deploy the subgraph
{
  "jsonrpc": "2.0",
  "method": "subgraph_deploy",
  "params": {
    "name": "foo",
    "ipfs_hash": "QmQGFaipA7HekgnogJqQ7SrU2FuYxt1KH7jifqEkmjX5aA",
    "node_id": "default"
  },
  "id": "1"
}

reassign the subgraph to a non-existing node, effectively pausing it

{
  "jsonrpc": "2.0",
  "method": "subgraph_reassign",
  "params": {
    "ipfs_hash": "QmQGFaipA7HekgnogJqQ7SrU2FuYxt1KH7jifqEkmjX5aA",
    "node_id": "default_paused"
  },
  "id": "1"
}

deploy the subgraph again, omitting the node_id

{
  "jsonrpc": "2.0",
  "method": "subgraph_deploy",
  "params": {
    "name": "foo",
    "ipfs_hash": "QmQGFaipA7HekgnogJqQ7SrU2FuYxt1KH7jifqEkmjX5aA"
  },
  "id": "1"
}
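The four requests above can also be generated in order from a small script. This is only a sketch of the reproduction sequence: the function name `reproduction_requests` and its parameters are hypothetical, and it builds the payloads without sending them.

```python
import json


def reproduction_requests(name, ipfs_hash, node_id, paused_node_id):
    """Return the four JSON-RPC payloads from the steps above, in order."""

    def rpc(method, params):
        return {"jsonrpc": "2.0", "method": method, "params": params, "id": "1"}

    return [
        # 1. create the subgraph name
        rpc("subgraph_create", {"name": name}),
        # 2. deploy it to an existing node
        rpc("subgraph_deploy",
            {"name": name, "ipfs_hash": ipfs_hash, "node_id": node_id}),
        # 3. reassign to a non-existing node, effectively pausing it
        rpc("subgraph_reassign",
            {"ipfs_hash": ipfs_hash, "node_id": paused_node_id}),
        # 4. deploy again, this time omitting node_id
        rpc("subgraph_deploy", {"name": name, "ipfs_hash": ipfs_hash}),
    ]


requests_in_order = reproduction_requests(
    "foo",
    "QmQGFaipA7HekgnogJqQ7SrU2FuYxt1KH7jifqEkmjX5aA",
    "default",
    "default_paused",
)
for payload in requests_in_order:
    print(json.dumps(payload))
```

Each printed line can be piped to curl or any other HTTP client pointed at the graph-node JSON-RPC endpoint.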

Outcome

Graph Node accepted the [re]deployment of the subgraph but didn’t assign it to a valid node_id, so it stays paused.

dev=# select * from subgraphs.subgraph_deployment_assignment;
-[ RECORD 1 ]-----------
node_id | default_paused
id      | 1

I'd say that this is not ideal from a UX point of view because it hides the fact that the deployment won't sync.
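Until this is surfaced by graph-node itself, one way to spot such deployments is to compare the assignment rows against the node ids you know to be running. A minimal sketch, assuming the rows have already been fetched from `subgraphs.subgraph_deployment_assignment` as dictionaries and that the operator supplies the set of active node ids (the function name `find_stalled_assignments` is hypothetical):

```python
def find_stalled_assignments(rows, active_node_ids):
    """Return the assignment rows whose node_id is not an active index node.

    rows: iterable of mappings with "id" and "node_id" keys, mirroring the
          columns shown in the query output above.
    active_node_ids: node ids the operator knows are running (an assumption;
          graph-node does not provide this list in the output above).
    """
    active = set(active_node_ids)
    return [dict(row) for row in rows if row["node_id"] not in active]


rows = [{"id": 1, "node_id": "default_paused"}]
stalled = find_stalled_assignments(rows, {"default"})
for row in stalled:
    print(f"deployment {row['id']} is assigned to unknown node {row['node_id']!r}")
```

Any row reported here will not sync until it is reassigned to a valid node.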

blocksteady commented 1 year ago

> So I've got into a worse issue now:
>
> offchain sync a subgraph
> stop it (remove it from offchain sync list)
> try to allocate against it
> the indexer-agent allocates
> the subgraph is never moved to a valid node-id and it stays on "removed"
> thus the subgraph never syncs
>
> And just to confirm, restarting the indexer-agent or the index-node, doesn't do anything
>
> I think this needs to be fixed before we push Gnosis to mainnet. cc @fordN @PedroMD

I agree with payne on this one. After running into this issue when allocating to a bunch of subgraphs, I had to go through each deployment and check that it was indexing. I found that a few weren't, and I had to manually assign them to an index node.

github-actions[bot] commented 1 year ago

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

graphprotocol / graph-node

v0.29.0 issues deploying new subgraphs #4242