Closed maoueh closed 11 months ago
SessionInit { trace_id: "206f069bff5bbb58cfbee72f1c2e7a1c", resolved_start_block: 18280701, linear_handoff_block: 18286400, max_parallel_workers: 10 }
First blocks where we get data are: 18280694 18280695 18280699 18280709
He gets the issue at block 18280708
so assuming that his substreams doesn't see the data before.
Ran on pinax endpoint with: substreams gui https://github.com/streamingfast/substreams-uniswap-v3/releases/download/v0.2.7/substreams.spkg graph_out -e eth.substreams.pinax.network:9000 -s 18280000 -t +1000
and the output is the same as what we see on our endpoint
There was no cursor when starting the request, blocks received:
18280709
it fails.What if Payne rewinds up to block 18280690 ?
He rewound to 18280701
, I see that from the logs. Forgot to attach them:
2iSqZNcp.log
After further investigation, it appears the error is probably because the graph-node
db is "broken" in some ways. Indeed, this error happens because that most probably, the CREATE
entity change was not written correctly to the DB (corruption, WAL problem, etc.). When graph-node
arrives at block 18280709
, it tries to merge the entity UPDATE
is receives with the current entity. The Update
did not contain the periodStartUnix
because this field is present only on the CREATE
one.
The merged entity is then validated at validate which leads to an error because it sees that periodStartUnix
is not set properly.
note: Since there were other updates of that entity between its creation at block 18280694
and the error at block 18280709
that did not generate an issue, we can determine that substreams had correctly sent the CREATE to the graph-node, before the corruption event happened, and that his event deleted some data that predates the rewind target.
Rewinded to block 18270600
and it got me past the error, I'm up to chainhead now.
Curious why the error triggered in the first place though. Since it's non-deterministic too.
From David Lutter:
Hmm. Since TokenHourData gets updated throughout each hour, I wonder if it worked because you had to rewind past the hour where the problem happened. IIRC, substreams only sends the changes and since the timestamp doesn't change, it will only be sent when the hour starts. Hadn't thought of that before.
Hmm. Since TokenHourData gets updated throughout each hour, I wonder if it worked because you had to rewind past the hour where the problem happened. IIRC, substreams only sends the changes and since the timestamp doesn't change, it will only be sent when the hour starts. Hadn't thought of that before.
It is true that Substreams, when you reconnect at 18280701
would stream only the UPDATE
portion. But if the CREATE
was part of the database, no problem would have happen. For this error to show up, it means that the CREATE
was missed completely.
When you rewind let's say at 18280700
, it should rewind only up to this point, not before and theCREATE
from the Substreams should still be in the database.
Interesting. So if CREATE
was missed completely, would that be an issue on the graph-node side? Or the substreams side
Not clear what happen before the rewind, could be either one. Streaming Pinax endpoint seems to emit the correct flow of events from what we see, so seems unlikely that is was that, plus the fact that you needed to rewind
.
Here the QM hash:
QmQJovmQLigEwkMWGjMT8GbeS2gjDytqWCGL58BEhLu9Ag
.Payne: