streamingfast / substreams

Powerful Blockchain streaming data engine, based on StreamingFast Firehose technology.
Apache License 2.0

Data Nexus corrupted data #337

Closed Eduard-Voiculescu closed 11 months ago

Eduard-Voiculescu commented 1 year ago

Tooling

Endpoints

Content Creation / Tutorial / How to manage StreamingFast Substreams Stack

Debugging


Post-Mortem:

Data Nexus has merged block files that are corrupted:

2023-07-24 23:13:36    1.8 GiB 0014206400.dbin.zst

The real size should be

20.82 MiB  2023-03-01T18:44:21Z  0014206400.dbin.zst

The merged block file 14206400 contains the same block sequence multiple times; for example, block 14206400 appears more than once.

More precisely, the merged block file 14206400 contains these block sequences:

14206400 - 14206495
14197100 - 14206499
In total, the merged block file contains 9496 blocks.
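The block count can be checked from the two ranges above. The following is a minimal sketch, not StreamingFast tooling: the helper names are hypothetical, and the bundle size of 100 blocks per merged file is an assumption based on the file covering blocks 14206400 to 14206499.

```python
# Hypothetical sanity check for the ranges reported in this post-mortem.
# ASSUMPTION: a merged block file holds 100 consecutive blocks.
BUNDLE_SIZE = 100


def count_blocks(first: int, last: int) -> int:
    """Number of blocks in an inclusive range."""
    return last - first + 1


def blocks_outside_bundle(base: int, sequences, bundle_size: int = BUNDLE_SIZE) -> int:
    """Count blocks that fall outside [base, base + bundle_size - 1]."""
    lo, hi = base, base + bundle_size - 1
    outside = 0
    for first, last in sequences:
        # Overlap between this sequence and the expected bundle range.
        inside = max(0, min(last, hi) - max(first, lo) + 1)
        outside += count_blocks(first, last) - inside
    return outside


# The two sequences found inside 0014206400.dbin.zst.
sequences = [(14206400, 14206495), (14197100, 14206499)]

total = sum(count_blocks(first, last) for first, last in sequences)
outside = blocks_outside_bundle(14206400, sequences)

print(total)    # 96 + 9400 = 9496
print(outside)  # 9300 blocks do not belong in this bundle at all
```

Under that assumption, only 100 of the 9496 blocks belong in the file, which is consistent with the file being far larger than expected.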

Data Nexus has 2 Subgraph deployments of Uniswap v3. In their second Subgraph they encountered a panic:

ERRO Subgraph failed with non-deterministic error: failed to process trigger: Failed to process substreams block: Entity TokenHourData[0x35bd01fc9d6d5d81ca9e055db88dc49aa2c699a8-456908]: missing value for non-nullable field `periodStartUnix`, retry_delay_s: 133, attempt: 0, sgd: 369, subgraph_id: QmQJovmQLigEwkMWGjMT8GbeS2gjDytqWCGL58BEhLu9Ag, component: SubgraphInstanceManager

This led us to investigate the issue with them to better understand how they are running the Subgraph.

First deployment of the Uniswap v3 Subgraph: The first time the Substreams saw block 14206400, it generated an EntityChange of type Create and added 1 to the store_total_tx_count store.

Because the merged block file 14206400 contains block 14206400 twice, the second time the Substreams saw block 14206400 it generated an EntityChange of type Update and added another 1 to the store_total_tx_count store, bringing the value to 2. When the Substreams saw block 14206499, it flushed the store and the graph_out output to disk.

Second deployment of the Uniswap v3 Subgraph (pointing to the SAME filesystem and the SAME Substreams cache, but a DIFFERENT database): In this case, when the second Substreams reached block 14206400, it read the Update EntityChange from the graph_out map output file and panicked because the Entity did not exist in its database.
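The sequence of events can be reproduced with a toy model. This is a simplified sketch, not the Substreams or graph-node implementation: the function names, the dictionary-based cache keyed by block number (last write wins), and the single `total_tx_count` entity are all assumptions made for illustration.

```python
class MissingEntityError(Exception):
    """Raised when an Update targets an entity the database does not hold."""


def apply_change(db: dict, change: dict) -> None:
    """Apply one entity change to a database, rejecting orphan updates."""
    if change["op"] == "UPDATE" and change["id"] not in db:
        raise MissingEntityError(f"entity {change['id']} does not exist")
    db[change["id"]] = change["value"]


def first_deployment(blocks, db: dict, cache: dict) -> None:
    """Process blocks live, writing each block's output to the shared cache."""
    total = 0
    for num in blocks:
        total += 1  # simplified: add 1 each time a block is seen
        op = "UPDATE" if "total_tx_count" in db else "CREATE"
        change = {"op": op, "id": "total_tx_count", "value": total}
        apply_change(db, change)
        cache[num] = change  # ASSUMPTION: cached output per block, last write wins


def second_deployment(block_nums, db: dict, cache: dict) -> None:
    """Replay cached outputs into a different, initially empty database."""
    for num in block_nums:
        apply_change(db, cache[num])


shared_cache = {}
db_first, db_second = {}, {}

# The corrupted merged file delivers block 14206400 twice.
first_deployment([14206400, 14206400], db_first, shared_cache)

try:
    second_deployment([14206400], db_second, shared_cache)
    failed = False
except MissingEntityError:
    failed = True

print(db_first["total_tx_count"])        # 2: the block was counted twice
print(shared_cache[14206400]["op"])      # UPDATE: the Create was overwritten
print(failed)                            # True: the replay hits a missing entity
```

In this model the first deployment never fails, because its own database already holds the entity when the Update arrives. Only a consumer that starts from the cached output with a fresh database sees the error.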

matthewdarwin commented 12 months ago

Is this resolved now with firehose-ethereum v2.0.0-rc1?

Eduard-Voiculescu commented 11 months ago

Do you have any feedback for this @sduchesneau?

maoueh commented 11 months ago

I think it was fixed even before that. But yes, the latest versions are stricter on read anyway. Let's close.