sduchesneau closed this issue 1 year ago
This branch contains work in progress, but it does not yet address how the squasher should fill in the holes and then resume normally: https://github.com/streamingfast/substreams/tree/feature/missing-ranges
With the latest changes to feature/missing-ranges, if you delete, let's say, the file 0-200.kv and then run the following (production mode or not):
substreams run map_eth_stats --plaintext -e localhost:9000 -s 200 -t +1 --production-mode
This will work. BUT if you run it with a start block that is not at the exact height of the deleted kv file, it will create this kind of output locally:
0000000100-0000000000.54e06f8b1e0a145cf525ccf966e15bda.partial.zst
0000000100-0000000000.8a8a1fbabfe985ce9adad621dfa15da0.partial.zst
0000000100-0000000000.kv.zst
0000000200-0000000000.kv.zst
0000000200-0000000100.54e06f8b1e0a145cf525ccf966e15bda.partial.zst
0000000200-0000000100.8a8a1fbabfe985ce9adad621dfa15da0.partial.zst
0000000300-0000000000.kv.zst
0000000400-0000000000.kv.zst
0000000500-0000000000.kv.zst
0000000600-0000000000.kv.zst
0000000700-0000000000.kv.zst
0000000800-0000000000.kv.zst
0000000900-0000000000.kv.zst
0000001000-0000000000.kv.zst
To fix this, we would need to relaunch a squasher for that specific range, that is, introduce a concept of store squashers keyed by module AND by range...
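As a rough illustration of "squashers by module AND by range", here is a minimal Go sketch. It is not the substreams implementation: the types squasherKey, squasher and registry, the module name "example_store" and the shortened partial file names are all hypothetical. It only shows that keying on the pair (module, range) lets a hole get its own squasher without touching the one that handles the live range.

```go
package main

import "fmt"

// squasherKey identifies a squasher by store module AND block range.
// Hypothetical type; the real engine may model this differently.
type squasherKey struct {
	Module     string
	StartBlock uint64 // inclusive
	EndBlock   uint64 // exclusive
}

// squasher is a stand-in for whatever merges partial files into a full kv file.
type squasher struct {
	key      squasherKey
	partials []string // partial files handed to this squasher so far
}

// registry hands out exactly one squasher per (module, range) pair.
type registry struct {
	squashers map[squasherKey]*squasher
}

func newRegistry() *registry {
	return &registry{squashers: make(map[squasherKey]*squasher)}
}

// get returns the squasher for the given module and range, creating it on first use.
func (r *registry) get(module string, start, end uint64) *squasher {
	key := squasherKey{Module: module, StartBlock: start, EndBlock: end}
	if s, ok := r.squashers[key]; ok {
		return s
	}
	s := &squasher{key: key}
	r.squashers[key] = s
	return s
}

func main() {
	r := newRegistry()

	// The hole left by a deleted full kv file gets its own squasher...
	hole := r.get("example_store", 0, 200)
	hole.partials = append(hole.partials,
		"0000000100-0000000000.partial.zst", // illustrative names, hashes omitted
		"0000000200-0000000100.partial.zst",
	)

	// ...independent of the squasher for the live range of the same module.
	live := r.get("example_store", 1000, 1100)

	fmt.Println("distinct squashers:", hole != live, "count:", len(r.squashers))
}
```

With a single squasher per module, the partial files for the hole have nowhere to go once the module's squasher has moved past that range, which matches the leftover .partial.zst files in the listing above.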
This will be addressed by the rework: https://github.com/streamingfast/substreams/issues/226
This issue has been fixed with #226 and deployed to our production endpoint.
This happens when the backend cache is missing a full-kv file in its sequence (e.g. 10, 20, 40, where 30 is missing).
It is not a scenario that should happen during normal operations, but if someone is moving kv files around, this case is not handled well.
Here's an example log entry:
A workaround is to delete the cache, but it is inefficient.
Wanted behavior: the substreams engine should schedule the missing jobs and inform the "squasher" that it needs to process these missing file segments.
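To make the wanted behavior concrete, the following Go sketch shows one way the missing segments could be computed from a cache listing. It is an assumption-laden illustration, not the engine's code: the helpers parseFullKV and missingSegments and the type blockRange are hypothetical, the file names are assumed to follow the <end>-<start>.kv.zst pattern seen in the listing above, and the store is assumed to start at block 0 with one full kv file expected at every multiple of the save interval.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// blockRange is a half-open block range [Start, End). Hypothetical type.
type blockRange struct {
	Start, End uint64
}

// parseFullKV extracts the end block from a name such as
// "0000000200-0000000000.kv.zst". It returns ok=false for anything else,
// including partial files.
func parseFullKV(name string) (end uint64, ok bool) {
	base := strings.TrimSuffix(name, ".kv.zst")
	if base == name {
		return 0, false // suffix not present
	}
	parts := strings.SplitN(base, "-", 2)
	if len(parts) != 2 {
		return 0, false
	}
	end, err := strconv.ParseUint(parts[0], 10, 64)
	if err != nil {
		return 0, false
	}
	if _, err := strconv.ParseUint(parts[1], 10, 64); err != nil {
		return 0, false
	}
	return end, true
}

// missingSegments returns every segment [end-interval, end) whose full kv
// file is absent, up to the highest end block that is present.
func missingSegments(ends []uint64, interval uint64) []blockRange {
	if interval == 0 || len(ends) == 0 {
		return nil
	}
	present := make(map[uint64]bool, len(ends))
	var highest uint64
	for _, e := range ends {
		present[e] = true
		if e > highest {
			highest = e
		}
	}
	var missing []blockRange
	for end := interval; end <= highest; end += interval {
		if !present[end] {
			missing = append(missing, blockRange{Start: end - interval, End: end})
		}
	}
	return missing
}

func main() {
	// The "10, 20, 40" sequence from the issue, written as illustrative file names.
	files := []string{
		"0000000010-0000000000.kv.zst",
		"0000000020-0000000000.kv.zst",
		"0000000040-0000000000.kv.zst",
	}
	var ends []uint64
	for _, f := range files {
		if end, ok := parseFullKV(f); ok {
			ends = append(ends, end)
		}
	}
	for _, r := range missingSegments(ends, 10) {
		fmt.Printf("missing segment: [%d, %d)\n", r.Start, r.End)
	}
}
```

For the 10, 20, 40 example this reports the single segment [20, 30). Each reported segment is a job the engine would have to schedule, and the squasher would have to be told to expect it.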
How to reproduce locally
In one terminal run (in the firehose-ethereum devel folder):
In another terminal run:
Then in another terminal run any substreams you want.
Manually delete any full kv file.
Run your substreams again, but with a different boundary (production mode or not).