streamingfast / substreams-sink-files

Binary application to consume your Substreams and output its data to files (JSON, CSV, etc.)
Apache License 2.0

Files not written with larger `--file-block-count` #6

Open YaroShkvorets opened 1 year ago

YaroShkvorets commented 1 year ago

The sink process completes successfully, but in some cases no files are written at all, and in other cases a partially written .tmp file is left behind, e.g.

$ l ./localdata/out
total 12096
drwxr-xr-x  3 shkvo  staff    96B 31 Aug 09:30 .
drwxr-xr-x  4 shkvo  staff   128B 31 Aug 09:28 ..
-rw-r--r--  1 shkvo  staff   5.9M 31 Aug 09:30 0010000000-0010050000.jsonl.ONvfywFG.tmp

Works fine with the default file block count of 10000.

Might have something to do with flushing the buffered bundle (?)

Reproduce with 50_000 file block count:

substreams-sink-files run \
    mainnet.eth.streamingfast.io:443 \
    https://github.com/streamingfast/substreams-eth-token-transfers/releases/download/v0.4.0/substreams-eth-token-transfers-v0.4.0.spkg \
    jsonl_out \
    ./localdata/out \
    -c=50_000 \
    --encoder=lines \
    --file-working-dir="./localdata/working" \
    --state-store=./localdata/working/state.yaml \
    10_000_000:+50_000

YaroShkvorets commented 1 year ago

Basically, the sink doesn't flush the last active file.

It looks like the app shuts down before all the functions registered with OnTerminating(), which are supposed to do that flush, have finished running.