streamingfast / merger

Apache License 2.0

merger needs to read more than 2000 files (limit should be configurable) #14

Closed matthewdarwin closed 2 years ago

matthewdarwin commented 2 years ago

The merger stopped merging blocks on the Goerli testnet. The merger process is still running, but it is not doing anything, and one-block files are piling up in S3. The last few log lines before it stopped merging are:

2022-04-10T12:11:33.051Z DEBG (merger) adding one block file {"file_name": "0006690896-20220410T121111.0-f9a54127-8a9b46c0-6690696"}
2022-04-10T12:11:33.051Z DEBG (merger) adding one block file {"file_name": "0006690897-20220410T121126.0-10035e85-e7ccd3ca-6690697"}
2022-04-10T12:11:33.051Z DEBG (merger) setting lib value {"current_block_num": 6690897, "lib_num_candidate": 6690697}
2022-04-10T12:11:33.051Z INFO (merger) retrieved list of files {"too_old_files_count": 0, "added_files_count": 2000}

It reached the 2000-file limit, but it only got as far as block 6690897 (block 97 of the current bundle).

$ mc ls ceph/goerli-dfuse-one-blocks/00066906 | wc -l
631

$ mc ls ceph/goerli-dfuse-one-blocks/00066907 | wc -l
659

$ mc ls ceph/goerli-dfuse-one-blocks/00066908 | wc -l
729

Adding those up (631 + 659 + 729), it needs to read at least 2019 files. Note: there are 3 mindreaders running, so the minimum number of files is 3 mindreaders x 3 prefixes x 100 blocks = 900.
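For reference, the arithmetic above can be reproduced with a short Go sketch. It assumes that each listing prefix (such as `00066906`) covers 100 consecutive blocks, which is what the file names and `mc ls` prefixes suggest. `prefixFor` and `minFiles` are hypothetical helper names, not part of the merger.

```go
package main

import "fmt"

// prefixFor returns the 8-digit listing prefix that covers a block's
// 100-block range (assumption: file names start with a 10-digit,
// zero-padded block number, as in the log lines above).
func prefixFor(blockNum uint64) string {
	return fmt.Sprintf("%08d", blockNum/100)
}

// minFiles is a rough lower bound on the one-block files between the
// LIB candidate and the current block: every prefix in that span holds
// 100 blocks, and each producer (mindreader) writes one file per block.
func minFiles(libNum, currentNum uint64, producers int) int {
	prefixes := int(currentNum/100-libNum/100) + 1
	return prefixes * 100 * producers
}

func main() {
	lib, current := uint64(6690697), uint64(6690897) // values from the log above

	fmt.Println("first prefix:", prefixFor(lib))
	fmt.Println("last prefix: ", prefixFor(current))
	fmt.Println("lower bound: ", minFiles(lib, current, 3))
	fmt.Println("observed:    ", 631+659+729) // counts from the mc ls output
}
```

Both the lower bound and the observed count depend on how far the LIB lags the head, so a fixed limit of 2000 can fall short of a full bundle.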

Full log file available upon request.

sduchesneau commented 2 years ago

New code is deployed to develop and merged into the release branch of sf-ethereum. The limit of 2000 now applies only to new files that are in range of the bundle or above it, and files from multiple mindreaders are deduplicated. If a pass did not contain a full bundle, the next pass reads another 2000 new files. The logs have also been improved to show progress better. It should behave as expected in all situations!
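The behavior described above can be illustrated with a minimal Go sketch. This is not the actual merger code. `selectFiles`, `canonicalKey`, and the `-mrN` producer suffix in the sample names are assumptions made for illustration, and the dedupe key (block number plus block ID) is a guess at what identifies the same block written by several mindreaders.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxNewFilesPerPass is the limit discussed in this issue.
const maxNewFilesPerPass = 2000

// canonicalKey builds a dedupe key from the block number and block ID.
// Assumed name layout: <num>-<time>-<id>-<previous_id>-<lib_num>[-<suffix>].
func canonicalKey(name string) string {
	parts := strings.Split(name, "-")
	if len(parts) < 3 {
		return name
	}
	return parts[0] + "-" + parts[2]
}

// blockNum parses the zero-padded block number at the start of the name.
func blockNum(name string) (uint64, error) {
	return strconv.ParseUint(strings.SplitN(name, "-", 2)[0], 10, 64)
}

// selectFiles walks a sorted listing and returns up to limit new files.
// Files below the bundle's lower boundary and duplicates of blocks
// already seen do not count against the limit.
func selectFiles(listing []string, bundleLow uint64, seen map[string]bool, limit int) (added []string, tooOld int) {
	for _, name := range listing {
		num, err := blockNum(name)
		if err != nil {
			continue // skip names that do not match the assumed layout
		}
		if num < bundleLow {
			tooOld++
			continue
		}
		key := canonicalKey(name)
		if seen[key] {
			continue
		}
		seen[key] = true
		added = append(added, name)
		if len(added) >= limit {
			break
		}
	}
	return added, tooOld
}

func main() {
	// Hypothetical listing: one stale file, then two blocks written by
	// several producers.
	listing := []string{
		"0000000099-t-aa-zz-1-mr1",
		"0000000100-t-bb-aa-1-mr1",
		"0000000100-t-bb-aa-1-mr2",
		"0000000100-t-bb-aa-1-mr3",
		"0000000101-t-cc-bb-1-mr1",
		"0000000101-t-cc-bb-1-mr2",
	}
	seen := map[string]bool{}

	// A limit of 1 stands in for maxNewFilesPerPass to keep the demo small.
	added, tooOld := selectFiles(listing, 100, seen, 1)
	fmt.Println("pass 1:", added, "too old:", tooOld)

	// The bundle is not complete yet, so the next pass continues with
	// files that have not been seen.
	added, _ = selectFiles(listing, 100, seen, 1)
	fmt.Println("pass 2:", added)
}
```

Because `seen` persists between passes, duplicate and stale files no longer use up the budget, so each pass makes progress toward a full bundle.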

matthewdarwin commented 2 years ago

Looks good. The system caught up quickly and removed all the old files too.
