streamingfast / merger


Hash collision in merger #18

Closed matthewdarwin closed 2 years ago

matthewdarwin commented 2 years ago

My Ethereum mainnet merger was stuck merging blocks.

2022-07-11T04:14:37.946Z INFO (merger) retrieved list of files {"seen_files_count": 1546, "too_old_files_count": 0, "added_files_count": 2, "highest_linkable_block_file": 15118979, "highest_seen_block_file": 15119138}
2022-07-11T04:14:37.957Z INFO (merger) bundle not completed after retrieving one block file {"bundle": {"bundle_size": 100, "inclusive_lower_block_num": 15118900, "exclusive_highest_block_limit": 15119000, "highest_linkable_block_num": 15118979, "highest_linkable_block_id": "7cfa3aae", "lib_num": 15118938, "lib_id": "f337b540", "longest_chain_length": 380}}

After discussion with @sduchesneau, it seems that there is a collision in the short block IDs that the merger uses. The one-block files use a very short 4-byte identifier for the block hash. The collision must happen within a range of ~300 blocks to affect the merger.

To make this problem less frequent (or avoid it in practice), the short ID needs to be longer. A 4-byte ID spans only 2^32 values, so the space should be closer to 2^64 or so.

sduchesneau commented 2 years ago

This will be fixed as part of https://github.com/streamingfast/bstream/issues/22

We can remove the BlockTime (~12 characters) and increase the blockID section by ~4-6 more bytes.
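A minimal sketch of what a configurable-length short ID could look like, in Go. The helper name `shortID` is hypothetical, and keeping the trailing characters of the hash is an assumption: this thread does not say which part of the hash the real code keeps.

```go
package main

import (
	"fmt"
	"strings"
)

// shortID returns the last nBytes bytes (2*nBytes hex characters) of a
// hex-encoded block hash. Hypothetical helper: keeping the suffix rather
// than the prefix is an assumption made for this sketch.
func shortID(hash string, nBytes int) string {
	h := strings.TrimPrefix(strings.ToLower(hash), "0x")
	n := 2 * nBytes
	if n <= 0 {
		return ""
	}
	if len(h) <= n {
		return h // hash is already shorter than the requested length
	}
	return h[len(h)-n:]
}

func main() {
	// Made-up hash value, for illustration only.
	hash := "0xaabbccddeeff00112233"
	fmt.Println(shortID(hash, 4)) // 4 bytes, as in the current file names
	fmt.Println(shortID(hash, 8)) // 8 bytes, i.e. 4 more bytes
}
```

Going from 4 to 8 bytes adds 8 characters to the ID, which dropping the ~12-character BlockTime would more than offset.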