Open teor2345 opened 1 year ago
Due to this block verification timeout bug, full syncs are currently taking 40-48 hours. That's still acceptable performance, I think?
Is this still happening? if so, when was the last time we saw it?
Is this still happening? if so, when was the last time we saw it?
Yes, it happens multiple times in every full sync, this sync is running right now: https://github.com/ZcashFoundation/zebra/actions/runs/4931890966/jobs/8824296617#step:8:8745
In the last completed full sync, it happened at least 200 times, which causes a delay of 3-5 hours in 50 hour sync: https://github.com/ZcashFoundation/zebra/actions/runs/4888643906/jobs/8769200765#step:8:17512
@ZcashFoundation/zebra-team has anyone noticed this happening recently? I'm tempted to close
Yes, we still have this bug. I got a few of the errors while running an unsynced instance locally:
2024-03-13T13:23:23.783618Z WARN {zebrad="58bfe97" net="Main"}:sync:try_to_sync:try_to_sync_once{extra_hashes={}}: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1941142), hash: block::Hash("0000000000aca7473124dd16884dfe35ee602ba7f9dda66bc2633563c5db3a4e") }
2024-03-13T13:34:23.516921Z WARN {zebrad="58bfe97" net="Main"}:sync:try_to_sync:try_to_sync_once{extra_hashes={}}: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1942686), hash: block::Hash("0000000000ffb91722514ad09b493355add5b8f5559600b7dd7c7b715cdc80a2") }
2024-03-13T13:44:11.547760Z WARN {zebrad="58bfe97" net="Main"}:sync:try_to_sync:try_to_sync_once{extra_hashes={}}: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1944827), hash: block::Hash("0000000001b3c62ba33f778464b308a0435f25a2afa995d1bbbfa553e29c9878") }
2024-03-13T15:31:57.838278Z WARN {zebrad="58bfe97" net="Main"}:sync:try_to_sync:try_to_sync_once{extra_hashes={}}: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1977421), hash: block::Hash("00000000016daccf7867a7b7ad930e118d5f5cd2b13612eea2df038c7aa9747b") }
2024-03-13T15:41:43.896349Z WARN {zebrad="58bfe97" net="Main"}:sync:try_to_sync:try_to_sync_once{extra_hashes={}}: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1978958), hash: block::Hash("00000000006886ebe06217ded4b4718af82549afab6a1a66354c34cec93882be") }
2024-03-13T15:51:26.100828Z WARN {zebrad="58bfe97" net="Main"}:sync:try_to_sync:try_to_sync_once{extra_hashes={}}: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1980798), hash: block::Hash("00000000016422ba324158bd0ed9597dbcdec3a35ee94f08968a92b2440e1b9a") }
2024-03-13T16:01:18.321517Z WARN {zebrad="58bfe97" net="Main"}:sync:try_to_sync:try_to_sync_once{extra_hashes={}}: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1982826), hash: block::Hash("0000000000db2bf60283410f90846223e981d3a4c0d5d9f656804be36067d163") }
The timing is interesting and doesn't seem random. It looks like every few hours, we get a series of error logs that are ten minutes apart.
Motivation
During the initial sync, Zebra's block validation repeatedly times out, and the syncer resets:
This only happens in some parts of the chain, and it seems to happen at slightly different block heights each time. When it does happen, it can happen repeatedly every 10-15 minutes: https://github.com/ZcashFoundation/zebra/actions/runs/3519147385/jobs/5923785413#step:8:12465
This is not a priority because Zebra is still much faster than
zcashd
.Designs
This could be happening because:
Related Work
If we speed up block commits, it might solve this issue:
5425