algorand / conduit

Algorand's data pipeline framework.
MIT License
37 stars 26 forks source link

Algod Importer: Longer Timeout #133

Closed tzaffi closed 1 year ago

tzaffi commented 1 year ago

Change waitForRoundTimeout to 30 from 5

  1. changing the amount of time to wait for the next round
  2. adding more context to the SyncError type

Explanation for the timeout change

During a recent EC2 import test, I noticed that after catching up, around 1% of imports are reporting an error/warning:

{
"__type":"importer",
"_name":"algod",
"level":"warning",
"msg":"Sync error detected, attempting to set the sync round to recover the node: wrong round returned from status for round: 0 != 31117945",
"time":"2023-08-07T15:31:16.166545875Z"
}
{
"__type":"Conduit",
"_name":"main",
"level":"error",
"msg":"wrong round returned from status for round: 0 != 31117945",
"time":"2023-08-07T15:31:16.166783576Z"
}

The pipelining branch, which added some more information to the SyncError errored similarly with the warning msg:

importer algod.GetBlock() sync error detected, attempting to set the sync round to recover the node: wrong round returned from status for round: 31112472 != 31112473: status2.LastRound mismatch: context deadline exceeded

So we can see that we're getting a timeout error. This is to be expected on occasion as when calling algod's StatusAfterBlock(), a context is provided with a 5 sec timeout.

Having carried out some statistical analysis on the last 1 million rounds, I got the following results:

mean_duration stdev_duration min_duration max_duration 0-4 5-9 10-14 15-19 20-24
3.400925599 0.507927656 0 22 999803 148 0 0 50

So if we re-set waitForRoundTimeout = 30, we're likely to encounter 0 such timeouts, or at most a small number.

On the other hand, algod's own timeout is 60 secs so we are steering quite far from the upper bound.

codecov[bot] commented 1 year ago

Codecov Report

Merging #133 (b9fd399) into master (442791a) will increase coverage by 2.86%. Report is 48 commits behind head on master. The diff coverage is 77.77%.

@@            Coverage Diff             @@
##           master     #133      +/-   ##
==========================================
+ Coverage   67.66%   70.53%   +2.86%     
==========================================
  Files          32       36       +4     
  Lines        1976     2545     +569     
==========================================
+ Hits         1337     1795     +458     
- Misses        570      653      +83     
- Partials       69       97      +28     
Files Changed Coverage Δ
conduit/data/block_export_data.go 100.00% <ø> (+92.30%) :arrow_up:
conduit/metrics/metrics.go 100.00% <ø> (ø)
conduit/pipeline/metadata.go 69.11% <ø> (ø)
conduit/plugins/config.go 100.00% <ø> (ø)
...duit/plugins/exporters/filewriter/file_exporter.go 81.63% <ø> (-1.06%) :arrow_down:
conduit/plugins/importers/algod/metrics.go 100.00% <ø> (ø)
...gins/processors/filterprocessor/fields/searcher.go 77.50% <ø> (ø)
...ins/processors/filterprocessor/filter_processor.go 83.82% <ø> (+3.54%) :arrow_up:
...plugins/processors/filterprocessor/gen/generate.go 34.28% <ø> (ø)
conduit/plugins/processors/noop/noop_processor.go 64.70% <ø> (+6.81%) :arrow_up:
... and 19 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more