During `poll`, expired timers and pending network events are collected into `loop.callbacks`: the callback functions waiting for those events are scheduled and then executed (`processCallbacks()` in `asyncloop.nim`). When such a callback then proceeds to call `waitFor`, it re-enters `poll` while `processCallbacks()` is still ongoing.
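A minimal sketch of that reentrancy pattern (hypothetical code using only the public chronos API, not taken from nimbus-eth2):

```nim
import chronos

proc background() {.async.} =
  await sleepAsync(10.milliseconds)

proc handler() {.async.} =
  await sleepAsync(1.milliseconds)
  # The continuation after the await is resumed from processCallbacks();
  # calling waitFor here re-enters poll() while the current callback batch
  # is still being drained, which is the reentrancy described above.
  waitFor background()

waitFor handler()
```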
When this reentrancy occurs, `loop.callbacks` may still contain items. If those items correspond to pending network events that have not yet been processed, the recursive `poll` call re-discovers that the events are still pending (`loop.selector.selectInto`). This leads to the same network event callbacks being scheduled multiple times.
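A self-contained toy model of the double-scheduling mechanism (hypothetical code, not chronos; the names `poll`, `onReadable`, and `eventStillPending` are invented for illustration):

```nim
import std/deques

type Callback = proc () {.closure.}

var
  callbacks = initDeque[Callback]()
  eventStillPending = true    # models an OS-level event that has not been consumed yet
  depth = 0

proc onReadable()             # forward declaration: poll and onReadable are mutually recursive

proc poll() =
  inc depth
  if eventStillPending:
    # selectInto-equivalent: the event is still reported as pending,
    # so its callback is scheduled (again).
    callbacks.addLast(onReadable)
  # processCallbacks-equivalent: drain the queue.
  while callbacks.len > 0:
    let cb = callbacks.popFirst()
    cb()
  dec depth

proc onReadable() =
  echo "onReadable invoked at poll depth ", depth
  if depth == 1:
    # waitFor-equivalent: the callback re-enters poll() before the event
    # has been consumed, so the nested poll schedules onReadable a second time.
    poll()
  eventStillPending = false   # the event is only consumed when the callback finishes

poll()   # prints "onReadable invoked" twice: once at depth 1 and once at depth 2
```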
Certain network event callbacks do not operate properly when the underlying network event reflects a condition that is no longer accurate. For example, in `transports/stream.nim`, the function `readStreamLoop` expects to be called only when the underlying `transp.buffer` has space available, and crashes with an `IndexError` when it is called while that buffer is already full (because `addr transp.buffer[transp.offset]` with `transp.offset == transp.buffer.len` is invalid).
vendor/nim-chronos/chronos/asyncloop.nim(1124) waitFor
vendor/nim-chronos/chronos/asyncloop.nim(286) poll
vendor/nim-chronos/chronos/transports/stream.nim(1376) readStreamLoop
vendor/nimbus-build-system/vendor/Nim/lib/system/chcks.nim(23) raiseIndexError2
vendor/nimbus-build-system/vendor/Nim/lib/system/fatal.nim(50) sysFatal
Unhandled exception: index 4096 not in 0 .. 4095 [IndexError]
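The failing expression can be reproduced in isolation (a standalone snippet, not chronos code): indexing one element past the end of a full buffer trips the bounds check exactly as in the trace above.

```nim
# Standalone illustration of the failing expression in readStreamLoop:
# addr transp.buffer[transp.offset] with transp.offset == transp.buffer.len.
var buffer = newSeq[byte](4096)   # a buffer that is already full
let offset = buffer.len           # 4096: one past the last valid index
# With bounds checking enabled this raises: index 4096 not in 0 .. 4095
discard addr buffer[offset]
```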
This condition can be triggered, for example, by `asyncTest` in network-heavy tests that operate on large amounts of data. It occurs intermittently on CI for `nimbus-eth2`.
macOS is affected for sure (probably Unix in general). I have not reviewed the Windows implementation.