status-im / nim-chronos

Chronos - An efficient library for asynchronous programming
https://status-im.github.io/nim-chronos/
Apache License 2.0
362 stars 51 forks source link

connection dropped due to `EAGAIN` in `readStreamLoop` #314

Closed arnetheduck closed 2 years ago

arnetheduck commented 2 years ago

with some regularity, when using libp2p to connect to go-libp2p, I can lose the connection due to an EAGAIN in the read stream loop - this would indicate that there's a race condition because we should only be reading when there's data:

Breakpoint 3, osErrorMsg__pureZos_358 (errorCode=errorCode@entry=11) at /home/arnetheduck/status/mite/vendor/nimbus-build-system/vendor/Nim/lib/pure/includes/oserr.nim:18
18  proc osErrorMsg*(errorCode: OSErrorCode): string =
(gdb) bt full
#0  osErrorMsg__pureZos_358 (errorCode=errorCode@entry=11) at vendor/nimbus-build-system/vendor/Nim/lib/pure/includes/oserr.nim:18
        result = <optimized out>
        FR_ = {prev = 0xc, procname = 0x820f08 "vendor/nimbus-build-system/vendor/Nim/lib/system/dollars.nim", line = 393216, filename = 0x7ffff7c67da0 "\003", len = -13984, calldepth = -1, 
          frameMsgLen = 140737350357408}
#1  0x00000000004fce3a in readStreamLoop__OOZvendorZnim45chronosZchronosZtransportsZstream_1109 (udata=0x7ffff7c64050) at vendor/nim-chronos/chronos/transports/common.nim:570
        msgX60gensym68_ = <optimized out>
        T43_ = 0x0
        T45_ = 0x0
        treX60gensym68_ = <optimized out>
        T47_ = <optimized out>
        T44_ = 0x7ffff7c651a0
        T46_ = <optimized out>
        TM__2I7Lp9b9cDBRsWiKyn0XigmQ_29 = {prev = 0x7fffffffc9e0, status = 8629082, context = {{__jmpbuf = {32, 8528944, 786432, 8522840, 140737350554368, 9515128, 0, 4550924}, __mask_was_saved = -13776, __saved_mask = {__val = {
                  8527581, 169, 8522840, 140733194108928, 8526977, 179, 140737350554368, 140737350384872, 5065521, 140737488341616, 8522648, 243, 8522840, 140733194043392, 140737350554368, 140737350554368}}}}}
        err = 11
        res = <optimized out>
        T25_ = <optimized out>
        TM__2I7Lp9b9cDBRsWiKyn0XigmQ_24 = <optimized out>
        transp = 0x7ffff7c64050
        fd = 5
        _nimCurFrame = 0x7fffffffced0
        FR_ = {prev = 0x7fffffffcf60, procname = 0x82a0df "readStreamLoop", line = 570, filename = 0x829ca8 "vendor/nim-chronos/chronos/transports/common.nim", len = 0, calldepth = 5, frameMsgLen = 8561393}
        LA65_ = <optimized out>
        LA118_ = <optimized out>
#2  0x00000000004ec72a in poll__OOZvendorZnim45chronosZchronosZasyncloop_1742 () at vendor/nim-chronos/chronos/asyncloop.nim:288
        callableX60gensym68_ = {function = {ClP_0 = 0x4fc851 <readStreamLoop__OOZvendorZnim45chronosZchronosZtransportsZstream_1109>, ClE_0 = 0x0}, udata = 0x7ffff7c64050}
        TM__xRflM5lMv4YvZq7gOUBMew_311 = <optimized out>
        iX60gensym68_ = 2
        i_3 = 2
        loop = 0x7ffff7c6d050
        curTime = {value = 1}
        curTimeout = <optimized out>
        customSet = 60
        lastFinishX60gensym49_ = <optimized out>
        count_1 = 1
        curTimeX60gensym62_ = <optimized out>
        countX60gensym68_ = 3
        FR_ = {prev = 0x7fffffffd560, procname = 0x82856b "poll", line = 288, filename = 0x828758 "vendor/nim-chronos/chronos/asyncloop.nim", len = 0, calldepth = 4, frameMsgLen = 8912384}
arnetheduck commented 2 years ago

746832384a036dc159d3bd559c53d1594e7175c4

cheatfate commented 2 years ago

Run it with chronosFutureTracking and call dumpPendingFutures on error please.

arnetheduck commented 2 years ago

found the issue - my mistake, a nested poll call :/