Closed michaelquigley closed 4 years ago
This may actually be a transmitter-side issue:
[ 90.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 21.3 MB/sec [:5028, 0 pending]
[ 91.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 21.1 MB/sec [:4982, 0 pending]
[ 92.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 18.9 MB/sec [:4472, 0 pending]
[ 93.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 11.7 MB/sec [:2761, 0 pending]
[ 94.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 0 B/sec [:-1, 0 pending]
[ 95.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 0 B/sec [:-1, 0 pending]
[ 96.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 0 B/sec [:-1, 0 pending]
[ 97.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 0 B/sec [:-1, 0 pending]
[ 98.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 0 B/sec [:-1, 0 pending]
[ 99.865] INFO dilithium/protocol/loop.(*transferReporter).run: [tx] 0 B/sec [:-1, 0 pending]
Appears to be related to a large, out-of-date rxPortalSz value in the txPortal. And, when there is nothing to be monitored for retx, we will not provoke any ACK traffic, which leads to the stalled condition.
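To make the suspected condition concrete, here is a minimal sketch of it in Go. This is not the dilithium code: the txPortal struct, its fields (capacity, inFlight) and the methods available, stalled and onAck are hypothetical names invented for illustration. The assumption is that the sender only learns a fresh rxPortalSz from ACKs, and that ACKs only arrive while something is in flight.

```go
package main

// txPortal is a hypothetical, simplified model of the sender-side window.
// It is NOT the actual dilithium type; field names are assumptions.
type txPortal struct {
	capacity   int // bytes the sender assumes the receiver can buffer in total
	rxPortalSz int // receiver buffer occupancy, as last reported by an ACK
	inFlight   int // bytes sent but not yet acknowledged (monitored for retx)
}

// available returns how many bytes the sender believes it may still send.
func (p *txPortal) available() int {
	n := p.capacity - p.rxPortalSz - p.inFlight
	if n < 0 {
		return 0
	}
	return n
}

// stalled reports the suspected deadlock: the (possibly stale) rxPortalSz
// leaves no room to send, and nothing is in flight, so no retransmission
// and no ACK will ever arrive to refresh rxPortalSz.
func (p *txPortal) stalled() bool {
	return p.available() == 0 && p.inFlight == 0
}

// onAck applies an acknowledgement carrying a fresh rxPortalSz.
func (p *txPortal) onAck(acked, rxPortalSz int) {
	p.inFlight -= acked
	if p.inFlight < 0 {
		p.inFlight = 0
	}
	p.rxPortalSz = rxPortalSz
}
```

In this model, a portal with capacity 100, rxPortalSz 100 and nothing in flight reports stalled() as true, and stays that way until something delivers a newer rxPortalSz via onAck. If this matches the real behavior, one possible way out would be for an idle transmitter to send some kind of probe or keepalive that provokes an ACK, but that is only a guess at a fix, not something confirmed here.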
Determine the root cause of the tx/rx stall.
Now that the rxPortalSz signalling is in place and the flow control appears to be in a stable place, tuning-wise... We need to figure out the root cause of these stalls: