Open GoogleCodeExporter opened 8 years ago
I ran into this same assertion in 0.16.12 and have spent some time looking for
the cause. I'm not certain I've found it but I did spot something suspicious.
The asserted condition is (p->num_transmissions < m_sm->num_resends() + 1).
The num_resends() function just returns a session_level setting so it can be
assumed to be constant. Adding one accounts for the number of possible
fast_resends made for the packet. Thus the only way the condition can fail is
if p->num_transmissions exceeds some constant.
num_transmissions is only incremented in one place: resend_packet(), line 1770.
It represents the number of times the packet has been sent or re-sent.
resend_packet() returns a boolean to indicate whether the packet was indeed
re-sent. The function has an early exit at line 1760 that returns false. After
that, the assertion is checked at line 1765 and num_transmissions is
incremented at line 1770. The packet is sent at line 1786 and the error code
from that operation is checked at line 1801: if send_packet() produced an
error, resend_packet() returns false at line 1806; otherwise it returns true
at line 1809.
The suspicious bit is that parse_sack() uses the return value of
resend_packet() at line 1404 to break out of a loop that increments
m_fast_resend_seq_nr after each iteration. Suppose that the send_packet() call
in line 1786 of resend_packet() produces an error; then resend_packet() will
have incremented num_transmissions but returned false, and parse_sack() will
break out of the loop without incrementing m_fast_resend_seq_nr. This will
cause the packet to still be in the fast_resend range but with an increased
num_transmissions. If there are several fast_resend attempts in a row with
failed send_packet() calls, then the limit could be exceeded.
Like I said, I'm not confident this is the actual cause but it might be worth a
close look. If it is the cause, a possible fix would be to decrement
num_transmissions within the error code handling block within resend_packet(),
to reflect that the packet was in fact not successfully re-sent.
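As a rough sketch of what that change might look like, again using a
simplified model rather than the real resend_packet(): the function name
resend_packet_with_rollback(), its parameters and the send_ok flag are
hypothetical, and the starting num_transmissions of 1 is an assumption.

```cpp
#include <cassert>

// Simplified stand-in; NOT libtorrent's real packet type.
struct packet {
    int num_transmissions = 1; // assumption: the initial send counts as one
};

// Hypothetical variant of the resend logic. send_ok stands in for
// send_packet() completing without an error.
bool resend_packet_with_rollback(packet& p, int num_resends, bool send_ok) {
    // Stand-in for the asserted condition: refuse instead of asserting.
    if (!(p.num_transmissions < num_resends + 1)) return false;

    ++p.num_transmissions;
    if (!send_ok) {
        // Proposed fix: undo the increment, since nothing was re-sent.
        --p.num_transmissions;
        return false;
    }
    return true;
}
```

With the rollback in place, repeated failed sends leave the counter where it
was in this model. Whether the same holds in the real code path depends on
the diagnosis above being correct.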
Original comment by alan...@gmail.com
on 11 Nov 2013 at 6:33
I should add that I've had difficulty reproducing the assertion for debugging
purposes. It's one of those that's rare enough to be hard to replicate, but
frequent enough to be an impediment.
Original comment by alan...@gmail.com
on 11 Nov 2013 at 6:44
Update: I have been able to reproduce the assertion although it takes many
hours of running libtorrent for it to occur. Unfortunately, my suggested fix
(to decrement num_transmissions when the send_packet() call in resend_packet()
fails) did not prevent the assertion from occurring.
Original comment by alan...@gmail.com
on 15 Nov 2013 at 11:49
Thank you very much for your effort. For now I have just commented out the
assertion. Will running without it cause new problems?
Original comment by ygao....@gmail.com
on 16 Nov 2013 at 3:07
Original issue reported on code.google.com by
ygao....@gmail.com
on 8 Nov 2013 at 8:36