cookyshi / libjingle

Automatically exported from code.google.com/p/libjingle
0 stars 0 forks source link

Some packet loss patterns can cause data corruption at pseudo tcp startup. #263

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
This is a pretty subtle one. This isn't going to be the greatest bug report, 
but at least it may point others who are having this problem in the right 
direction.

What steps will reproduce the problem?

Run a pseudo tcp session with packet loss.

The specific pattern of loss that seems to cause this problem is if a data 
block arrives before the initial control block.

What is the expected output? What do you see instead?

Sometimes one of the first blocks of data will be offset by 4 bytes.

What is going on.
Pseudo TCP puts both control and data packets in its receive buffer. When a 
control packet arrives, pseudotcp changes the offset at which it will write 
future data in the buffer, but does not remove the control packet from the 
buffer. This is fine if packets arrive in order: subsequent data simply 
overwrites the control data. However, if the data arrives before the control 
packet, its offset in the buffer becomes incorrect.

Fixing the problem.
(Sorry, no patch, I'm working on a pretty hacked up version of pseudo TCP)

The following fix works because in the current implementation, all control 
packets have lower sequence numbers than all data packets. If that was no 
longer to be the case, I believe that this fix would fail if there was real 
data in the buffer ahead of the control data.

In PseudoTcp::process, replace:
   if (seg.len > 0) {
     if (bIgnoreData) {
       if (seg.seq == m_rcv_nxt) {
         m_rcv_nxt += seg.len;
       }
     } else {

with:
   if (seg.len > 0) {
     if (bIgnoreData) {
       if (seg.seq == m_rcv_nxt) {
         m_rbuf.ConsumeWriteBuffer(seg.len);
         m_rbuf.ConsumeReadData(seg.len); 
         m_rcv_nxt += seg.len;
       }
     } else {

Original issue reported on code.google.com by bla...@suitabletech.com on 22 Dec 2011 at 10:53

GoogleCodeExporter commented 9 years ago
Can you provide a unit test that demonstrates this problem?

Original comment by juberti@google.com on 4 Jan 2012 at 8:27

GoogleCodeExporter commented 9 years ago
Unfortunately, not easily. The test I have that revealed this problem relies on 
a lot of proprietary code. If there had been some tests for pseudotcp in 
libjingle, I could probably have added one in reasonable time, but I can't 
afford the time to put this stuff together from scratch. Apologies.

Original comment by bla...@suitabletech.com on 5 Jan 2012 at 6:14