Hi Yibo,
I think I found a small bug in the ReceiverCheckSeq function in qbb-net-device.cc: it does nothing when seq < expected, that is, when the NIC receives a duplicate data packet.

Consider this case: ack(n=4000) is lost, so the sender never receives it. The sender waits for a period of time and then begins to retransmit. Unfortunately, the receiver does nothing when it receives the duplicate data packets, so the sender will never receive ack(n=4000). I ran into this when I set the loss rate to a deterministic 0.01 (dropping 1 out of every 100 packets that pass through the switch): ack(n=16000) got lost, and the network ended up in a livelock.

I think the algorithm should still check seq when seq < expected: when (seq+1) % m_chunk == 0, the receiver should send a "duplicate" ack back to the sender.
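To make the proposal concrete, here is a minimal sketch of the check I have in mind. It is a simplified standalone model, not the actual code in qbb-net-device.cc: the names RxAction, ReceiverModel, CheckSeq, and ReplayLostAck are hypothetical, the chunk value of 4000 is only an assumption taken from the ack(n=4000) example, and the in-order ACK rule and the NACK on a gap are my guesses at the surrounding behavior. Only the seq < expected branch is the change being suggested.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified model of the receiver-side check.
// This is NOT the real QbbNetDevice::ReceiverCheckSeq; it only illustrates the idea.
enum class RxAction { None, SendAck, SendNack };

struct ReceiverModel {
    uint32_t expected = 0;  // next in-order sequence number
    uint32_t chunk = 4000;  // assumed ACK interval, standing in for m_chunk

    RxAction CheckSeq(uint32_t seq) {
        assert(chunk > 0);  // the modulo below needs a non-zero interval
        if (seq == expected) {
            // In-order packet: advance, and ACK at the end of each chunk (assumed rule).
            ++expected;
            return (expected % chunk == 0) ? RxAction::SendAck : RxAction::None;
        }
        if (seq > expected) {
            // Gap detected: assumed to trigger a NACK.
            return RxAction::SendNack;
        }
        // seq < expected: duplicate data packet.
        // Proposed change: re-send the ACK when the duplicate closes a chunk,
        // so that a lost ACK can be recovered by the retransmission.
        return ((seq + 1) % chunk == 0) ? RxAction::SendAck : RxAction::None;
    }
};

// Replays the scenario from this report: the first chunk arrives in order,
// its ACK is assumed to be lost on the wire, and the sender retransmits the chunk.
// Returns the action taken for the last retransmitted packet.
RxAction ReplayLostAck(uint32_t chunk) {
    ReceiverModel rx;
    rx.chunk = chunk;
    for (uint32_t seq = 0; seq < chunk; ++seq) {
        rx.CheckSeq(seq);  // original transmission; the resulting ACK is "lost"
    }
    RxAction last = RxAction::None;
    for (uint32_t seq = 0; seq < chunk; ++seq) {
        last = rx.CheckSeq(seq);  // retransmission: every packet is a duplicate
    }
    return last;
}
```

With this branch in place, the retransmitted packet that ends the chunk triggers an ACK again, so the sender can make progress instead of retransmitting forever. Which sequence number the re-sent ACK should carry is left open here.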