Open ethouris opened 3 years ago
The likely cause of the problem:
In BACKUP groups, the IDLE links are kept updated with the current sequence numbers, so that when one of them is activated it sends the same packet, with the same sequence number, as the other link. In general this synchronization happens in three places:
The suspect is the synchronization procedure in the case where the sender buffer contains more packets than the distance covered since the last synchronization (done at connection time or at a periodic keepalive). In that case the next packet to be sent over the activated link has a sequence number that lies BACKWARDS relative to the current scheduling sequence number that resulted from synchronization.
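To make the "backwards" condition concrete, here is a minimal sketch of a wrap-aware comparison. It assumes 31-bit sequence numbers that wrap to zero; `seqOffset` and `isBackward` are hypothetical names used only for illustration and are not taken from the library code.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: sequence numbers occupy 31 bits and wrap around to 0.
constexpr int64_t kSeqRange = int64_t(1) << 31;  // number of distinct values
constexpr int64_t kSeqHalf  = kSeqRange / 2;     // beyond this, assume a wrap

// Signed distance from `from` to `to`, taking wrap-around into account.
// Positive: `to` is ahead of `from`. Negative: `to` is behind it.
inline int32_t seqOffset(int32_t from, int32_t to)
{
    assert(from >= 0 && to >= 0);
    int64_t diff = int64_t(to) - int64_t(from);
    if (diff >= kSeqHalf)
        diff -= kSeqRange;
    else if (diff < -kSeqHalf)
        diff += kSeqRange;
    return int32_t(diff);
}

// True if the packet about to be sent lies behind the scheduling sequence
// number that the link received at the last synchronization.
inline bool isBackward(int32_t schedSeq, int32_t nextSendSeq)
{
    return seqOffset(schedSeq, nextSendSeq) < 0;
}
```

With this helper, a link synchronized to 100 whose next packet carries 90 is detected as backward, while a pair that straddles the wrap point (for example 0x7FFFFFFE followed by 1) is correctly treated as forward.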
There are two possibilities of how to fix it:
`overrideSeqNo` function, just limit the value by which the sequence numbers may differ.
The problem:
Under not entirely clean conditions, the sequence number of a link may need to be overridden. However, if the override would move the sequence in the wrong direction, it is denied. This is checked in this part of code:
It has been observed that such a denied override is sometimes attempted. The denial must be detected and handled, because a link on which the override was not applied is unusable and could break the whole group. Either we make the override succeed somehow (at least within some narrower range), or, if that is not possible, the link that was denied the override must be closed immediately, at least for safety reasons (otherwise the same sequence discrepancy will be detected again and the whole procedure will repeat).
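The two reactions described above can be sketched together as follows. This is only an illustration under stated assumptions: `IdleLink`, `tryOverride`, `syncOrClose` and the `maxBackward` tolerance are hypothetical and do not reflect the actual structures or policy in the code base; sequence numbers are assumed to be 31-bit with wrap-around.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-link state relevant to this issue.
struct IdleLink
{
    int32_t schedSeq;  // scheduling sequence number of the link
    bool    closed;    // set once the link has been given up
};

// Wrap-aware signed distance, assuming 31-bit sequence numbers.
inline int32_t seqDistance(int32_t from, int32_t to)
{
    const int64_t range = int64_t(1) << 31;
    int64_t diff = int64_t(to) - int64_t(from);
    if (diff >= range / 2)
        diff -= range;
    else if (diff < -(range / 2))
        diff += range;
    return int32_t(diff);
}

// Apply the override unless it moves backward by more than `maxBackward`.
// A small backward move is tolerated ("fix anyway, for a narrower range").
inline bool tryOverride(IdleLink& link, int32_t newSeq, int32_t maxBackward)
{
    assert(maxBackward >= 0);
    if (seqDistance(link.schedSeq, newSeq) < -maxBackward)
        return false;  // denied: too far in the wrong direction
    link.schedSeq = newSeq;
    return true;
}

// React to a denial instead of ignoring it: a link that could not be
// synchronized is closed so that it cannot break the whole group.
inline bool syncOrClose(IdleLink& link, int32_t newSeq, int32_t maxBackward)
{
    if (tryOverride(link, newSeq, maxBackward))
        return true;
    link.closed = true;
    return false;
}
```

The point of the design is that the caller always gets a definite outcome: either the link is synchronized, or it is marked closed, so a denied override can never leave a silently desynchronized link in the group.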