Closed serges147 closed 2 months ago
The !same_transport
condition is redundant because a frame with a matching transfer-ID would be accepted if it was the same iface anyway. This leaves us with two preconditions: whether the transfer-ID matches the expectation and whether the reassembly state machine is not currently mid-transfer.
The removal of the constraint that a new transfer must arrive from the same interface as the previous one (the same can be achieved by removing the correct_iface
constraint from rxSessionUpdate
) is generally not sound, because it may cause transfer duplication if the non-current interface is lagging behind the current one by exactly 32 transfers, which may occur in certain failure modes.
A possible solution that does not introduce undesirable side effects is to add a new acceptance condition:
const bool restartable = (same_transport && tid_new) || // unchanged
(same_transport && tid_timeout) || // unchanged
(tid_timeout && tid_new) || // unchanged
(tid_timeout && tid_match && idle); // new
where:
const bool tid_match = rxs->transfer_id == frame->transfer_id;
const bool idle = 0U == rxs->total_payload_size;
The purpose of tid_match
is to eliminate duplicates arriving from redundant interfaces and to make the condition more selective such that it only activates if the transfers arrive in strictly the correct ordering. Without this check, per the diagram, we would accept the transfer with TID=0 twice. The idle
condition is needed to prohibit changing the active interface mid-transfer as it may cause both old and new transfers to be lost.
Can you please check this?
Currently I observe RPC client response timeouts b/c of periodic drops of incoming RX frames. Here is PDF diagram build on
candump
analysis (attached as well). Zubax_LibCyphal_CanTraffic.pdf candump13-2024-09-11_163217.log candump02-2024-09-11_163221.logI'm not sure whether it is correct fix or not but the following patch makes such timeouts go away: