Closed muehlke closed 6 months ago
Hi @muehlke , thanks for your detailed request! Are you working with the latest mainline kernel module or with the code from this GitHub repo? The Linux kernel implementation is the lead development and this repo is intended for really old kernels. And there is also a branch for 'more recent' kernels before 5.10 here: https://github.com/hartkopp/can-isotp/tree/mainline-5.4%2B
Second question: Did you try the CAN_ISOTP_WAIT_TX_DONE
option?
Hi @hartkopp ,
I'm not working with the latest mainline kernel module, but also not with the code from this Github repo. I'm using the mainline kernel module from 5.15.149-mainline
, to be more specific at commit 458ce51d0356ee60c93f9f807d9827cf2a41643d
from the Linux Kernel - stable repo.
I was definitely on the wrong "debug" track while, as you mentioned, looking through the code of this "old" repo. After checking the source code, from which our ISOTP kernel module is built and testing out the option CAN_ISOTP_WAIT_TX_DONE
, I realized that this was the mistake, I was not setting this option so there was no feedback on whether the bytes were sent or the Flow Control frame did not arrive in time.
Thank you so much for pointing this out! I will consider the problem as understood and therefore the issue as closed.
While using an UDS client - referred to as client - compiled from the library iso14229 for an embedded linux system I ran into the following situation:
After sending an UDS request to an UDS server - referred to as server - that would generate some workload, thus, stopping the CAN/ISOTP communication until end of calculation, another request is sent. This request is sent as a multiframe packet.
Due to the workload, the server is not able to send the first FC frame within the specified
N_Bs timer
(see Table 22 in ISO 15765-2:2016), which is 1 second long. This is set in:https://github.com/hartkopp/can-isotp/blob/7626d0a0707391970080d493ce69638719938da7/net/can/isotp.c#L999
Because of this timeout, which is a known error to the kernel module, the data transfer is actually not successful. Nevertheless, on the client side, the
write(fd, buf, count)
call on the ISOTP socket file descriptor returns the correct number of bytes (>0) written AS IF the data was successfully transferred. This error should be propagated to upper layers so that on the UDS layer a client can get the feedback that a timeout on the transport layer occurred.After checking that the payload has to be segmented and the First Frame is created and sent, the kernel module "waits for complete transmission of current pdu" at:
https://github.com/hartkopp/can-isotp/blob/7626d0a0707391970080d493ce69638719938da7/net/can/isotp.c#L1024
https://github.com/hartkopp/can-isotp/blob/7626d0a0707391970080d493ce69638719938da7/net/can/isotp.c#L784
and before waking up from the waiting there ist just a debug message and the tx state of the socket is set to IDLE:
https://github.com/hartkopp/can-isotp/blob/7626d0a0707391970080d493ce69638719938da7/net/can/isotp.c#L791
I think before "waking up"
sk->sk_err
should be set to an error like for exampleETIME
like it is set in other cases:e.g.
https://github.com/hartkopp/can-isotp/blob/7626d0a0707391970080d493ce69638719938da7/net/can/isotp.c#L1265
or
https://github.com/hartkopp/can-isotp/blob/7626d0a0707391970080d493ce69638719938da7/net/can/isotp.c#L1512
so that there is some feedback to the user on the error. Otherwise, after "waking up", the kernel module cannot recognize this timeout error when making this check:
https://github.com/hartkopp/can-isotp/blob/7626d0a0707391970080d493ce69638719938da7/net/can/isotp.c#L1026
and returns
size
as if the bytes were correctly written.Request: While ignoring the case that the UDS server should not stop the CAN/ISOTP communication because of some workload - which could e.g. be delegated to a worker thread - I would like to suggest that the kernel module propagates this error, since it CAN be that in an application, an ISOTP receiver does not answer with a FC frame to a multiframe transmission within the right timeout.
What do you think @hartkopp ? The developers which use the open source UDS library iso14229 in combination with the ISOTP kernel module would benefit from this improvement, because an UDS client would be able to get feedback on this transport layer error. This way, a client will not be waiting for a response that will never come since the request was not sent - but it looks like it was sent.