hartkopp / can-isotp

Linux Kernel Module for ISO 15765-2:2016 CAN transport protocol. PLEASE NOTE: This module has been part of the mainline Linux kernel since version 5.10.

unstable transmission of data: sequence number of consecutive frames get messed up #58

Closed muehlke closed 1 year ago

muehlke commented 1 year ago

Hello,

I'm using a UDS client that uses the can-isotp Linux kernel module as the transport layer to carry out a firmware update on an ECU.

After requesting a download and receiving a confirmation from the ECU, I proceed to transfer blocks of 3074 bytes over CAN-ISOTP to deliver the firmware to the ECU. For this, I call write(int fd, const void *buf, size_t count) with the file descriptor obtained from socket(AF_CAN, SOCK_DGRAM | SOCK_NONBLOCK, CAN_ISOTP) and bound to a specified address. The transfer is performed by a development board running a Linux gateway (to which I'm connected over SSH) and is sent to the ECU. I capture all bus traffic with a PCAN-USB adapter (https://www.peak-system.com/PCAN-USB.199.0.html?&L=1) connected to an Ubuntu VM, using the candump command from can-utils.
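For readers who want to reproduce the setup, here is a minimal Python sketch of the socket handling described above. It is not the poster's code: the original client calls the C API directly, the CAN IDs are hidden in this thread, and the IDs in the usage comment are made-up placeholders. `eff_id` and `open_isotp_socket` are hypothetical helper names. It assumes Linux, Python 3.7 or newer, and a loaded can-isotp module.

```python
import socket

# Flag marking a 29-bit (extended) CAN identifier; fall back to the
# numeric value from linux/can.h if this Python build lacks the constant.
CAN_EFF_FLAG = getattr(socket, "CAN_EFF_FLAG", 0x80000000)


def eff_id(can_id):
    """Return a 29-bit CAN ID with the extended-frame flag set."""
    return (can_id & 0x1FFFFFFF) | CAN_EFF_FLAG


def open_isotp_socket(ifname, rx_id, tx_id):
    """Open a non-blocking ISO-TP socket bound to (ifname, rx_id, tx_id).

    Hypothetical helper: mirrors socket(AF_CAN, SOCK_DGRAM | SOCK_NONBLOCK,
    CAN_ISOTP) followed by bind() from the description above.
    """
    sock = socket.socket(socket.AF_CAN, socket.SOCK_DGRAM, socket.CAN_ISOTP)
    sock.setblocking(False)
    sock.bind((ifname, eff_id(rx_id), eff_id(tx_id)))
    return sock


# Usage (needs a real CAN interface; the IDs below are placeholders):
#   sock = open_isotp_socket("can0", rx_id=0x18DAF110, tx_id=0x18DA10F1)
#   sock.send(bytes(3074))   # the kernel segments this into FF + CFs
```

The application hands over the whole 3074-byte block in a single call; splitting it into a first frame and consecutive frames happens inside the kernel module.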

That's when I observe that after the eleventh consecutive frame (CF) is sent, the next sequence number is wrong (0x0 instead of 0xC), and further down it gets even worse: some sequence numbers are sent two or three times. It also does not transmit all of the required 3074 bytes. Here is the example "candumped" trace (IDs hidden):

(255.316173)  can0  18DAXXF1   [8]  1C 02 36 01 5B 47 65 6E   // start of transfer, (0xC02) is for 3074 bytes
(255.318129)  can0  18DAF1XX   [8]  30 00 00 FF FF FF FF FF   // flow control from ECU
(255.319999)  can0  18DAXXF1   [8]  21 65 72 61 6C 49 6E 66   // sending consecutive frames
(255.319999)  can0  18DAXXF1   [8]  22 6F 5D 0D 0A 54 59 50
(255.322550)  can0  18DAXXF1   [8]  23 45 20 3D 20 30 78 30
(255.324832)  can0  18DAXXF1   [8]  24 30 30 38 0D 0A 42 4F
(255.327750)  can0  18DAXXF1   [8]  25 41 52 44 20 3D 20 22
(255.327750)  can0  18DAXXF1   [8]  26 78 43 55 2D 54 48 48
(255.327751)  can0  18DAXXF1   [8]  27 22 0D 0A 5B 4D 53 57
(255.330786)  can0  18DAXXF1   [8]  28 43 6F 6E 74 65 6E 74
(255.330787)  can0  18DAXXF1   [8]  29 5D 0D 0A 3A 32 30 30
(255.333698)  can0  18DAXXF1   [8]  2A 30 30 30 30 30 37 34
(255.333698)  can0  18DAXXF1   [8]  2B 37 34 35 46 36 36 37  // all sequence numbers were alright until here
(255.335666)  can0  18DAXXF1   [8]  20 30 32 30 30 30 30 36
(255.335667)  can0  18DAXXF1   [8]  26 31 46 41 35 44 33 34
(255.335667)  can0  18DAXXF1   [8]  23 33 41 44 45 44 43 46
(255.337927)  can0  18DAXXF1   [8]  21 39 34 43 39 43 35 34  // sequence number 1 three times
(255.337928)  can0  18DAXXF1   [8]  21 30 41 30 30 30 32 32  
(255.337928)  can0  18DAXXF1   [8]  21 32 33 38 38 36 31 44
(255.340456)  can0  18DAXXF1   [8]  20 32 46 39 43 32 39 43  // sequence number 0 two times 
(255.340456)  can0  18DAXXF1   [8]  20 30 44 33 39 30 41 34
(255.342514)  can0  18DAXXF1   [8]  2F 44 33 37 44 39 41 39
(255.342514)  can0  18DAXXF1   [8]  20 45 42 32 31 45 33 42
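The checks done by eye on the trace above can be automated. The following is a rough sketch, assuming classic CAN with 8-byte frames and no extra address byte, so that the first frame (FF) carries a 12-bit length plus 6 data bytes and each CF carries 7 data bytes. The function names are made up for this example, and the sample lines are copied from the trace with their comments removed.

```python
import re

# Matches candump lines such as:
# (255.319999)  can0  18DAXXF1   [8]  21 65 72 61 6C 49 6E 66   // comment
LINE_RE = re.compile(
    r"\(\s*[\d.]+\)\s+\S+\s+(\S+)\s+\[\d+\]\s+((?:[0-9A-Fa-f]{2}\b\s*)+)"
)


def parse_line(line):
    """Return (can_id, data_bytes) for a candump line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m.group(1), bytes.fromhex("".join(m.group(2).split()))


def check_sequence(lines, tx_id):
    """Return (announced_length, [(expected_sn, actual_sn), ...]).

    The expected sequence number advances by position, so every CF that
    differs from the ideal 1, 2, ..., F, 0, 1, ... pattern is reported.
    """
    length, expected, errors = None, None, []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None or parsed[0] != tx_id:
            continue  # skip flow control from the ECU and unrelated lines
        data = parsed[1]
        frame_type = data[0] >> 4
        if frame_type == 0x1:  # first frame: 12-bit length, then SN starts at 1
            length = ((data[0] & 0x0F) << 8) | data[1]
            expected = 1
        elif frame_type == 0x2 and expected is not None:  # consecutive frame
            sn = data[0] & 0x0F
            if sn != expected:
                errors.append((expected, sn))
            expected = (expected + 1) % 16
    return length, errors


def consecutive_frames_needed(length, ff_payload=6, cf_payload=7):
    """Number of CFs needed after the first frame (ceiling division)."""
    return -(-(length - ff_payload) // cf_payload)


TRACE = [
    "(255.316173)  can0  18DAXXF1   [8]  1C 02 36 01 5B 47 65 6E",
    "(255.318129)  can0  18DAF1XX   [8]  30 00 00 FF FF FF FF FF",
    "(255.319999)  can0  18DAXXF1   [8]  21 65 72 61 6C 49 6E 66",
    "(255.319999)  can0  18DAXXF1   [8]  22 6F 5D 0D 0A 54 59 50",
    "(255.322550)  can0  18DAXXF1   [8]  23 45 20 3D 20 30 78 30",
    "(255.324832)  can0  18DAXXF1   [8]  24 30 30 38 0D 0A 42 4F",
    "(255.327750)  can0  18DAXXF1   [8]  25 41 52 44 20 3D 20 22",
    "(255.327750)  can0  18DAXXF1   [8]  26 78 43 55 2D 54 48 48",
    "(255.327751)  can0  18DAXXF1   [8]  27 22 0D 0A 5B 4D 53 57",
    "(255.330786)  can0  18DAXXF1   [8]  28 43 6F 6E 74 65 6E 74",
    "(255.330787)  can0  18DAXXF1   [8]  29 5D 0D 0A 3A 32 30 30",
    "(255.333698)  can0  18DAXXF1   [8]  2A 30 30 30 30 30 37 34",
    "(255.333698)  can0  18DAXXF1   [8]  2B 37 34 35 46 36 36 37",
    "(255.335666)  can0  18DAXXF1   [8]  20 30 32 30 30 30 30 36",
]

length, errors = check_sequence(TRACE, "18DAXXF1")
print(length, errors)                     # 3074 [(12, 0)]
print(consecutive_frames_needed(length))  # 439
```

Under these assumptions a complete 3074-byte block needs 439 consecutive frames, so a capture of the full transfer can be checked for both missing frames and wrong sequence numbers.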

I have come up with a few ideas as to why this might not be working, e.g. interference on the CAN bus (a physical cause), because I don't want to think that there might be something wrong with this kernel module. After double-checking my code, I realized that everything goes wrong after the write() call mentioned above, at which point the kernel module takes over the ISOTP part of the transfer.

I would highly appreciate any input as to why this might be happening - hardware or software related. Thanks in advance!

muehlke commented 1 year ago

Realized my txqueuelen was set to 10. That's why beginning from the 11th frame things got messed up. Solved by setting a bigger value with:

ip link set dev can0 txqueuelen 4096
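
To check the current value before a transfer, the queue length can also be read from sysfs. This is a small sketch, assuming the usual Linux layout in which /sys/class/net/<interface>/tx_queue_len holds the same value that ip link reports as qlen. `read_txqueuelen` is a hypothetical helper name, and the `sysfs_root` parameter exists only so the function can be pointed at a different directory.

```python
from pathlib import Path


def read_txqueuelen(ifname, sysfs_root="/sys/class/net"):
    """Return the transmit queue length of a network interface."""
    path = Path(sysfs_root) / ifname / "tx_queue_len"
    return int(path.read_text().strip())


# Usage (on a machine that has a can0 interface):
#   print(read_txqueuelen("can0"))
```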

hartkopp commented 1 year ago

Hi @muehlke,

In Linux 5.18+ the CAN frame handling has been reworked, which makes the txqueuelen tweak obsolete: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/net/can?h=linux-5.18.y&id=4b7fe92c06901f4563af0e36d25223a5ab343782

If you are able to upgrade your Linux kernel (e.g. with Ubuntu PPA packages), I would suggest updating to the latest longterm kernel, Linux 6.1.x, which has some more stability improvements in can-isotp.

muehlke commented 1 year ago

Thanks @hartkopp for your advice! I'll take it into consideration for my project. Thanks for the investment into this topic.