Closed lemeura closed 11 months ago
how do you initialize your socket?
@hartkopp: What does it means when send()
time out?
@lemeura Can you share a CAN log? Is there data being sent at all ?
Hello,
my sockets are initialized like this: socket 1: IsoTPSocketConnection(interface="can0", rxid=467595842, txid=467550962, name="F", addressing_mode=1) socket2 : IsoTPSocketConnection(interface="can0", rxid=1901, txid=1869, name="D", addressing_mode=0)
Please find attached the corresponding candump in which the socket timeout occured during the reading of DiDs.
FI: The timeout does not occur systematically on the same DiD
Actually, I have to keep running an outdated kernel but it's not very safe as none of security updates can be applied.
Best regards,
Interesting. I wonder if that could be a bug in the kernel module. Can you check if there's a correlation between the payload size and the issue? I noticed that your communication fails after a payload of 13 bytes which perfectly fits 2 can frames.
I will try to check this, but I will be not so available in the upcoming week. Maybe you can raise that issue in the kernel module repo : https://github.com/hartkopp/can-isotp
I think this might be related to this bug report. It's a race condition in the kernel module that causes the send to time out if there's a second thread listening on the same socket. There's a patch for the bug here. Maybe you want to try this, but it involves recompiling the isotp module.
If that is the case, @lemeura you can always switch to the pure python implementation until the patch is officially released. Or you can recompile the module like @lumagi proposed
Hi @pylessard ,
The issue does not occur systematically on the same frame, sometimes it's failing from the first frame, sometimes it works up to the end (8 Did reading).
@lumagi , @pylessard , I will try to compile the module and I will let you know if it can fix the issue.
Thank you very much for your support.
Hi,
Unfortunately, I didn't succeed in compiling the module, I faced an error mentioning that is useless as the module is part of kernel since 5.10.
FYI, I didn't apply the patch yet because the source code from master seems to not be the one from which the patch has been done.
Please find below, the outputs:
$ git clone https://github.com/hartkopp/can-isotp.git
Cloning into 'can-isotp'...
remote: Enumerating objects: 599, done.
remote: Counting objects: 100% (162/162), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 599 (delta 111), reused 136 (delta 96), pack-reused 437
Receiving objects: 100% (599/599), 120.91 KiB | 2.69 MiB/s, done.
Resolving deltas: 100% (242/242), done.
$ git log
commit 7626d0a0707391970080d493ce69638719938da7 (HEAD -> master, origin/master, origin/HEAD)
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Fri May 20 08:16:13 2022 +0200
isotp: bump out-of-tree version date
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
$ uname -r
5.15.0-84-generic
$ make
make -C /lib/modules/5.15.0-84-generic/build M=~/Documents/can-isotp modules
make[1]: Entering directory '/usr/src/linux-headers-5.15.0-84-generic'
CC [M] ~/Documents/can-isotp/net/can/isotp.o
~/Documents/can-isotp/net/can/isotp.c:93:2: error: #error No need to compile this out-of-tree driver! ISO-TP is part of Linux Mainline kernel since Linux 5.10.
93 | #error No need to compile this out-of-tree driver! ISO-TP is part of Linux Mainline kernel since Linux 5.10.
| ^~~~~
make[3]: *** [scripts/Makefile.build:297: ~/Documents/can-isotp/net/can/isotp.o] Error 1
make[2]: *** [scripts/Makefile.build:560: ~/Documents/can-isotp/net/can] Error 2
make[1]: *** [Makefile:1909: ~/Documents/can-isotp] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.15.0-84-generic'
make: *** [Makefile:21: modules] Error 2
Could you tell me if I'm working with the correct branch ? Did I miss any steps ?
how do you initialize your socket?
@hartkopp: What does it means when
send()
time out?
Usually when there was no FC answer to a FF first frame within 1000ms.
You should just read and follow the error messages ;-)
~/Documents/can-isotp/net/can/isotp.c:93:2: error: #error No need to compile this out-of-tree driver! ISO-TP is part of Linux Mainline kernel since Linux 5.10. 93 | #error No need to compile this out-of-tree driver! ISO-TP is part of Linux Mainline kernel since Linux 5.10.
Could you tell me if I'm working with the correct branch ? Did I miss any steps ?
There is no need to clone any can-isotp repository as the feature of CAN ISOTP sockets is "just there" in your 5.15 kernel.
I think this might be related to this bug report. It's a race condition in the kernel module that causes the send to time out if there's a second thread listening on the same socket. There's a patch for the bug here. Maybe you want to try this, but it involves recompiling the isotp module.
@pylessard are you using the epoll() or poll() syscalls within udsoncan? If not, this patch and its description does not apply to the problem IMO.
@hartkopp : I'm using send() and recv(), that's all.
@hartkopp : I'm using send() and recv(), that's all.
That's weird. I will take a deeper look into the log file.
Did you ever use the CAN_ISOTP_WAIT_TX_DONE
flag from the isotp sockopts?
This flag causes a waiting for the PDU to be transmitted entirely (or returning a timeout error).
Please also consider this. I originally did a bit of digging in the CPython source code because I just assumed Python would pass the send/recv calls directly to libC. But instead, it always configures the sockets to be nonblocking and uses poll to implement blocking semantics for its sockets.
Edit: this is the main loop that does the polling (or rather select) until the socket is ready.
Ok @lumagi , than it makes sense that the issue relates to your fix!
But the CPython implementation should not act against the users expectation and exchange syscalls without notice. I assume this is used for multi-threading.
Oh, I must add that udsoncan.IsoTPSocketConnection defines a timeout for the socket by default. By calling settimeout(), it puts the socket into non-blocking mode. I'm not exactly sure if that make usage of epoll. I just read this morning what epoll was to be honest.
https://github.com/pylessard/python-udsoncan/blob/master/udsoncan/connections.py#L280 https://github.com/pylessard/python-can-isotp/blob/master/isotp/tpsock/__init__.py#L84
> I think this might be related to [this bug report](https://lore.kernel.org/linux-can/04fd32bc-b1c4-b9c3-3f8b-7987704a1f85@hartkopp.net/T/#m9615a4ecbdb742749886af73414e476498c93d51). It's a race condition in the kernel module that causes the send to time out if there's a second thread listening on the same socket. There's a [patch for the bug here](https://lore.kernel.org/linux-can/96d31e8c-fa26-4e32-4c36-768981f20a54@hartkopp.net/T/#u). Maybe you want to try this, but it involves recompiling the isotp module.
I was facing similar issue of socket timing out when sending frames over CAN when I moved to Ubuntu 20.04 and using 5.15 kernel. I saw that any kernel greater than 5.15.78 was facing the same issue
I used this patch on 5.15.133 and 6.6 rc4 and rebuilt the kernel to verify and found that the said issue is resolved but since this is patch work any chance that above mentioned change will come to mainline and will be adopted in Ubuntu (fossa release)
Thank you to @hartkopp for fast-tracking the patch. The commit that lead to the regression was introduced to the upstream kernel with release 6.3. That means Ubuntu must be backporting the patches to its LTS kernels. I would assume that they will do so as well for the patch that fixes the issue if it is accepted and released in the next upstream kernel version 6.7.
@lumagi we are currently in the 6.6-rc cycle. And when the patch hits the 6.6-rc tree it automatically gets tracked for the kernel.org LTS kernels that have the issue. So it's very likely that Ubuntu will also apply that patch to its LTS kernels.
That's good to know, thanks for the clarification @hartkopp
@lemeura : does the conversation above helps you?
Hi all,
I would like to thank you very much for all these clarifications.
The fact that the patch is in the pipe for future integration in the kernel is a very good news, as actually I'm unable to distribute custom kernel on my different setups.
I just have to be patient :) and I will prepare the migration to future Ubuntu LTS 24.04, potentially including directly a fixed kernel.
Best regards,
I will close that issue then. @lemeura, please reopen if you believe something needs to be addressed
Yes. The patch has been applied to Linus' mainline kernel tree - so it will get attention for the stable kernels soon.
Hi,
I'm currently facing a quite systematic issue when using 2 different sockets to send UDS data on the same bus.
The stack trace ends like this:
In details, the aim of my project is to simulate UDS conversation then:
The 2 sockets are opened successfully, the first one sends the request, the second one raises an exception when trying to send.
This setup was working well with old kernels but with recent ones, the timeout is very frequent but not systematic.
I tried without success:
Could you help me to identify the component which introduced this "issue"? I'm not familiar with the different layers in action for this topic:
Thanks in advance, best regards,