hartkopp / can-isotp

Linux Kernel Module for ISO 15765-2:2016 CAN transport protocol PLEASE NOTE: This module is part of the mainline Linux kernel since version 5.10
Other
245 stars 70 forks source link

CAN frames lost when stmin > 0 #12

Closed pylessard closed 5 years ago

pylessard commented 5 years ago

When specifying a value for stmin, some CAN frame seems to be lost. candump shows missing frames. Reading with a CAN_RAW socket shows the same behaviour. Strangely, isotprecv has a complete payload.

I ran : isotprecv -s 701 -d 700 -m 64 vcan0 then python3 -c "print(' '.join(['%02x' % x for x in range(100)]))" | isotpsend -s 700 -d 701 vcan0

image

This not only happen with isotpsend and isotprecv. I can reproduce the same behaviour by sending through a socket in Python.

If I set StMin = 0. I get this good behaviour

image

hartkopp commented 5 years ago

Hi, I tested your setup (including the python command line) with isotprev with either '-m 64' and without '-m' option (meaning STmin = 0) and I got this

screenshot_2018-12-15_18-28-40

I'm running the latest Linux 4.20.0-rc6-00161 with latest isotp.c.

What kind of machine are you running on? Can you send a candump -td dump?

pylessard commented 5 years ago

Running on a virtualized ubuntu 16.04 using VMware Workstation V11.0 uname -a gives Linux ubuntu 4.15.0-33-generic #36~16.04.1-Ubuntu SMP Wed Aug 15 17:21:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

candump with timings gives: image

pylessard commented 5 years ago

Strangley enough. I've compiled the module with debug enabled and the module started to behave properly. I then recompiled without debug (by #undef DBG...) and the issue came back.

Kind of look like the call to printk has some sort of effect on timings

In Debug mode, I got this in my kernel log

image

hartkopp commented 5 years ago

This is really strange! No idea why it misses every second frame on a virtual CAN that not even has a tx queue that might be too small. Can you please send the output of <ip -det link show vcan0> ?

I've simplified the kernel version dependent stuff and also added the reserved skb space and some debug info in a branch for Linux 4.17+ https://github.com/hartkopp/can-isotp/tree/4.17+

As you are currently using a 4.15 kernel I created an extra branch for you: https://github.com/hartkopp/can-isotp/tree/4.15py

Can you please try to test this version on your box (without debug output)? Thanks, Oliver

pylessard commented 5 years ago

Output of ip -det link show vcan0 is

3: vcan0: <NOARP,UP,LOWER_UP> mtu 72 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/can promiscuity 0 vcan

I tried your branch 4.15py and made sure that debug was disabled. Seems to have fixed the issue. I did have to add an include to linux/can/skb.h otherwise I couldn't compile. Compiler complains that struct can_skb_priv is undefined.

I don't really grasp the intent behind your changes, can you briefly explains what you did?

hartkopp commented 5 years ago

Ugh - missed the include in https://github.com/hartkopp/can-isotp/commit/f9fcd4593ce1de041e4a8b447406623aa517639e then, sorry.

The reason behind the skb reserved space can be found in the comments here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/include/linux/can/skb.h

I simply missed to add the skb reserved space handling in the isotp implementation - and additionally the issues you had on your machine never showed up on my setups. Pretty weird.

Will fix the master branch soon. @hfn92 had the same problem here https://github.com/linux-can/can-utils/issues/113 these days - so I wonder why this didn't show up earlier ...

I'll put both of you in the Reported-by: credits then. Thanks!

pylessard commented 5 years ago

Thanks, On my side, I really never used an IsoTp socket against a raw socket until last week. So it would be possible that the issue existed for a long time and I didn't notice

hartkopp commented 5 years ago

Yes. The effect occurs, when you access the local 'echoed' CAN frames, which are modified in the reception path. Therefore a connection over 'real' CAN interfaces to other real devices are not affected as the outgoing skb is handled by the CAN driver correctly. So it should only be visible when dumping the outgoing (echo'ed) traffic or on virtual CAN interfaces - which echo'es the CAN frame too. No idea why I didn't see it on my machine ...

hartkopp commented 5 years ago

Closed due to applied fix https://github.com/hartkopp/can-isotp/commit/3852256c8163b5e8c4a8d0847d47aa8ea74eb46c