linux-can / can-utils

Linux-CAN / SocketCAN user space applications
2.39k stars 711 forks source link

Networking with CAN-J1939 support (Ubuntu Linux) ./testj1939 #159

Closed josemic closed 4 years ago

josemic commented 5 years ago

I have installed linux kernel 5.4rc4, as described in this link on Ubuntu 19.10.

I go forward as desribed in can-j1939-kickstart.md.

sudo modprobe vcan
sudo ip link add can0 type vcan
sudo ip link set can0 up
sudo modprobe can-j1939

On terminal 1: ./testj1939 -r can0: On terminal 2: cansend can0 1823ff40#0123 Result: No output on terminal 1.

On terminal 2: cansend can0 18234140#0123

Result (on terminal1): No output.

On terminal 1: ./testj1939 -r can0:0x80

On terminal 2: cansend can0 18238040#0123

Result (on terminal 1): 40 02300: 01 23 Which was expected.

On terminal 1: candump -L can0

On terminal 2: ./testj1939 -s can0:0x80,0x3ffff Result (on terminal 2): ./testj1939: sendto: Permission denied

On terminal 2: ./testj1939 -s can0:0x90,0x3ffff

Result (on terminal 2): ./testj1939: sendto: Permission denied On terminal 2: ./testj1939 -s can0:0x80,0x12345

Result (on terminal 2): ./testj1939: bind(): Invalid argument On terminal 2: ./testj1939 -s can0:,0x12300 :0x40

Result (on terminal 2): ./testj1939: sendto: File descriptor in bad state

On terminal 2: ./testj1939 -s can0:0x80,0x12300 :,0x32100 Result (on terminal 2): ./testj1939: sendto: Permission denied On terminal 2: ./testj1939 -s can0:0x80,0x12300 :0x40,0x32100 Result (on terminal 1): (1571942859.574852) can0 1B214080#0123456789ABCDEF Which was beside the timestamp expected.

On terminal 2: ./testj1939 -s20 can0:0x80 :,0x12300

Result (on terminal 2): ./testj1939: sendto: Permission denied On terminal 2: ./testj1939 -w1.0 -s20 can0:0x80 :,0x12300

Result (on terminal 2): ./testj1939: sendto: Permission denied

On terminal 3: ./testj1939 can0:0x90 -r &

Result (on terminal 3): [1] 27414

On terminal 2: ./testj1939 -s20 can0:0x80 :0x90,0x12300

Result (on terminal 3):

80 12300: 01 23 45 67 89 ab cd ef
00008     01 23 45 67 89 ab cd ef
00010     01 23 45 67

Result (on terminal 1 (candump)):

(1571943360.432011) can0 18EC9080#1014000303002301
(1571943360.432044) can0 18EC8090#110301FFFF002301
(1571943360.432057) can0 18EB9080#010123456789ABCD
(1571943360.432058) can0 18EB9080#02EF0123456789AB
(1571943360.432060) can0 18EB9080#03CDEF01234567FF
(1571943360.432077) can0 18EC8090#13000003FF002301

However, besides the timestamps expected was:

18EC9080#1014000303002301
18EC8090#110301FFFF002301
18EB9080#010123456789ABCD
18EB9080#02EF0123456789AB
18EB9080#03CDEF01234567
18EC8090#13140003FF002301

On terminal 2: ./testj1939 -s can0:0x80,0x0100

Result (on terminal 2): ./testj1939: sendto: Permission denied

On terminal 2: ./testj1939 -s -p3 can0:0x80,0x0200

Result (on terminal 2): ./testj1939: sendto: Permission denied

olerem commented 5 years ago

Hi @josemic,

thank you for testing! I can extract from you report 3 different issues:

  1. send to broadcast is currently not supported by testj1939
  2. EOMA message was send by kernel with total size set to 0
  3. your expected example has no 0xff padding at the end of payload.

So, lets do it one by one:

  1. i'll fix probably today.
  2. A fix for the kernel bug is already send to the linux-can list. https://www.spinics.net/lists/linux-can/index.html (not jet listed :/ )
  3. According to SAE J1939 spec: "Each Data Transfer packet (other than the last packet in a transmission sequence) shall include 7 bytes of the original large message. The final packet includes a data field of 8 bytes: those being the sequence number of the packet and at least 1 byte of data related to the Parameter Group and then any remaining unused bytes set to “FF 16 .” I assume there is a typo in the expectation example.
josemic commented 4 years ago

Hi @olerem,

If you are refferring with 3. to the following, then I get the same result as before (except of the timestamp): On terminal 1: candump -L can0

On terminal 2:

./testj1939 can0:0x90 -r &
[1] 15741

On terminal 3: ./testj1939 -s20 can0:0x80 :0x90,0x12300

Output on terminal 1:

(1572125047.692271) can0 18EC9080#1014000303002301
(1572125047.692305) can0 18EC8090#110301FFFF002301
(1572125047.692317) can0 18EB9080#010123456789ABCD
(1572125047.692318) can0 18EB9080#02EF0123456789AB
(1572125047.692320) can0 18EB9080#03CDEF01234567FF
(1572125047.692335) can0 18EC8090#13000003FF002301

Output on terminal 3:

80 12300: 01 23 45 67 89 ab cd ef
00008     01 23 45 67 89 ab cd ef
00010     01 23 45 67

If you are referring with 3. to something else, please let me know to what you are refferring.

olerem commented 4 years ago

Hi @josemic , i refer to your comment:

However, besides the timestamps expected was: 18EC9080#1014000303002301 18EC8090#110301FFFF002301 18EB9080#010123456789ABCD 18EB9080#02EF0123456789AB 18EB9080#03CDEF01234567

Int this line you are missing FF on the end ^

18EC8090#13140003FF002301

On this line you are spotted real bug ^

josemic commented 4 years ago

Hi @olerem,

to 1: Please let me know, when you have fixed "testj1939.c". I may test it again.

to 3:

Int this line you are missing FF on the end ^

This was not my copy and paste error. This comes directly from "can-j1939-kickstart.md". Please note that this occurs twice in "can-j1939-kickstart.md". Could you please patch "can-j1939-kickstart.md"?

marckleinebudde commented 4 years ago

@josemic Feel free to send a patch for the kickstart yourself. I think you can edit it directly via the github web interface.

josemic commented 4 years ago

@marckleinebudde Done. Yes. Editing via github web interface is stupid simple! Thanks for the hint.

josemic commented 4 years ago

I am trying the address claiming daemon jacd and I get the following result:

./jacd 3102032E003D0080
jacd: bindtodevice can0: Operation not permitted

I have CAN initilized with:

sudo modprobe vcan
sudo ip link add can0 type vcan
sudo ip link set can0 up
sudo modprobe can-j1939

What am I doing wrong?

josemic commented 4 years ago

I have upgraded to kernel 5.4rc5. Still the same errors.

Jacd also fails:

jacd -r 100,80-120 -v -c /tmp/1122334455667788.jacd 1122334455667788
jacd: ready for can0:1122334455667788
- socket(PF_CAN, SOCK_DGRAM, CAN_J1939);
- setsockopt(, SOL_SOCKET, SO_BINDTODEVICE, can0, 4);
jacd: bindtodevice can0: Operation not permitted
josemic commented 4 years ago

I have upgraded to kernel 5.4rc6. Still the same errors.

olerem commented 4 years ago

Works for me:

root@DistroKit:~/tests/j1939 jacd -r 100,80-120 -v -c /tmp/1122334455667788.jacd
 1122334455667788
jacd: ready for can0:1122334455667788
- socket(PF_CAN, SOCK_DGRAM, CAN_J1939);
- setsockopt(, SOL_SOCKET, SO_BINDTODEVICE, can0, 4);
- setsockopt(, SOL_CAN_J1939, SO_J1939_FILTER, <filter>, 96);
- setsockopt(, SOL_SOCKET, SO_BROADCAST, 1, 4);
- bind(, can0:1122334455667788, 24);
- socket(PF_CAN, SOCK_DGRAM, CAN_J1939);
- setsockopt(, SOL_SOCKET, SO_BINDTODEVICE, can0, 4);
- setsockopt(, SOL_CAN_J1939, SO_J1939_FILTER, <filter>, 96);
- setsockopt(, SOL_SOCKET, SO_BROADCAST, 1, 4);
- bind(, can0:1122334455667788, 24);
- sendto(, { 0, 0xee, 0, }, 3, 0, -,0ea00, 24);
jacd: request sent, pending for 1250 ms

- bind(, can0:1122334455667788, 24);
- send(, 1234605616436508552, 8, 0);
jacd: claimed 0x50

Do you actually have can0 interface? is it configured?

josemic commented 4 years ago

Hi @olerem,

I have configured can0 as following, is this sufficient?

sudo modprobe vcan
sudo ip link add can0 type vcan
sudo ip link set can0 up
sudo modprobe can-j1939

I have updated can-utils/jacd to the latest version.

./jacd -r 100,80-120 -c /tmp/1122334455667788.jacd 1122334455667788
jacd: bindtodevice can0: Operation not permitted

and with -v option:

./jacd -r 100,80-120 -v -c /tmp/1122334455667788.jacd 1122334455667788
jacd: ready for can0:1122334455667788: No such file or directory
olerem commented 4 years ago

Hm.. i reproduced it after switching to not root mode. Can you please try with sudo.

@marckleinebudde, we need to think about permissions

josemic commented 4 years ago

@olerem: With sudo and only latest can-utils it works without -v option.

sudo ./jacd -r 100,80-120 -c /tmp/1122334455667780.jacd 1122334455667780

Result is:

candump -L can0

(1573379833.419543) can0 18EAFFFE#00EE00
(1573379834.669767) can0 18EEFF50#8077665544332211

and with -v option:

sudo ./jacd -r 100,80-120 -v -c /tmp/1122334455667781.jacd 1122334455667781
jacd: ready for can0:1122334455667781: No such file or directory

and with -v option and previous temp file:

sudo ./jacd -r 100,80-120 -v -c /tmp/1122334455667780.jacd 1122334455667781
jacd: ready for can0:1122334455667781: Success

and strangely the prompt returns and candump output is empty: candump -L can0 however without -v option:

sudo ./jacd -r 100,80-120 -c /tmp/1122334455667780.jacd 1122334455667781

prompt does not return and candump output is:

candump -L can0
(1573380419.254651) can0 18EAFFFE#00EE00
(1573380420.504910) can0 18EEFF50#8177665544332211
olerem commented 4 years ago

@josemic , sorry, i can't confirm it:

ore@DistroKit:/root/tests/j1939 sudo candump -t d vcan0 &
ore@DistroKit:/root/tests/j1939 sudo jacd -r 100,80-120 -c /tmp/1122334455667780.jacd 1122334455667780 vcan0 &
ore@DistroKit:/root/tests/j1939  (000.000000)  vcan0  18EAFFFE   [3]  00 EE 00
 (001.251263)  vcan0  18EEFF50   [8]  80 77 66 55 44 33 22 11

ore@DistroKit:/root/tests/j1939 
ore@DistroKit:/root/tests/j1939 sudo jacd -r 100,80-120 -v -c /tmp/1122334455667
781.jacd 1122334455667781 vcan0
jacd: ready for vcan0:1122334455667781
- socket(PF_CAN, SOCK_DGRAM, CAN_J1939);
- setsockopt(, SOL_SOCKET, SO_BINDTODEVICE, vcan0, 5);
- setsockopt(, SOL_CAN_J1939, SO_J1939_FILTER, <filter>, 96);
- setsockopt(, SOL_SOCKET, SO_BROADCAST, 1, 4);
- bind(, vcan0:1122334455667781, 24);
- socket(PF_CAN, SOCK_DGRAM, CAN_J1939);
- setsockopt(, SOL_SOCKET, SO_BINDTODEVICE, vcan0, 5);
- setsockopt(, SOL_CAN_J1939, SO_J1939_FILTER, <filter>, 96);
- setsockopt(, SOL_SOCKET, SO_BROADCAST, 1, 4);
- bind(, vcan0:1122334455667781, 24);
- sendto(, { 0, 0xee, 0, }, 3, 0, -,0ea00, 24);
 (024.153737)  vcan0  18EAFFFE   [3]  00 EE 00
 (000.000381)  vcan0  18EEFF50   [8]  80 77 66 55 44 33 22 11
jacd: request sent, pending for 1250 ms
- bind(, vcan0:1122334455667781, 24);
- send(, 1234605616436508545, 8, 0);
 (001.250995)  vcan0  18EEFF51   [8]  81 77 66 55 44 33 22 11
jacd: claimed 0x51
josemic commented 4 years ago

@olerem: Now it looks a little bit different (note: I changed vcan0 to can0):

On terminal 1: sudo candump -t d can0

On terminal 2:

sudo jacd -r 100,80-120 -c /tmp/1122334455667780.jacd 1122334455667780 can0 
jacd: send request for address claims: Permission denied

No output on terminal 1.

Looks still like a permission problem. Could it be that the signal handler in jacd is running without su rights?

josemic commented 4 years ago

@olerem: I have deinstalled the ubuntu can-utils package with "sudo apt remove can-utils". Now when using the latest can-utils from git it seems to work better. Still I have 2 findings:

  1. when the command is called with an & at the end of the line, the command does not work, as the password is not queried. When removing the "&", the password is queried.
  2. when using the jacd -v option the command mostly fails with the message "No such file or directory"
    sudo ./jacd -r 100,80-120 -v -c /tmp/1122334455667781.jacd 1122334455667781 can0
    jacd: ready for can0:1122334455667781: No such file or directory

    Sometimes also "success" is printed, but noting else and no candump output is generated. Also the prompt returns immediately:

sudo ./jacd -r 100,80-120 -v -c /tmp/1122334455667780.jacd 1122334455667780 can0
jacd: ready for can0:1122334455667780: Success

Terminal 1:

sudo ./candump -t d can0
 (012.914322)  can0  18EAFFFE   [3]  00 EE 00
 (000.000000)  can0  18EAFFFE   [3]  00 EE 00
 (001.250182)  can0  18EEFF51   [8]  81 77 66 55 44 33 22 11
 (001.250182)  can0  18EEFF51   [8]  81 77 66 55 44 33 22 11
 (023.529740)  can0  18EAFFFE   [3]  00 EE 00
 (023.529740)  can0  18EAFFFE   [3]  00 EE 00
 (000.000126)  can0  18EEFF51   [8]  81 77 66 55 44 33 22 11
 (000.000126)  can0  18EEFF51   [8]  81 77 66 55 44 33 22 11
 (001.250293)  can0  18EEFF50   [8]  80 77 66 55 44 33 22 11
 (001.250293)  can0  18EEFF50   [8]  80 77 66 55 44 33 22 11

Somehow the address claim seems to be doubled.

Terminal 2: sudo ./jacd -r 100,80-120 -c /tmp/1122334455667781.jacd 1122334455667781 can0

Terminal 3: sudo ./jacd -r 100,80-120 -c /tmp/1122334455667780.jacd 1122334455667780 can0

josemic commented 4 years ago

I have upgraded to kernel 5.4rc8. Still the same errors.

@olerem: Furthermore I have noticed that networking/j1939.rst referes to J1939_PGN_ADDRESS_REQUEST, however "j1939.h" defines only:

define J1939_PGN_REQUEST 0x0ea00

Thus this patch is missing for j1939.h: Link to patch

and networking/j1939.rst needs to be changed as following:

const struct j1939_filter filt[] = {
        {
                .pgn = J1939_PGN_ADDRESS_CLAIMED,
                .pgn_mask = J1939_PGN_PDU1_MAX,
        }, {
-                .pgn = J1939_PGN_ADDRESS_REQUEST, /* remove this line*/
+                .pgn = J1939_PGN_REQUEST,         /* add this line */
                .pgn_mask = J1939_PGN_PDU1_MAX,
        }, {
                .pgn = J1939_PGN_ADDRESS_COMMANDED,
                .pgn_mask = J1939_PGN_MAX,
        },
};
josemic commented 4 years ago

I have upgraded to kernel 5.4. Still the same errors.

olerem commented 4 years ago

@josemic , we have limited resources to fix all found bugs. For now I was able to spend a fix to respond on critical issues reported by google syzbot. Now I need to work on other projects and will be able to spends some time once a week. It is not match, but if you provide effective bug reports, I'll be able to handle them.

Please, separate different issues to different bug reports. I already lost a track of what is actually not working for you. If I answer you that I'm not able to reproduce it, then it is really the case. And I will need a help to reproduce it. Doing regular status report of kernel x.y.z will not help. Only in case I or some one else will write here, this bug was solved by patch X, then it will be time to test it.

If you have time to fix your issue, please do it. It is opensource project and it works only if all who use it participate in fixing bugs and extend functionality.

josemic commented 4 years ago

I have upgraded to kernel 5.4. Still the same errors.

This refers to usage of testj1939.c and especially to you following message.

Hi @josemic,

thank you for testing! I can extract from you report 3 different issues:

1. send to broadcast is currently not supported by testj1939

2. EOMA message was send by kernel with total size set to 0

3. your expected example has no 0xff padding at the end of payload.

So, lets do it one by one:

1. i'll fix probably today.

2. A fix for the kernel bug is already send to the linux-can list. https://www.spinics.net/lists/linux-can/index.html (not jet listed :/ )

3. According to SAE J1939 spec:
   "Each Data Transfer packet (other than the last packet in a transmission sequence) shall include 7 bytes of the original large message. The final packet includes a data field of 8 bytes: those being the sequence number of the packet and at least 1 byte of data related to the Parameter Group and then any remaining unused bytes set to “FF 16 .”
   I assume there is a typo in the expectation example.

I have upgraded to kernel 5.4. Still the same errors: 1., 2. & 3. I expected that you have fixed 1.

Let this bug #159 refer only to the above testj1939.c issues, I have for the other issue created the bugs #173 and #174.

olerem commented 4 years ago

Hi @josemic , this pull request should address some of reported issues: https://github.com/linux-can/can-utils/pull/175

josemic commented 4 years ago

Hi @olerem, I have made a minor editorial comment on the pull request. I have tested olerem:testj1938 and it works as expected on kernel 5.4 ubuntu.

josemic commented 4 years ago

When testing the priorities, I noticed, that for the priorities 0 and 1, the message occurs, that setting the priority is not permitted:

./testj1939 -B -s -p3 vcan0:0x80 :,0x0200

Works ok.

./testj1939 -B -s -p1 vcan0:0x80 :,0x0200
testj1939: set priority 1: Operation not permitted
./testj1939 -B -s -p0 vcan0:0x80 :,0x0200
testj1939: set priority 0: Operation not permitted

I am not aware that the priorities 0 & 1 are not permitted.

marckleinebudde commented 4 years ago

I am not aware that the priorities 0 & 1 are not permitted.

Priorities 0 and 1 are only permitted for root or users with capable(CAP_NET_ADMIN), see:

https://elixir.bootlin.com/linux/latest/source/net/can/j1939/socket.c#L705

I think this limit is arbitrary and we can discuss if this makes sense at all or add this to the documentation.

josemic commented 4 years ago

@marckleinebudde : At least in the ISOBUS standard I am not aware the the priorities 0 an 1 require special handling compared to the other priorities 2-7. I am not aware, if this is the case for the other J1939 based standards. However if this is not the case, I recommend to remove the root requirements and capabilities requirement CAP_NET_ADMIN for the priorities 0 and 1. As I am not familiar with the usage of capabilities, could you please explain how to set the capabilities CAP_NET_ADMIN for the command line or any userspace program.

marckleinebudde commented 4 years ago

https://unix.stackexchange.com/questions/389879/how-to-set-capabilities-with-setcap-command

marckleinebudde commented 4 years ago

On github a question[1] about j1939 priorities popped up:

On 12/2/19 11:53 PM, josemic wrote:

At least in the ISOBUS standard I am not aware the the priorities 0 an 1 require special handling compared to the other priorities 2-7. I am not aware, if this is the case for the other J1939 based standards. However if this is not the case, I recommend to remove the root requirements and capabilities requirement CAP_NET_ADMIN for the priorities 0 and 1. As I am not familiar with the usage of capabilities, could you please explain how to set the capabilities CAP_NET_ADMIN for the command line or any userspace program. [1] https://github.com/linux-can/can-utils/issues/159#issuecomment-560816311

Marc

josemic commented 4 years ago

I have retested with Ubuntu 20.04, the procedures described in can-j1939-kickstart.md. They are working now. Tested with the following kernel:

uname -a
Linux michael-ThinkPad-T430 5.4.0-33-lowlatency #37-Ubuntu SMP PREEMPT Thu May 21 14:16:07 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
josemic commented 4 years ago

I have created a new issue for the priorities root requirements with #217. As this was the last topic with this issue #159, we can close #159.