Closed fred314159265 closed 5 months ago
I'll need to look into the recovery mechanisms currently implemented in the code. To be honest, my testing has only been happy-path up to this point. Usually "No buffer space..." is due to echo frames not being returned via USB. I've seen this in the past where one of the queues is blocked and prevents messages in/out.
I will try to reproduce a similar error mode with a debugger attached to find the root cause.
Thanks for the reply 😀
I tried tweaking a couple of basic things blindly but without a debugger I was mostly stabbing in the dark and didn't get anywhere.
BTW if you want to create the 2.5ms pulse you can just set the TX pin of a CAN transceiver high. They usually have an internal dominant timeout, which will get you the few-ms-long pulse, so you don't need to generate a short pulse with firmware or anything like that.
Thanks again, and good luck!
FYI - for debugging - you can purchase an ST-LINK v3 mini for around $12USD. I think the canablev2 has the debug pins exposed.
Good point, I have had a look and I do have an old STlinkV2 that I think should still work... 🤞
I was going to ask if you had any pointers on setting up debugging on the software side, but I noticed you have already tracked some VSCode config files, so I will see how far I get using them 😁
If you are using st-link and have those drivers installed (I believe they come with the STM32 IDE), the only additional step is to add the cortex-debug vscode extension.
I have now tried the exact commands you sent. I'm not sure if it's the HW/FW or something with the Linux driver. As soon as I short the bus I see the same errors, and when I try TX'ing messages from the other tool I am using I get error frames. It seems that shorting the bus is killing the CAN driver somewhere. I'll continue to dig.
OK - I think I found it. The FDCAN handler in STM32 does not have an auto recovery mechanism. It's up to the developer to handle resetting after a bus off event. I added some test code as detailed on this post: https://community.st.com/t5/stm32-mcus-products/stm32h7-fdcan-has-lost-the-automatic-bus-off-recovery-mechanism/td-p/187400
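For anyone following along, here is a minimal host-buildable sketch of the kind of manual bus-off recovery described above. It is not the code that went into the firmware: the register struct, macro names and function name are hypothetical stand-ins, and the bit positions are assumed from the Bosch M_CAN layout (on real hardware, check the reference manual and use the device header definitions instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the FDCAN register block so the sketch builds
 * on a host. On the MCU these would be the real peripheral registers. */
typedef struct {
    volatile uint32_t CCCR; /* CC control register */
    volatile uint32_t PSR;  /* protocol status register */
} mock_fdcan_regs_t;

/* Assumed bit positions (M_CAN layout); verify against the reference manual. */
#define MOCK_CCCR_INIT (1u << 0) /* set by hardware when the node goes bus-off */
#define MOCK_PSR_BO    (1u << 7) /* bus-off status flag */

/* Poll this from the main loop or an error interrupt.
 * Returns true if bus-off was detected and a restart was requested. */
bool fdcan_recover_from_bus_off(mock_fdcan_regs_t *regs)
{
    assert(regs != NULL);

    if ((regs->PSR & MOCK_PSR_BO) == 0u) {
        return false; /* not bus-off, nothing to do */
    }

    /* Clearing INIT asks the controller to rejoin the bus; the hardware
     * then waits for the bus-idle recovery sequence before transmitting. */
    regs->CCCR &= ~MOCK_CCCR_INIT;
    return true;
}
```

The design choice here is to leave recovery to an explicit software step, which matches the point above that the peripheral will not restart on its own after bus-off.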
RX seems resolved, but I'm still seeing "No buffer space available" and "No such device".
I found the error: I was not writing the channel number to the error response. I created a temp variable for building the frame but never wrote the channel into it, so, just like I've yelled at other developers for doing in the past, I returned whatever was on the stack.
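To illustrate the class of bug described above, here is a small sketch of building an error frame in a local variable without leaking stack contents. The struct layout, names and the echo marker are hypothetical and are not taken from the firmware source; the only real-world value is the error flag, which mirrors CAN_ERR_FLAG from linux/can.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host frame layout, for illustration only. */
typedef struct {
    uint32_t echo_id;
    uint32_t can_id;
    uint8_t  can_dlc;
    uint8_t  channel;
    uint8_t  flags;
    uint8_t  reserved;
    uint8_t  data[8];
} demo_host_frame_t;

#define DEMO_ERR_FLAG   0x20000000u /* same value as CAN_ERR_FLAG in linux/can.h */
#define DEMO_NO_ECHO_ID 0xFFFFFFFFu /* hypothetical "not an echo" marker */

/* Fill an error frame. Every field is either zeroed or set explicitly,
 * so nothing left over on the stack can reach the host. */
void demo_build_error_frame(demo_host_frame_t *out, uint8_t channel,
                            uint32_t err_bits)
{
    assert(out != NULL);

    memset(out, 0, sizeof(*out));           /* clear stale stack contents */
    out->echo_id = DEMO_NO_ECHO_ID;
    out->can_id  = DEMO_ERR_FLAG | err_bits;
    out->can_dlc = 8;
    out->channel = channel;                 /* the field that was being skipped */
}
```

Zeroing the whole struct first means a field that is forgotten later reads as 0 rather than as random data, which makes this kind of mistake much easier to spot.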
Will create a bug and push to mainline.
Great work on finding this issue!
You're a legend; thank you very much for the fix! 😁
Let me know if it solves the issue you've been having and if you find anything else!
First of all, thank you to all those who have contributed to this project, I am a fan!
I am using the canablev2 firmware build of this project and while it initially appears to work great, I am seeing that when there is a long (~2.5ms) dominant pulse on the bus from an erroneous node, the firmware appears to be acting a bit strange.
Steps
Plug in CANable V2 interface with budgetcan_fw loaded.
Bring up interface:
sudo ip link set dev can0 type can bitrate 500000 fd on dbitrate 2000000
sudo ip link set dev can0 up
Check interface status with:
ip -details link show can0
Send frame repeatedly on bus:
watch -n 0.1 'cansend can0 123#0011223344556677'
I confirm frames are being sent correctly with a logic analyser on the bus. (I am using another canablev2 to ACK the frames.)
I use an erroneous node to produce a 2.5ms dominant pulse on the bus.
Immediately after that, the canablev2 fails to send any more frames, but the state shown with
ip -details link show can0
remains as state ERROR-ACTIVE.
Checking the interface status with
ip -details link show can0
shows no difference from the above snippet.
If I reset the interface using these commands, I get an error after the last one:
sudo ip link set dev can0 down
sudo ip link set dev can0 type can bitrate 500000 fd on dbitrate 2000000
sudo ip link set dev can0 up
RTNETLINK answers: No such device
I have confirmed that running these commands before the issue has been seen works correctly: there is no error and the interface appears to work fine.
I have also tried resetting the USB device in software using what I believe is the same as USBDEVFS_RESET. After that, it sends a further 10 frames before failing to send any more. After a further 10 cansend requests without any frames being sent, the command returns:
write: No buffer space available
If I reset the device by unplugging and re-plugging it, then the issue is fixed after bringing the interface back up, etc. (But it comes back after the next erroneous 2.5ms pulse.)
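For reference, a software USB reset of the kind mentioned above can be sketched on Linux roughly as follows. This is an assumption about what such a reset looks like, not the exact tool used here; the function name is hypothetical, and the device node path passed in (something of the form /dev/bus/usb/BBB/DDD) depends on where the adapter enumerates.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/usbdevice_fs.h>

/* Ask the kernel to reset the USB device behind the given usbfs node.
 * Returns 0 on success, -1 on failure with errno set. */
int usb_reset_device(const char *devnode)
{
    assert(devnode != NULL);

    int fd = open(devnode, O_WRONLY);
    if (fd < 0) {
        return -1; /* node missing or insufficient permissions */
    }

    int rc = ioctl(fd, USBDEVFS_RESET, 0);
    int saved_errno = errno; /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;

    return (rc < 0) ? -1 : 0;
}
```

This usually needs root (or a suitable udev rule), and the interface has to be brought up again afterwards because the device re-enumerates.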