Open anicolle opened 1 year ago
Forgot to say that this happens with firmware flashed from canable website updater (commit 68df7d5 for canable v1, commit ba6b1dd for canable V2). But also with locally build firmware with latest repo commit (commit ac71cb412).
Everything works fine except if our application is not terminated properly (killing it with linux "kill" command or stopping the vscode python debugger) thus bypassing the canopen disconnect step. In this case the Canable seems to still be running as it is still visible on USB but no CAN messages seems to reach our device.
You mean you first start the app, kill the app, and restart the app, then it's not receiving data?
- What's your operating system?
Linux (Ubuntu)
- Can you share an python example to reproduce the problem?
I created a simple example (cf below) that connect, read an sdo and stop without disconnecting. Only the first execution works, after that the canopen library raise a "No SDO response received" exception at each execution. Disconnecting / reconnecting allow another execution to suceed.
import canopen
import usb.core
eds_path = "youredsfilepath.eds"
node_id = 1
sdo_index = 0x1018
sdo_subindex = 1
PRODUCT_ID_CANDLE_LIGHT: int = 0x606F
VENDOR_ID_CANDLE_LIGHT: int = 0x1D50
if __name__ == "__main__":
network = canopen.Network()
node = canopen.RemoteNode(node_id, eds_path)
dev = usb.core.find(
idVendor=VENDOR_ID_CANDLE_LIGHT, idProduct=PRODUCT_ID_CANDLE_LIGHT
)
network.connect(
bustype="gs_usb",
channel=dev.product,
bus=dev.bus,
address=dev.address,
bitrate=1000000,
)
network.add_node(node)
print(node.sdo[sdo_index][sdo_subindex].raw)
- Can you create and share a USB dump with wireshark?
I can't but if needed, some of my colleagues have a setup that could do that
Everything works fine except if our application is not terminated properly (killing it with linux "kill" command or stopping the vscode python debugger) thus bypassing the canopen disconnect step. In this case the Canable seems to still be running as it is still visible on USB but no CAN messages seems to reach our device.
You mean you first start the app, kill the app, and restart the app, then it's not receiving data?
Exactly !
- Can you share an python example to reproduce the problem?
I created a simple example (cf below) that connect, read an sdo and stop without disconnecting. Only the first execution works, after that the canopen library raise a "No SDO response received" exception at each execution. Disconnecting / reconnecting allow another execution to suceed.
Thanks. I don't have a canopen device here, can you create the matching other side, too?
Thanks for your help. Sadly the python canopen package doesn't support slave mode. I am testing a fork that add this feature but it is a work in progress and I am not sure I will suceed to make it work.
If you are on Linux, using the socketcan
interface not the USB one might be an option....
with socketan you can create a 'vcan0' interface, and use https://github.com/CANopenNode/CANopenLinux to create a node that listens on that interface. I only do this about once every 2 years, don't remember the finer details
My working theory is there's a problem with re-init of python-can when using the gs-usb
interface. So a vcan0
wouldn't help here.
So a
vcan0
wouldn't help here.
No indeed, that was just a suggestion to create a test setup without any actual canopen device
Thanks to both of you. I tried to isolate the problem from canopen by running the script below (inspired from Canable Getting Started page) I can run it three times without problems, but it fails at 4th attempt
import can
import usb.core
# Candlelight firmware on Linux
#bus = can.interface.Bus(bustype='socketcan', channel='can0', bitrate=500000)
PRODUCT_ID_CANDLE_LIGHT: int = 0x606F
VENDOR_ID_CANDLE_LIGHT: int = 0x1D50
dev = usb.core.find(
idVendor=VENDOR_ID_CANDLE_LIGHT, idProduct=PRODUCT_ID_CANDLE_LIGHT
)
# # Stock slcan firmware on Linux
bus = can.Bus(bustype="gs_usb",
channel=dev.product,
bus=dev.bus,
address=dev.address,
bitrate=1000000)
msg = can.Message(arbitration_id=0xc0ffee,
data=[0, 25, 0, 1, 3, 1, 4, 1],
is_extended_id=True)
try:
bus.send(msg)
print("Message sent on {}".format(bus.channel_info))
except can.CanError:
print("Message NOT sent")
I've setup your new test script, and it's running in a while ./bug.py; do sleep 1; done
loop. And while writing this it failed, not after 4, but ~40 loops.
I've setup your new test script, and it's running in a
while ./bug.py; do sleep 1; done
loop. And while writing this it failed, not after 4, but ~40 loops.
Interesting, before your comment I was thinking it looked like when you send 3 frames that fill up the hardware FIFO but there is no other device sending an ACK.
Tempted to blame python-can , but the errors you got may point to the device getting into a broken state. Thanks for confirming you are testing the latest build from master.
I would +1 on marc's request for a wireshark trace - wonder what's going on here
Every run of the test script starts with a GET DESCRIPTOR Request STRING
:
Frame 2179: 64 bytes on wire (512 bits), 64 bytes captured (512 bits) on interface usbmon1, id 0
USB URB
[Source: host]
[Destination: 1.23.0]
URB id: 0xffff9bba09d09180
URB type: URB_SUBMIT ('S')
URB transfer type: URB_CONTROL (0x02)
Endpoint: 0x80, Direction: IN
Device: 23
URB bus id: 1
Device setup request: relevant ('\0')
Data: not present ('<')
URB sec: 1683055257
URB usec: 690974
URB status: Operation now in progress (-EINPROGRESS) (-115)
URB length [bytes]: 254
Data length [bytes]: 0
[Response in: 2180]
Interval: 0
Start frame: 0
Copy of Transfer Flags: 0x00000200, Dir IN
Number of ISO descriptors: 0
Setup Data
bmRequestType: 0x80
bRequest: GET DESCRIPTOR (6)
Descriptor Index: 0x00
bDescriptorType: STRING (0x03)
Language Id: no language specified (0x0000)
wLength: 254
and is answered with a GET DESCRIPTOR Response STRING
within ~1ms:
Frame 2180: 68 bytes on wire (544 bits), 68 bytes captured (544 bits) on interface usbmon1, id 0
USB URB
[Source: 1.23.0]
[Destination: host]
URB id: 0xffff9bba09d09180
URB type: URB_COMPLETE ('C')
URB transfer type: URB_CONTROL (0x02)
Endpoint: 0x80, Direction: IN
Device: 23
URB bus id: 1
Device setup request: not relevant ('-')
Data: present ('\0')
URB sec: 1683055257
URB usec: 691097
URB status: Success (0)
URB length [bytes]: 4
Data length [bytes]: 4
[Request in: 2179]
[Time from request: 0.000123000 seconds]
Unused Setup Header
Interval: 0
Start frame: 0
Copy of Transfer Flags: 0x00000200, Dir IN
Number of ISO descriptors: 0
STRING DESCRIPTOR
bLength: 4
bDescriptorType: 0x03 (STRING)
wLANGID: English (United States) (0x0409)
in the bad case, the response times out after ~1s:
Frame 2214: 64 bytes on wire (512 bits), 64 bytes captured (512 bits) on interface usbmon1, id 0
USB URB
[Source: 1.23.0]
[Destination: host]
URB id: 0xffff9bb90d1789c0
URB type: URB_COMPLETE ('C')
URB transfer type: URB_CONTROL (0x02)
Endpoint: 0x80, Direction: IN
Device: 23
URB bus id: 1
Device setup request: not relevant ('-')
Data: present ('\0')
URB sec: 1683055262
URB usec: 253151
URB status: No such file or directory (-ENOENT) (-2)
URB length [bytes]: 0
Data length [bytes]: 0
[Request in: 2213]
[Time from request: 1.000194000 seconds]
Unused Setup Header
Interval: 0
Start frame: 0
Copy of Transfer Flags: 0x00000200, Dir IN
Number of ISO descriptors: 0
Here's a wireshark capture. I'm not sure who triggers the first 3 request/response cycles, it may be wireshark. In the beginning I played with different sleeps, after 34s the test runs with a 3s sleep between the runs. The capture ends with the timed out GET DESCRIPTOR Response STRING
.
Here a cleaner capture, again with 3s sleep between the test program calls. The candlelight is running current master firmware. It fails with the same problem.
I did also notice that running lsusb -v
in "working" (left ) and "failed"(right) states doesn't return the same results.
I can reproduce the lsusb
problem here, too.
dumb question, what dependency am I missing ?
can.exceptions.CanInterfaceNotImplementedError: Cannot import module can.interfaces.gs_usb for CAN interface 'gs_usb': No module named 'gs_usb
Switching instead to the commented line bus = can.interface.Bus(bustype='socketcan', channel='can0', bitrate=1000000)
works with no issues, so whatever python module provides 'gs_usb' is doing something unusual
pip install gs_usb
Switching instead to the commented line
bus = can.interface.Bus(bustype='socketcan', channel='can0', bitrate=1000000)
works with no issues
ACK - I suggested to use socketcan
as a workaround unter Linux, too: https://github.com/candle-usb/candleLight_fw/issues/165#issuecomment-1531411613
so whatever python module provides 'gs_usb' is doing something unusual
The gs_usb
user space driver is requesting GET DESCRIPTOR Request STRING
on every startup, so maybe some resource exhaustion on the firmware side?
gs_usb
thanks. I had only looked for it in my distro's packages.
so maybe some resource exhaustion on the firmware side?
Skimming throught the ST USB stack, I don't see how that could be possible ? there's no dynamic allocation, and the string descriptor requests are handled pretty much like all other control requests. Their handling of the "len" argument is a bit dubious at times (casting to/from *uint16) but I imagine we would be seeing more problems if it was flawed.
hi guys, did you crack the case eventually? :)
No, I currently don't have time to chase that bug.
No, I currently don't have time to chase that bug.
Same here unfortunately.
the next steps would be to hook up a debugger, reproduce the problem, then examine the device state (USB periph registers, etc), and possibly set breakpoints on the get_descriptor functions.
Hi all, First, thank you for this great repository. We are facing a weird problem with our Canable-based tools using candleLight_fw. We have built a small desktop application using python canopen that connect to our custom product over can with a Canable. Everything works fine except if our application is not terminated properly (killing it with linux "kill" command or stopping the vscode python debugger) thus bypassing the canopen disconnect step. In this case the Canable seems to still be running as it is still visible on USB but no CAN messages seems to reach our device. Canopen library debug shows : "Transfer aborted by client with code 0x05040000" Rebooting our Canable (by disconnecting / reconnecting it from USB) fix the problem. Any idea what could be the cause of this ? Any workaround we could use to avoid needing to disconnect / reconnect manually ? Thanks,