ros-industrial / ros2_canopen

CANopen driver framework for ROS2
https://ros-industrial.github.io/ros2_canopen/manual/rolling/
140 stars 60 forks source link

RPDOs assume 32 bit data #106

Closed james-ward closed 1 year ago

james-ward commented 1 year ago

Describe the bug RPDOs referring to 32 bit objects work properly (i.e. this is what is is the test suite). RPDOs referring to 16 bit objects will cause the device container node to crash.

To Reproduce Change the RPDO in the example to use 16 bit objects:

  1. Edit canopen_tests/config/simple/simple.eds a. Change line 260 to refer to 16 bits: 1=40010010 b. Change line 270 to make object 4001 a uint16: DataType=0x0006
  2. Compile, and run the example: ros2 launch canopen_tests proxy_setup.launch.py
  3. Send a CAN message to the PDO in question: cansend vcan0 182#0000
  4. Crash. Log below.

Expected behavior 16 bit object should be updated and ROS message sent to the appropriate topic.

Logs

[device_container_node-3] [INFO] [1683419857.881505138] [proxy_device_2]: Driver booted and ready.
[device_container_node-3] [INFO] [1683419857.883477874] [proxy_device_2]: Starting EMCY Listener
[device_container_node-3] [INFO] [1683419857.884723793] [proxy_device_2]: Starting RPDO Listener
[device_container_node-3] terminate called after throwing an instance of 'lely::canopen::SdoError'
[device_container_node-3]   what():  RpdoRead:02:4001:00: General error (08000000): General error
[ERROR] [device_container_node-3]: process has died [pid 243640, exit code -6, cmd '/home/james/zio/install/canopen_core/lib/canopen_core/device_container_node --ros-args -r __node:=device_container_node -r __ns:=/ --params-file /tmp/launch_params_4gbhgr61 --params-file /tmp/launch_params_7kh1ndm4 --params-file /tmp/launch_params_ozlm3wi9 --params-file /tmp/launch_params_zm6zw54t'].

Setup:

james-ward commented 1 year ago

I can confirm that it is the cast on this line that is causing the problem: https://github.com/ros-industrial/ros2_canopen/blob/9637faf108330f6262a2a23eba1af31c72217716/canopen_base_driver/src/lely_driver_bridge.cpp#L117

If this cast is changed to (uint16_t) the device container does not crash when only 16 bits are sent to the PDO. (This obviously is not the fix, but it does show that the cast is forcing the uint32_t specialisation of the template to be used in the lely can library, which then causes the crash.)

james-ward commented 1 year ago

Something like this stop the crashing, and correctly differentiates between 8, 16 and 32 bit objects, but only receives the first 4 8-bit objects.

void LelyDriverBridge::OnRpdoWrite(uint16_t idx, uint8_t subidx) noexcept
{
  uint32_t data = 0;
  auto type = CODataTypes::CODataUnkown;
  try {
    data = (uint32_t)rpdo_mapped[idx][subidx];
    type = CODataTypes::COData32;
  }
  catch (...) {
    try {
      data = (uint16_t)rpdo_mapped[idx][subidx];
      type = CODataTypes::COData16;
    }
    catch (...) {
      try {
        data = (uint8_t)rpdo_mapped[idx][subidx];
        type = CODataTypes::COData8;
      }
      catch (...) {
        return;
      }
    }
  }
  COData codata = {idx, subidx, data, type};

  //  We do not care so much about missing a message, rather push them through.
  rpdo_queue->push(codata);
}

Sample output for 8-byte messages (8x1 byte, then 2x4 bytes, then 4x2 bytes). Note the missing subindexes 5-8 for 8194:

[device_container_node-1] [INFO] [1683467168.012960538] [plc]: Slave 2: Sent PDO index 8194, subindex 1, data ab
[device_container_node-1] [INFO] [1683467168.013118824] [plc]: Slave 2: Sent PDO index 8194, subindex 2, data cd
[device_container_node-1] [INFO] [1683467168.013148225] [plc]: Slave 2: Sent PDO index 8194, subindex 3, data 12
[device_container_node-1] [INFO] [1683467168.013169274] [plc]: Slave 2: Sent PDO index 8194, subindex 4, data 34
[device_container_node-1] [INFO] [1683467205.074515317] [plc]: Slave 2: Sent PDO index 8209, subindex 1, data 3412cdab
[device_container_node-1] [INFO] [1683467205.074751102] [plc]: Slave 2: Sent PDO index 8209, subindex 2, data cdabcdab
[device_container_node-1] [INFO] [1683467217.675927517] [plc]: Slave 2: Sent PDO index 16384, subindex 1, data cdab
[device_container_node-1] [INFO] [1683467217.676128283] [plc]: Slave 2: Sent PDO index 16384, subindex 2, data 3412
[device_container_node-1] [INFO] [1683467217.676198781] [plc]: Slave 2: Sent PDO index 16384, subindex 3, data cdab
[device_container_node-1] [INFO] [1683467217.676256389] [plc]: Slave 2: Sent PDO index 16384, subindex 4, data cdab
hellantos commented 1 year ago

Hi, looks like a possible fix. One problem is that currently the proxy driver does not have access to information about object types. The 8 byte issue I have not seen yet but it seems to be an issue with lely core lib. Currently, I am away from work and don't know if I can investigate in the next 2 weeks.

james-ward commented 1 year ago

One problem is that currently the proxy driver does not have access to information about object types.

Yes, that's why I used the try-catch approach. It's not very elegant, but I couldn't think of another way that didn't involve major reworking of the underlying architecture of the proxy driver.

hellantos commented 1 year ago

There seems is a function called Type (https://lely_industries.gitlab.io/lely-core/doxygen/classlely_1_1canopen_1_1Device.html#a7cfca8df5dc8f11957a7ecf55268697f) that returns the type of an object in the dictionary. We'll need to expose it in the bridge. Then we could use it here.

hellantos commented 1 year ago

@james-ward this should be resolved now. #110 is merged.

james-ward commented 1 year ago

Brilliant. Thanks!