micro-ROS / micro_ros_espidf_component

micro-ROS ESP32 IDF component and sample code
Apache License 2.0
275 stars 64 forks source link

BUG: addtwoints_server example, corrupt request data and no response #266

Closed shvass closed 1 month ago

shvass commented 1 month ago

Issue template

Steps to reproduce the issue

esp successfully discovers the agent and registers the services. On calling the service

 ros2 service call /addtwoints example_interfaces/srv/AddTwoInts "{a: 0, b: 0}"

Expected behavior

Actual behavior

The service callback prints

Service request value: 1062829683 + 1.

Additional information

I (26) boot: ESP-IDF v5.2.1-dirty 2nd stage bootloader
I (27) boot: compile time Oct 13 2024 10:27:44
I (27) boot: Multicore bootloader
I (30) boot: chip revision: v0.2
I (34) boot.esp32s3: Boot SPI Speed : 80MHz
I (39) boot.esp32s3: SPI Mode       : DIO
I (44) boot.esp32s3: SPI Flash Size : 2MB
I (48) boot: Enabling RNG early entropy source...
I (54) boot: Partition Table:
I (57) boot: ## Label            Usage          Type ST Offset   Length
I (65) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (72) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (79) boot:  2 factory          factory app      00 00 00010000 00100000
I (87) boot: End of partition table
I (91) esp_image: segment 0: paddr=00010020 vaddr=3c090020 size=23c50h (146512) map
I (126) esp_image: segment 1: paddr=00033c78 vaddr=3fc97e00 size=047d0h ( 18384) load
I (130) esp_image: segment 2: paddr=00038450 vaddr=40374000 size=07bc8h ( 31688) load
I (139) esp_image: segment 3: paddr=00040020 vaddr=42000020 size=8fe5ch (589404) map
I (246) esp_image: segment 4: paddr=000cfe84 vaddr=4037bbc8 size=0c130h ( 49456) load
I (265) boot: Loaded app from partition at offset 0x10000
I (266) boot: Disabling RNG early entropy source...
I (277) cpu_start: Multicore app
I (287) cpu_start: Pro cpu start user code
I (287) cpu_start: cpu freq: 160000000 Hz
I (287) cpu_start: Application information:
I (290) cpu_start: Project name:     test
I (294) cpu_start: App version:      1
I (299) cpu_start: Compile time:     Oct 13 2024 10:26:39
I (305) cpu_start: ELF file SHA256:  88280305b...
I (310) cpu_start: ESP-IDF:          v5.2.1-dirty
I (316) cpu_start: Min chip rev:     v0.0
I (320) cpu_start: Max chip rev:     v0.99 
I (325) cpu_start: Chip rev:         v0.2
I (330) heap_init: Initializing. RAM available for dynamic allocation:
I (337) heap_init: At 3FCA2D18 len 000469F8 (282 KiB): RAM
I (343) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (349) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (355) heap_init: At 600FE010 len 00001FD8 (7 KiB): RTCRAM
I (363) spi_flash: detected chip: gd
I (366) spi_flash: flash io: dio
W (370) spi_flash: Detected size(8192k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (383) sleep: Configure to isolate all GPIO pins in sleep state
I (390) sleep: Enable automatic switching of GPIO sleep configuration
I (397) main_task: Started on CPU0
I (407) main_task: Calling app_main()
I (437) wifi_station_netif: ESP_WIFI_MODE_STA
I (437) pp: pp rom version: e7ae62f
I (437) net80211: net80211 rom version: e7ae62f
I (447) wifi:wifi driver task: 3fcad034, prio:23, stack:6656, core=0
I (457) wifi:wifi firmware version: a9f5b59
I (457) wifi:wifi certification version: v7.0
I (457) wifi:config NVS flash: enabled
I (457) wifi:config nano formating: disabled
I (467) wifi:Init data frame dynamic rx buffer num: 32
I (467) wifi:Init static rx mgmt buffer num: 5
I (467) wifi:Init management short buffer num: 32
I (477) wifi:Init dynamic tx buffer num: 32
I (477) wifi:Init static tx FG buffer num: 2
I (487) wifi:Init static rx buffer size: 1600
I (487) wifi:Init static rx buffer num: 10
I (487) wifi:Init dynamic rx buffer num: 32
I (497) wifi_init: rx ba win: 6
I (497) wifi_init: tcpip mbox: 32
I (507) wifi_init: udp mbox: 6
I (507) wifi_init: tcp mbox: 6
I (507) wifi_init: tcp tx win: 5760
I (517) wifi_init: tcp rx win: 5760
I (517) wifi_init: tcp mss: 1440
I (527) wifi_init: WiFi IRAM OP enabled
I (527) wifi_init: WiFi RX IRAM OP enabled
I (537) phy_init: phy_version 640,cd64a1a,Jan 24 2024,17:28:12
I (577) wifi:mode : sta (74:4d:bd:ab:a5:00)
I (577) wifi:enable tsf
I (587) wifi_station_netif: wifi_init_sta finished.
I (677) wifi:new:<11,2>, old:<1,0>, ap:<255,255>, sta:<11,2>, prof:1
I (677) wifi:state: init -> auth (b0)
I (677) wifi:state: auth -> assoc (0)
I (687) wifi:state: assoc -> run (10)
I (797) wifi:connected with AAP, aid = 3, channel 11, 40D, bssid = c4:71:54:56:f3:88
I (807) wifi:security: WPA2-PSK, phy: bgn, rssi: -64
I (807) wifi:pm start, type: 1

I (807) wifi:dp: 1, bi: 102400, li: 3, scale listen interval from 307200 us to 307200 us
I (817) wifi:set rx beacon pti, rx_bcn_pti: 0, bcn_timeout: 25000, mt_pti: 0, mt_time: 10000
I (847) wifi:<ba-add>idx:0 (ifx:0, c4:71:54:56:f3:88), tid:0, ssn:3, winSize:64
I (977) wifi:AP's beacon interval = 102400 us, DTIM period = 1
I (2327) esp_netif_handlers: sta ip: 10.0.0.104, mask: 255.255.0.0, gw: 10.0.0.3
I (2327) wifi_station_netif: got ip:10.0.0.104
I (2327) wifi_station_netif: connected to ap SSID: HIDDEN password: HIDDEN
I (2337) main_task: Returned from app_main()
Service request value: 1062829683 + 1.
Service request value: 1539074169 + 1.
Service request value: 1507609954 + 1.

Discovered this issue while working on another project with parameter server where underlying services failed.

shvass commented 1 month ago

After debugging for a while, it was discovered that during de-serialization here, there is a 16 bytes of offset required to gather correct data of the message. (The first 16 bytes being some random gibberish, I've no idea about ).

My crude attempt to add a 16 bytes offset during ucdr_init_buffer worked, and data corruption was fixed.

No idea on what a proper fix would be.

pablogs9 commented 1 month ago

Which ROS 2 distro are you using in the computer side?

shvass commented 1 month ago

I've installed ros2-humble on the system. The micro-ros agent is running in a docker container build for humble.

pablogs9 commented 1 month ago

Are you using Fast DDS as middleware in the ROS 2 side?

shvass commented 1 month ago

No, I'm on eclipse-cyclonedds.

pablogs9 commented 1 month ago

Cyclone DDS is not compatible with micro-ROS. Please switch to Fast DDS.

shvass commented 1 month ago

Thank you. It fixed the issue.

Somehow it never affected topics, they were working just fine. Only services were affected.

pablogs9 commented 1 month ago

Hello @shvass, it is a well-known issue that services do not interoperate between Cyclone DDS and Fast DDS even in its ROS 2 versions without micro-ROS. So this issue in micro-ROS is inherited from its ROS 2 "parent" distros. That's why topics are not affected.

pablogs9 commented 1 month ago

For reference: https://github.com/ros2/rmw_cyclonedds/issues/184