Open eden-desta opened 3 years ago
Is it possible to check on which line this is happening?
It looks like it is failing at this point in main:
RCCHECK(rclc_support_init_with_options(&support, 0, NULL, &init_options, &allocator));
I ran it in chunks, essentially ensuring that each line didn't cause the stack dump. This is the location of the failure.
Which board are you using? Does micro-ROS output anything on the serial port in use?
Are you doing this configuration for your board?
kconfig-tweak --enable CONFIG_MICROROSLIB
kconfig-tweak --enable CONFIG_SERIAL_TERMIOS
kconfig-tweak --enable CONFIG_STM32L4_UART4
kconfig-tweak --enable CONFIG_STM32L4_UART4_SERIALDRIVER
kconfig-tweak --enable CONFIG_UART4_SERIALDRIVER
kconfig-tweak --set-val CONFIG_UART4_RXBUFSIZE 256
kconfig-tweak --set-val CONFIG_UART4_TXBUFSIZE 256
kconfig-tweak --set-val CONFIG_UART4_BAUD 115200
kconfig-tweak --set-val CONFIG_UART4_BITS 8
kconfig-tweak --set-val CONFIG_UART4_PARITY 0
kconfig-tweak --set-val CONFIG_UART4_2STOP 0
So I do have all those configurations, @pablogs9. The only difference is that I replaced L4 with H7 and UART4 with USART3.
And what should I be replacing /dev/ttyS1 with? My STM board is currently registered as /dev/ttyACM0.
ls /dev produced ttyS0, so I replaced it with that, and it looks like it is getting past the initial failure point. Rebuilding and rerunning to make sure everything is good!
I will run the agent as well:
sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyACM0 -v6
and see if any errors occur! Thank you @pablogs9
@pablogs9 Is it a problem that I am flashing over USART3 and also using it to communicate with the agent?
Because this output that you see:
NuttShell (NSH) NuttX-10.1.0-RC1
nsh> microros /dev/ttyS1
micro-ROS transport: connected using serial mode, dev: '/dev/ttyS1'
INFO: rcl_wait timeout 0 ms
Sent: 0
Sent: 1
Sent: 2
Does not show up on my end. Instead this does:
nsh> microros
micro-ROS transport: connected using serial mode, dev: '/dev/ttyS0'
���~�XRCEvF ~�XRCEvF
I have seen this before when running Zephyr or FreeRTOS while the board was trying to communicate with the agent. That would be fine if the agent produced more output than:
eden@eden-stealth:~$ sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyACM0 -v6
[1626880465.989759] info | TermiosAgentLinux.cpp | init | running... | fd: 3
[1626880465.989964] info | Root.cpp | set_verbose_level | logger setup | verbose_level: 6
or when I run ros2 topic list, it only returns:
/parameter_events
/rosout
PS: If you deem this a separate issue, I can definitely open a new ticket instead.
So this definitely suggests there is still a problem at RCCHECK(rclc_support_init_with_options(&support, 0, NULL, &init_options, &allocator));
because, from what I can make out of the garbled output, it reports a problem at line 47, which happens to be that call.
I changed the serial channel just to see the full message: Failed status on line 47: 1. Aborting.
This comes from the RCCHECK macro defined on line 14.
nsh> microros
micro-ROS transport: connected using serial mode, dev: '/dev/ttyS1'
Failed status on line 47: 1. Aborting.
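For readers following along, here is a minimal sketch of how a check macro of this kind produces the "Failed status on line N: S. Aborting." message. It is not the exact macro from the example app: DEMO_RCCHECK, demo_ret_t, demo_support_init and demo_app are hypothetical stand-ins so that the snippet compiles without the micro-ROS headers. In rcl, a status of 1 is the generic RCL_RET_ERROR, and a failure at the support init step is commonly seen when the client cannot reach the agent, although that is an assumption about this particular setup.

```c
#include <stdio.h>

/* Stand-in for rcl_ret_t so the sketch builds without micro-ROS headers. */
typedef int demo_ret_t;
#define DEMO_RET_OK 0

/* Hypothetical equivalent of RCCHECK: evaluate the call once, and on any
 * non-OK status print the source line and the status, then leave the
 * enclosing function with return code 1. */
#define DEMO_RCCHECK(fn)                                          \
  do {                                                            \
    demo_ret_t temp_rc = (fn);                                    \
    if (temp_rc != DEMO_RET_OK) {                                 \
      printf("Failed status on line %d: %d. Aborting.\n",         \
             __LINE__, (int)temp_rc);                             \
      return 1;                                                   \
    }                                                             \
  } while (0)

/* Hypothetical init step that simply returns the status it is given. */
static demo_ret_t demo_support_init(demo_ret_t simulated_status)
{
  return simulated_status;
}

/* Returns 0 if the init step succeeded, 1 if DEMO_RCCHECK aborted. */
int demo_app(demo_ret_t simulated_status)
{
  DEMO_RCCHECK(demo_support_init(simulated_status));
  printf("init ok\n");
  return 0;
}
```

Calling demo_app(1) takes the abort path and prints the same kind of message as above, with the line number of the DEMO_RCCHECK call; demo_app(0) prints "init ok".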
Additionally, after changing the serial channel, I used a serial-to-USB adapter to bring that port into my agent PC and ran the agent on it with:
sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyUSB0 -v6
I still only receive what I mentioned above:
[1626894085.415025] info | TermiosAgentLinux.cpp | init | running... | fd: 3
[1626894085.415299] info | Root.cpp | set_verbose_level | logger setup | verbose_level: 6
For more information: I configured USART6 the same way I had configured USART3, using the kconfig-tweak commands you recommended above.
Your micro-ROS client is trying to connect to the micro-ROS Agent with the ���~�XRCEvF messages... Is it possible for you to connect this port to the micro-ROS Agent? It should work; it seems that you have the same port for the console and for micro-ROS.
Hi @pablogs9, I did try to connect the uROS Agent to that port, but I did not get the expected behaviour:
eden@eden-stealth:~$ sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyACM0 -v6
[1626880465.989759] info | TermiosAgentLinux.cpp | init | running... | fd: 3
[1626880465.989964] info | Root.cpp | set_verbose_level | logger setup | verbose_level: 6
It hangs at this point and does not move forward. I also tried running the agent through micro_ros_setup instead of the docker image, and it behaved the same way. My attempt to use another port was to see whether the board was getting confused by the agent trying to communicate with it while the nsh terminal was also up. The behaviour was the same in both cases.
Is there potentially a board dependency associated with the agent? I don't remember there being one, but I could be mistaken. For example, having to specify which terminal we are using in one of the config files rather than on the command line.
In general, your board should have one serial port for NSH and another for micro-ROS. Does your board have two serial ports?
Once you have configured two serial ports, make sure that you have access to NSH on one of them, run microros [the other serial port], and check whether you see binary traces (���~�XRCEvF) on the other serial port.
Are you at this point?
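As a rough aid for the check described above, the following sketch scans captured bytes for the two markers visible in the trace: the '~' byte (0x7E) and the ASCII text XRCE. It rests on two assumptions: that 0x7E is the frame begin flag of the Micro XRCE-DDS serial framing, and that XRCE is the cookie carried by the client's session creation message. The function name looks_like_xrce_traffic and the constant FRAME_BEGIN_FLAG are hypothetical, and this is a heuristic, not a protocol parser.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed frame begin flag of the serial framing ('~' in a terminal). */
#define FRAME_BEGIN_FLAG 0x7E

/* Returns true if the buffer contains the begin flag followed, at any later
 * position, by the ASCII cookie "XRCE". Plain console text such as an NSH
 * prompt does not match. */
bool looks_like_xrce_traffic(const uint8_t *buf, size_t len)
{
  static const uint8_t cookie[4] = {'X', 'R', 'C', 'E'};
  bool seen_flag = false;

  for (size_t i = 0; i < len; i++) {
    if (buf[i] == FRAME_BEGIN_FLAG) {
      seen_flag = true;
    }
    if (seen_flag && i + sizeof(cookie) <= len &&
        memcmp(&buf[i], cookie, sizeof(cookie)) == 0) {
      return true;
    }
  }
  return false;
}
```

If bytes captured from the micro-ROS port (for example with minicom's capture feature) match, the client is at least emitting connection attempts, and the remaining problem is more likely on the path to the agent.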
Yes, I did do this. I used USART3 as the serial port for nsh and USART6 as the serial port for uROS. I used a USB-to-serial adapter to connect the pins on my board directly to the PC running the agent. The behaviour I mentioned above, with the agent not getting past the first two lines, still occurs, and this message is printed: Failed status on line 47: 1. Aborting.
Line 47 for me is RCCHECK(rclc_support_init_with_options(&support, 0, NULL, &init_options, &allocator));
OK, if you open your uROS port using minicom or miniterm instead of the agent, do you see anything being written?
So I opened ttyUSB0, which happens to be my port, and simultaneously ran sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyUSB0 -v6
I opened minicom with minicom -b 115200 -D /dev/ttyUSB0 and see no values being printed.
Don't open the same port in two processes simultaneously. Open it only in minicom, then launch microros in NSH.
Are you sure that ttyUSB0 is your port? I usually look in /dev/serial/by-id/... to avoid selecting the wrong serial port.
Yeah, I used dmesg to make sure, but for peace of mind I just used /dev/serial/by-id/usb-Prolific_Technology_Inc._USB-Serial_Controller-if00-port0 and the behaviour is the same.
I tested it with another adapter as well, just in case.
My USART6 configuration and my overall serial driver support
I just toggled on Enable standard "upper-half" serial driver, as it seems potentially relevant. Unfortunately, the same behaviour is occurring.
For the sake of sanity (this is from the board.h file):
/* USART6 (Arduino Serial Shield) */
#define GPIO_USART6_RX GPIO_USART6_RX_2 /* PG9 */
#define GPIO_USART6_TX GPIO_USART6_TX_2 /* PG14 */
PG9 happens to be on CN10 pin 16.
PG14 happens to be on CN10 pin 14.
Just triple-checked my wiring. My RS-232 adapter has pins 2 (RX), 3 (TX) and 5 (GND) going to the board.
RX is connected to PG9
TX is connected to PG14
GND is connected to GND
I just swapped RX and TX on the RS-232 adapter. Same behaviour noted.
So hahahah, I switched UART ports again and used UART4. It started sending serial messages!!!!
However, the stack dump still persists, and this time it looks like the failure is occurring on node creation on line 51: RCCHECK(rclc_node_init_default(&node, "int32_publisher_rclc", "", &support));
micro-ROS transport: connected using serial mode, dev: '/dev/ttyS1'
arm_hardfault: PANIC!!! Hard fault: 40000000
up_assert: Assertion failed at file:armv7-m/arm_hardfault.c line: 135
up_registerdump: R0: 00000000 0803032c 00000000 00000000 38002f90 38002eb0 00000000 38002990
up_registerdump: R8: 00000000 00000000 00000000 00000000 38002108 38002990 0802740b 08014c1c
up_registerdump: xPSR: 21000000 PRIMASK: 00000000 CONTROL: 00000000
up_registerdump: EXC_RETURN: ffffffe9
up_dumpstate: sp: 38002808
up_dumpstate: stack base: 380014d0
up_dumpstate: stack size: 00001758
up_stackdump: 38002800: 38002808 380028bc 38002eb0 00000000 38002818 080084bd 00001758 380014d0
up_stackdump: 38002820: 38002808 380010c0 38002838 0800856b 00000087 00000000 00000087 0802cdc8
up_stackdump: 38002840: 38002848 08006063 00000087 0802cdc8 38002858 08001f05 00000001 00000000
up_stackdump: 38002860: 380028bc 00000003 38002870 687b0000 08014c1a 380028bc 38002880 080024a9
up_stackdump: 38002880: 380028bc 00000003 38002890 00000003 00000000 08001ea1 380028a0 08001e85
up_stackdump: 380028a0: 380028bc 00000003 380028b0 00000000 38002990 0800176d 00000010 38002990
up_stackdump: 380028c0: 00000000 38002f90 38002eb0 00000000 38002990 00000000 00000000 00000000
up_stackdump: 380028e0: 00000000 ffffffe9 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002900: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002920: 00000000 00000000 00000000 0803032c 00000000 00000000 38002108 0802740b
up_stackdump: 38002940: 08014c1c 21000000 6166e489 3fe90a7e 00000000 00000000 00000000 00000000
up_stackdump: 38002960: 00000000 00000000 00000000 00000000 6166e489 3fe90a7e ffc00000 41dfffff
up_stackdump: 38002980: 6134cf8c 6429f984 00000010 080071a3 0803032c 00000000 38002f90 380029a0
up_stackdump: 380029a0: 0803032c 24000114 380029b0 08013b11 00000014 38003060 380029c0 080273eb
up_stackdump: 380029c0: 0803032c 24000114 380029d0 0801a1f7 0803032c 24000114 38002b00 080273d9
up_stackdump: 380029e0: 38002a00 080184f9 24003fcc 24003ef0 380029f8 08007d9d 000000c8 24003ef0
up_stackdump: 38002a00: 38002ab0 38002f70 24000114 38002eb0 000000d0 00000003 38002f90 00000000
up_stackdump: 38002a20: 24001a00 24001a00 240024e4 00000014 38002a38 38002f40 38002f90 38002eb0
up_stackdump: 38002a40: 38002a60 08016351 38002b14 08016c71 00000000 00000000 00000000 38002a70
up_stackdump: 38002a60: 0802e394 24000114 38002b64 24001160 38002f70 08016c71 00000000 38002b00
up_stackdump: 38002a80: 00000001 00000000 38002b6c 00000000 38002aa0 08016c03 38002ab0 00000000
up_stackdump: 38002aa0: 0802e394 24000114 38002b64 24001160 00000001 0000000a 00000201 00000000
up_stackdump: 38002ac0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002ae0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002b00: 08016c21 08016c3b 08016c53 08016c71 00000000 00000000 24000800 00000000
up_stackdump: 38002b20: 00000000 00000000 38002b38 08016b4f 0802e8d8 00000000 0802e394 24000114
up_stackdump: 38002b40: 38002b64 24001160 38002b58 08015b85 38002bb8 00000000 380014b8 00000001
up_stackdump: 38002b60: 00000000 38002b78 38002df0 38002c30 38002c30 3fdac1fb 38002cd0 00000000
up_stackdump: 38002b80: 00000001 00000000 38002bb8 00000003 00000000 00000000 080244bb 00000000
up_stackdump: 38002ba0: 08016c21 08016c3b 08016c53 08016c71 00000000 38002bb8 08016c21 08016c3b
up_stackdump: 38002bc0: 08016c53 08016c71 00000000 00000000 00000000 00000000 00000000 0802e36c
up_stackdump: 38002be0: 38002c48 00000000 00000000 00000000 38002bf8 0800648d 00000000 380014b8
up_stackdump: 38002c00: 00000001 08015a8d 38002c10 08002cff 00000000 380010c0 00000001 00000001
up_stackdump: 38002c20: 00000000 00000000 00000060 80001790 08016c21 08016c3b 08016c53 08016c71
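The header values of the dump above can be turned into rough stack figures. The sketch below assumes that "stack base" is the lowest address of the task stack and that the stack grows downward, which fits the dumped address range ending just past 38002c20. It also assumes that the value printed after "Hard fault:" is the Cortex-M HardFault Status Register, where bit 30 (FORCED) indicates a fault that escalated from another fault type. The names stack_usage_t, stack_usage and hardfault_is_forced are hypothetical.

```c
#include <stdint.h>

/* Derived figures for a downward-growing stack. */
typedef struct {
  uint32_t top;        /* highest address: base + size */
  uint32_t used;       /* bytes between sp and top */
  uint32_t free_bytes; /* bytes between base and sp */
} stack_usage_t;

/* Compute stack figures from the sp / stack base / stack size lines,
 * assuming base is the lowest address of the stack region. */
stack_usage_t stack_usage(uint32_t sp, uint32_t base, uint32_t size)
{
  stack_usage_t u;
  u.top = base + size;
  u.used = u.top - sp;
  u.free_bytes = sp - base;
  return u;
}

/* Nonzero if the FORCED bit (bit 30) is set in a HardFault Status Register
 * value. */
int hardfault_is_forced(uint32_t hfsr)
{
  return (hfsr & (UINT32_C(1) << 30)) != 0;
}
```

With sp = 0x38002808, base = 0x380014d0 and size = 0x1758 (5976 bytes), this gives top = 0x38002c28, 0x420 (1056) bytes in use and 0x1338 (4920) bytes free at the reported sp. These figures describe the moment of the dump only; they say nothing about peak usage before the fault.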
But honestly, it looks like it only really happens when the publisher is being created:
// create publisher
RCCHECK(rclc_publisher_init_default(
  &publisher,
  &node,
  ROSIDL_GET_MSG_TYPE_SUPPORT(std_msgs, msg, Int32),
  "std_msgs_msg_Int32"));
Yeah, I definitely have to say there is something wrong with the publisher, because when I run it without calling the create and fini functions for the publisher, there isn't a stack dump and I can actually see the node using ros2 node list.
Is there a way of checking how much stack is assigned to a thread?
From the menuconfig:
Attaching a copy of my .config file. I will also try to circle back with some NuttX folks to see if this is potentially caused by something else.
This looks like it is occurring because the option Dump stack on assertions is toggled on. I don't know whether that helps explain what is happening, but I toggled it off, and although it no longer throws a stack dump, it doesn't publish messages either.
@pablogs9 is it possible that this could be causing problems? The NuttX guys pointed me towards finding the location of the failure:
rcl_jump_callback_info_t * callbacks = clock->allocator.reallocate(
clock->jump_callbacks, sizeof(rcl_jump_callback_info_t) * clock->num_jump_callbacks,
clock->allocator.state);
It is at this location: /home/eden/nuttx/apps/microros/micro_ros_src/src/rcl/rcl/src/rcl/time.c:45
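One way to narrow down whether this reallocate call itself misbehaves is to route it through an instrumented callback. The sketch below matches the three-argument shape of the call shown above (pointer, size, state) and records how often it is called and whether it ever returns NULL. The names realloc_stats_t and counting_reallocate are hypothetical, and how such a callback is plugged into the allocator passed to the support init is left out here, since that depends on the application.

```c
#include <stddef.h>
#include <stdlib.h>

/* Counters updated on every call; passed in through the state pointer. */
typedef struct {
  size_t calls;     /* number of reallocate calls */
  size_t failures;  /* calls that returned NULL for a nonzero size */
  size_t last_size; /* size requested by the most recent call */
} realloc_stats_t;

/* Reallocate callback with the (pointer, size, state) shape used by the
 * call above. With pointer == NULL, realloc behaves like malloc, which is
 * the case for a list that has not been allocated yet. */
void *counting_reallocate(void *pointer, size_t size, void *state)
{
  realloc_stats_t *stats = (realloc_stats_t *)state;
  void *result = realloc(pointer, size);

  if (stats != NULL) {
    stats->calls++;
    stats->last_size = size;
    if (result == NULL && size != 0) {
      stats->failures++;
    }
  }
  return result;
}
```

If failures stays at zero while the fault still occurs, the heap is probably not exhausted at this call, and a corrupted pointer or a stack overflow becomes the more likely suspect.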
Hello @eden-desta, what's wrong with this reallocate?
This is one of the addresses from the stack dump, @pablogs9.
So I was able to build and such. Now I am on the nsh terminal and get this massive stack dump. I have not started running the agent yet, though I anticipated being able to start it by exiting the port and running the docker command.