micro-ROS / micro_ros_nuttx_app

A standalone micro-ROS app for NuttX

arm_hardfault: PANIC!! #17

Open eden-desta opened 3 years ago

eden-desta commented 3 years ago

I was able to build. Now, on the nsh terminal, running microros produces this massive stack dump:

nsh> microros
arm_hardfault: PANIC!!! Hard fault: 40000000
up_assert: Assertion failed at file:armv7-m/arm_hardfault.c line: 135
up_registerdump: R0: 00000010 38002810 00000010 00000000 24002798 00000010 0000000a 38002878
up_registerdump: R8: 00000000 00000000 00000000 00000000 0000000c 38002800 0801faf3 0801fb0a
up_registerdump: xPSR: 21000000 PRIMASK: 00000000 CONTROL: 00000000
up_registerdump: EXC_RETURN: ffffffe9
up_dumpstate: sp:         38002678
up_dumpstate: stack base: 38001450
up_dumpstate: stack size: 00001758
up_stackdump: 38002660: 38002ba8 38002678 38002660 38002660 38002678 3800272c 00000010 0000000a
up_stackdump: 38002680: 38002688 08008639 00001758 38001450 38002678 38001040 380026a8 08008705
up_stackdump: 380026a0: 00000087 00000000 00000087 080327b4 380026b8 080061db 00000087 080327b4
up_stackdump: 380026c0: 380026c8 08001efd deadbeef 00000000 3800272c 00000003 00000001 a9040000
up_stackdump: 380026e0: 0801fb08 3800272c 380026f0 08002621 3800272c 00000003 38002700 00000003
up_stackdump: 38002700: 00000000 08001e99 38002710 08001e77 3800272c 00000003 deadbeef 00000000
up_stackdump: 38002720: 38002878 08001759 deadbeef 38002800 00000000 24002798 00000010 0000000a
up_stackdump: 38002740: 38002878 00000000 00000000 00000000 00000000 ffffffe9 00000000 00000000
up_stackdump: 38002760: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002780: 00000000 00000000 00000000 00000000 00000000 00000000 00000010 38002810
up_stackdump: 380027a0: 00000010 00000000 0000000c 0801faf3 0801fb0a 21000000 7951843f 3fabda49
up_stackdump: 380027c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 380027e0: 7951843f 3fabda49 ffc00000 41dfffff 7919cfac 06f6925e 00000010 0801faed
up_stackdump: 38002800: 00000008 24003e90 38002cc8 00000000 00000000 00000000 00040103 feff0200
up_stackdump: 38002820: 38002810 38002820 38002820 00000000 00000010 38000101 00000000 00000000
up_stackdump: 38002840: 38002848 08007919 00000000 38002cd0 38002858 080193fb 00000000 38002cd0
up_stackdump: 38002860: 38002868 380028cc 38002c60 00000000 38002878 0801ae71 08019403 38002c78
up_stackdump: 38002880: 00000000 00000000 00000000 00000000 380028a0 080265dd 00000005 38002d30
up_stackdump: 380028a0: 00000000 00000000 38002d30 240024f8 00000000 38002ad8 24002450 080193d1
up_stackdump: 380028c0: 080193eb 08019403 08019421 00000000 00000000 00000000 ffffffff 38002d30
up_stackdump: 380028e0: 00000000 38002ae0 38002c78 38002ce8 38002c50 00000000 380029a4 38002c60
up_stackdump: 38002900: 38002918 00000000 00000000 08026d39 00000005 3800299c 00000000 00000000
up_stackdump: 38002920: 00000000 08032663 00000000 00000000 00000000 ffffffff 38002800 00000000
up_stackdump: 38002940: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002960: 00000000 00000000 38002ad8 38002ad0 00000000 00000000 00000001 00000000
up_stackdump: 38002980: 00000000 240009c8 00000000 00000005 00000000 080193d1 080193eb 08019403
up_stackdump: 380029a0: 08019421 00000000 380029b0 00000001 38002cf8 38002d00 38002cf4 38002ae0
up_stackdump: 380029c0: 00000001 00000000 00000000 00000000 deadbeef 380029f8 00000000 00000001
up_stackdump: 380029e0: 38002ad8 00000000 380029f8 00000000 00000000 08019085 00000000 00000000
up_stackdump: 38002a00: 00000000 00000000 38002acc 00000000 00000000 38002ad0 08035dc0 00000000
up_stackdump: 38002a20: 38002bc8 38002acc 38002a38 08015d39 38002b18 00000000 38001438 00000001
up_stackdump: 38002a40: deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef
up_stackdump: 38002a60: deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef
up_stackdump: 38002a80: deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef
up_stackdump: 38002aa0: deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef deadbeef
up_stackdump: 38002ac0: deadbeef deadbeef deadbeef 38002bb0 38002bb0 080105df 38002c50 00000000
up_stackdump: 38002ae0: 00000000 00000000 38003070 00000001 38003070 38002c50 00000001 38003070
up_stackdump: 38002b00: 38002b53 00000004 38002bc8 38002c50 00000001 08011421 080193d1 080193eb
up_stackdump: 38002b20: 08019403 08019421 00000000 0801ec5b 380030f0 0801ec33 38002f80 0800faf5
up_stackdump: 38002b40: 00000000 00000000 00000000 00000000 00000000 38002bc8 00000000 08033d2c
up_stackdump: 38002b60: 00000000 00000000 38002b78 00000000 00000000 08006605 00000000 38001438
up_stackdump: 38002b80: 00000001 08015c75 38002b90 08002e77 00000000 38001040 00000001 00000001
up_stackdump: 38002ba0: 00000000 00000000 00000060 80001790 080193d1 080193eb 08019403 08019421
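
For reference when reading the first line of the dump: on ARMv7-M, the HardFault Status Register (HFSR) has three cause bits, VECTTBL (bit 1), FORCED (bit 30) and DEBUGEVT (bit 31). The sketch below decodes such a value. It assumes that the number NuttX prints after "Hard fault:" is the HFSR; the function names are hypothetical and not part of NuttX.

```c
#include <stdint.h>

/* HFSR cause bits as defined by the ARMv7-M architecture. */
#define HFSR_VECTTBL  (1u << 1)   /* fault while reading the vector table */
#define HFSR_FORCED   (1u << 30)  /* escalated from a configurable fault  */
#define HFSR_DEBUGEVT (1u << 31)  /* debug event                           */

/* Returns 1 when the hard fault was escalated from another fault
 * (MemManage, BusFault or UsageFault), 0 otherwise. */
int hfsr_is_forced(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0;
}

/* Returns a short description of the first cause bit that is set. */
const char *hfsr_cause(uint32_t hfsr)
{
    if (hfsr & HFSR_FORCED) {
        return "FORCED: escalated configurable fault";
    }
    if (hfsr & HFSR_VECTTBL) {
        return "VECTTBL: vector table read error";
    }
    if (hfsr & HFSR_DEBUGEVT) {
        return "DEBUGEVT: debug event";
    }
    return "no cause bit set";
}
```

Under that assumption, 40000000 has only the FORCED bit set, so the underlying fault would be recorded in the Configurable Fault Status Register, which this dump does not print.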

I have not started the agent yet. I expected to be able to start the app, close the serial console, and then run the docker command:

sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyACM0 -v6

pablogs9 commented 3 years ago

Is it possible to check on which line this is happening?

eden-desta commented 3 years ago

It looks like it is failing at this point in main: RCCHECK(rclc_support_init_with_options(&support, 0, NULL, &init_options, &allocator)); I ran the code in chunks, making sure each line didn't cause the stack dump, and this is where it fails.

pablogs9 commented 3 years ago

Which board are you using? Does micro-ROS output anything on the serial port it uses?

pablogs9 commented 3 years ago

Are you doing this configuration for your board?

kconfig-tweak --enable CONFIG_MICROROSLIB
kconfig-tweak --enable CONFIG_SERIAL_TERMIOS
kconfig-tweak --enable CONFIG_STM32L4_UART4
kconfig-tweak --enable CONFIG_STM32L4_UART4_SERIALDRIVER
kconfig-tweak --enable CONFIG_UART4_SERIALDRIVER
kconfig-tweak --set-val CONFIG_UART4_RXBUFSIZE 256
kconfig-tweak --set-val CONFIG_UART4_TXBUFSIZE 256
kconfig-tweak --set-val CONFIG_UART4_BAUD 115200
kconfig-tweak --set-val CONFIG_UART4_BITS 8
kconfig-tweak --set-val CONFIG_UART4_PARITY 0
kconfig-tweak --set-val CONFIG_UART4_2STOP 0
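
The last five settings describe a 115200 baud, 8 data bits, no parity, 1 stop bit link, and whatever opens the port on the PC side has to match. As a rough sketch of the host side, assuming a Linux machine with POSIX termios, the helper below (a hypothetical name, not part of micro-ROS or NuttX) fills a termios structure with the same parameters:

```c
#include <termios.h>

/* Fill *tio with raw-mode 115200 8N1 settings, mirroring the
 * CONFIG_UART4_BAUD / BITS / PARITY / 2STOP values above.
 * Returns 0 on success, -1 if the baud rate could not be set. */
int configure_115200_8n1(struct termios *tio)
{
    /* 8 data bits, no parity, one stop bit. */
    tio->c_cflag &= ~(tcflag_t)(CSIZE | PARENB | CSTOPB);
    tio->c_cflag |= CS8 | CLOCAL | CREAD;

    /* Raw mode: no input/output processing, no echo, no line editing. */
    tio->c_iflag = 0;
    tio->c_oflag = 0;
    tio->c_lflag = 0;

    if (cfsetispeed(tio, B115200) != 0) {
        return -1;
    }
    if (cfsetospeed(tio, B115200) != 0) {
        return -1;
    }
    return 0;
}
```

A caller would typically open the device, read the current settings with tcgetattr, pass them through this helper, and apply them with tcsetattr.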

pablogs9 commented 3 years ago

Have you checked that you are using the correct serial port here?

eden-desta commented 3 years ago

So I do have all those configurations, @pablogs9. The only difference is that I replaced L4 with H7 and UART4 with USART3.

And what should I be replacing /dev/ttyS1 with? My STM board is currently registered as /dev/ttyACM0.

eden-desta commented 3 years ago

ls /dev produced ttyS0 so I replaced it with that and it looks like it is getting past the initial fail point. Rebuilding and rerunning to make sure everything is good!

Will run the agent as well: sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyACM0 -v6

And see if any errors occur! Thank you @pablogs9

eden-desta commented 3 years ago

@pablogs9 Is it a problem that I am flashing over USART3 and also using it to communicate with the agent?

Because this output that you see

NuttShell (NSH) NuttX-10.1.0-RC1
nsh> microros /dev/ttyS1
micro-ROS transport: connected using serial mode, dev: '/dev/ttyS1'
INFO: rcl_wait timeout 0 ms
Sent: 0
Sent: 1
Sent: 2

Does not show up on my end. Instead this does:

nsh> microros
micro-ROS transport: connected using serial mode, dev: '/dev/ttyS0'
���~�XRCEvF                                                        ~�XRCEvF

I have seen this before when running Zephyr or FreeRTOS while the board was trying to communicate with the agent. That would be fine if running the agent produced more output than:

eden@eden-stealth:~$ sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyACM0 -v6
[1626880465.989759] info     | TermiosAgentLinux.cpp | init                     | running...             | fd: 3
[1626880465.989964] info     | Root.cpp           | set_verbose_level        | logger setup           | verbose_level: 6

Or when I run ros2 topic list, it only returns:

/parameter_events
/rosout

PS: if you deem this a separate issue, I can definitely make a new ticket instead.

eden-desta commented 3 years ago

So it definitely suggests there is still a problem at RCCHECK(rclc_support_init_with_options(&support, 0, NULL, &init_options, &allocator)); because from what I can make out of the garbled output, it reports a problem at line 47, which happens to be that call.

I changed the serial channel just to see the full message, Failed status on line 47: 1. Aborting., which comes from the RCCHECK macro defined on line 14.

nsh> microros
micro-ROS transport: connected using serial mode, dev: '/dev/ttyS1'
Failed status on line 47: 1. Aborting.
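
For readers unfamiliar with the message format: it is produced by a check macro that evaluates a call, prints the source line and return code on failure, and returns from the calling function. The sketch below reproduces that behaviour in a self-contained way. The demo_* names and the stub are stand-ins invented for illustration (the real app uses rcl_ret_t and RCL_RET_OK), so treat it as an approximation of the example's macro rather than its exact text.

```c
#include <stdio.h>

/* Stand-ins for rcl_ret_t and RCL_RET_OK so the sketch builds without rcl. */
typedef int demo_ret_t;
#define DEMO_RET_OK 0

/* Evaluate fn once; on failure print the line and code, then abort the caller. */
#define RCCHECK(fn) { demo_ret_t temp_rc = (fn); if (temp_rc != DEMO_RET_OK) { \
    printf("Failed status on line %d: %d. Aborting.\n", __LINE__, (int)temp_rc); \
    return 1; } }

/* Hypothetical stub standing in for rclc_support_init_with_options():
 * returns 1 (a generic error) when no agent answers. */
demo_ret_t demo_support_init(int agent_reachable)
{
    return agent_reachable ? DEMO_RET_OK : 1;
}

/* Mimics the start of the app's main(): stops at the first failing call. */
int demo_app(int agent_reachable)
{
    RCCHECK(demo_support_init(agent_reachable));
    return 0;
}
```

In rcl, the value 1 corresponds to RCL_RET_ERROR, the unspecified error code, so the message by itself says only that the init call failed, not why.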

eden-desta commented 3 years ago

Additionally, after changing the serial channel, I used a serial-to-USB adapter to bring that port into my agent PC and ran the agent on it, using sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyUSB0 -v6. I still only get the output I mentioned above:

[1626894085.415025] info     | TermiosAgentLinux.cpp | init                     | running...             | fd: 3
[1626894085.415299] info     | Root.cpp           | set_verbose_level        | logger setup           | verbose_level: 6

Additionally, for more information, I did reconfigure USART6 similar to what I had done for USART3 with the kconfig-tweaks as you have recommended above

pablogs9 commented 3 years ago

Your micro-ROS client is trying to connect to the micro-ROS Agent with the ���~�XRCEvF messages... Is it possible for you to connect this port to the micro-ROS Agent? It should work, it seems that you have the same port for console and for micro-ROS.
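
One way to confirm that a captured byte stream is micro-ROS client traffic rather than console output is to look for the two markers visible in that garbage. The sketch below assumes that the "~" is the 0x7E frame delimiter of the XRCE-DDS serial framing and that "XRCE" is the cookie carried in the client's connection request; both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed frame delimiter of the serial framing ('~' in ASCII). */
#define FRAME_FLAG 0x7E

/* Count how many frame delimiter bytes appear in the capture. */
size_t count_frame_flags(const uint8_t *buf, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == FRAME_FLAG) {
            count++;
        }
    }
    return count;
}

/* Return 1 if the ASCII cookie "XRCE" occurs anywhere in the capture. */
int contains_xrce_cookie(const uint8_t *buf, size_t len)
{
    static const uint8_t cookie[4] = { 'X', 'R', 'C', 'E' };
    for (size_t i = 0; i + sizeof cookie <= len; i++) {
        if (memcmp(buf + i, cookie, sizeof cookie) == 0) {
            return 1;
        }
    }
    return 0;
}
```

If bytes captured from a port contain the cookie, the client is transmitting on that port, and the remaining question is whether the agent is listening on the other end of the same link.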

eden-desta commented 3 years ago

Hi @pablogs9, I did try to connect the uROS Agent to that port, but I did not get the expected behaviour:

eden@eden-stealth:~$ sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyACM0 -v6
[1626880465.989759] info     | TermiosAgentLinux.cpp | init                     | running...             | fd: 3
[1626880465.989964] info     | Root.cpp           | set_verbose_level        | logger setup           | verbose_level: 6

It hangs at this point and does not move forward. I also tried the agent through micro_ros_setup instead of the docker image and saw the same behaviour. I tried another port to check whether the board was getting confused by the agent communicating with it while the nsh terminal was also up. The behaviour was the same in both cases.

Is there potentially a board dependency associated with the agent? I don't remember there being one, but I could be mistaken. For example, having to specify which terminal we are using in one of the config files rather than on the command line.

pablogs9 commented 3 years ago

In general, your board should have one serial port for NSH and another for micro-ROS. Does your board have two serial ports?

Once you have configured two serial ports, make sure that you have access to NSH in one of them and run microros [the other serial port] and check if you have binary traces (���~�XRCEvF) on the other serial port.

Are you at this point?

eden-desta commented 3 years ago

Yes, I did this. I used USART3 as the serial port for nsh and USART6 as the serial port for uROS. I used a USB-to-serial adapter to connect the board's pins directly to the PC running the agent. The behaviour I mentioned above still occurs: the agent does not get past its first two log lines, and the board prints Failed status on line 47: 1. Aborting. Line 47 for me is RCCHECK(rclc_support_init_with_options(&support, 0, NULL, &init_options, &allocator));

pablogs9 commented 3 years ago

OK, if you open your uROS port using minicom or miniterm instead of the agent, do you see anything being written?

eden-desta commented 3 years ago

I opened ttyUSB0, which is my port, in minicom with minicom -b 115200 -D /dev/ttyUSB0 while simultaneously running sudo docker run -it --rm -v /dev:/dev --privileged --net=host microros/micro-ros-agent:galactic serial --dev /dev/ttyUSB0 -v6, and I see no values being printed.

pablogs9 commented 3 years ago

Don't open the same port simultaneously in two processes, just in minicom and launch microros in NSH.

pablogs9 commented 3 years ago

Are you sure that ttyUSB0 is your port? I usually search in /dev/serial/by-id/... in order to avoid failures selecting the serial port.

eden-desta commented 3 years ago

Yeah, I used dmesg to make sure, but for peace of mind I just used /dev/serial/by-id/usb-Prolific_Technology_Inc._USB-Serial_Controller-if00-port0 and the behaviour is the same.

Tested it with another adapter as well just in case.

eden-desta commented 3 years ago

My usart6 configuration and my overall serial driver support

Screenshot from 2021-07-22 10-40-56

Screenshot from 2021-07-22 10-39-42

eden-desta commented 3 years ago

I just toggled on Enable standard "upper-half" serial driver, as it seems to potentially apply. Unfortunately, the same behaviour is occurring.

eden-desta commented 3 years ago

For the sake of sanity: [this is from the board.h file]

/* USART6 (Arduino Serial Shield) */

#define GPIO_USART6_RX     GPIO_USART6_RX_2  /* PG9 */
#define GPIO_USART6_TX     GPIO_USART6_TX_2  /* PG14 */

PG9 happens to be on CN10 pin 16, and PG14 happens to be on CN10 pin 14.

Just triple checked my wiring. My RS-232 adapter has pins 2 (RX), 3 (TX), 5 (GND) going to the board.

RX is connected to PG9, TX is connected to PG14, and GND is connected to GND.

I just swapped RX and TX on the RS-232 adapter. Same behaviour noted.

eden-desta commented 3 years ago

So hahahah, I switched UART ports again and used UART4. It started sending serial messages!!!! However, the stack dump still persists, and this time it looks like the failure is occurring on node creation on line 51: RCCHECK(rclc_node_init_default(&node, "int32_publisher_rclc", "", &support));

micro-ROS transport: connected using serial mode, dev: '/dev/ttyS1'
arm_hardfault: PANIC!!! Hard fault: 40000000
up_assert: Assertion failed at file:armv7-m/arm_hardfault.c line: 135
up_registerdump: R0: 00000000 0803032c 00000000 00000000 38002f90 38002eb0 00000000 38002990
up_registerdump: R8: 00000000 00000000 00000000 00000000 38002108 38002990 0802740b 08014c1c
up_registerdump: xPSR: 21000000 PRIMASK: 00000000 CONTROL: 00000000
up_registerdump: EXC_RETURN: ffffffe9
up_dumpstate: sp:         38002808
up_dumpstate: stack base: 380014d0
up_dumpstate: stack size: 00001758
up_stackdump: 38002800: 38002808 380028bc 38002eb0 00000000 38002818 080084bd 00001758 380014d0
up_stackdump: 38002820: 38002808 380010c0 38002838 0800856b 00000087 00000000 00000087 0802cdc8
up_stackdump: 38002840: 38002848 08006063 00000087 0802cdc8 38002858 08001f05 00000001 00000000
up_stackdump: 38002860: 380028bc 00000003 38002870 687b0000 08014c1a 380028bc 38002880 080024a9
up_stackdump: 38002880: 380028bc 00000003 38002890 00000003 00000000 08001ea1 380028a0 08001e85
up_stackdump: 380028a0: 380028bc 00000003 380028b0 00000000 38002990 0800176d 00000010 38002990
up_stackdump: 380028c0: 00000000 38002f90 38002eb0 00000000 38002990 00000000 00000000 00000000
up_stackdump: 380028e0: 00000000 ffffffe9 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002900: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002920: 00000000 00000000 00000000 0803032c 00000000 00000000 38002108 0802740b
up_stackdump: 38002940: 08014c1c 21000000 6166e489 3fe90a7e 00000000 00000000 00000000 00000000
up_stackdump: 38002960: 00000000 00000000 00000000 00000000 6166e489 3fe90a7e ffc00000 41dfffff
up_stackdump: 38002980: 6134cf8c 6429f984 00000010 080071a3 0803032c 00000000 38002f90 380029a0
up_stackdump: 380029a0: 0803032c 24000114 380029b0 08013b11 00000014 38003060 380029c0 080273eb
up_stackdump: 380029c0: 0803032c 24000114 380029d0 0801a1f7 0803032c 24000114 38002b00 080273d9
up_stackdump: 380029e0: 38002a00 080184f9 24003fcc 24003ef0 380029f8 08007d9d 000000c8 24003ef0
up_stackdump: 38002a00: 38002ab0 38002f70 24000114 38002eb0 000000d0 00000003 38002f90 00000000
up_stackdump: 38002a20: 24001a00 24001a00 240024e4 00000014 38002a38 38002f40 38002f90 38002eb0
up_stackdump: 38002a40: 38002a60 08016351 38002b14 08016c71 00000000 00000000 00000000 38002a70
up_stackdump: 38002a60: 0802e394 24000114 38002b64 24001160 38002f70 08016c71 00000000 38002b00
up_stackdump: 38002a80: 00000001 00000000 38002b6c 00000000 38002aa0 08016c03 38002ab0 00000000
up_stackdump: 38002aa0: 0802e394 24000114 38002b64 24001160 00000001 0000000a 00000201 00000000
up_stackdump: 38002ac0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002ae0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
up_stackdump: 38002b00: 08016c21 08016c3b 08016c53 08016c71 00000000 00000000 24000800 00000000
up_stackdump: 38002b20: 00000000 00000000 38002b38 08016b4f 0802e8d8 00000000 0802e394 24000114
up_stackdump: 38002b40: 38002b64 24001160 38002b58 08015b85 38002bb8 00000000 380014b8 00000001
up_stackdump: 38002b60: 00000000 38002b78 38002df0 38002c30 38002c30 3fdac1fb 38002cd0 00000000
up_stackdump: 38002b80: 00000001 00000000 38002bb8 00000003 00000000 00000000 080244bb 00000000
up_stackdump: 38002ba0: 08016c21 08016c3b 08016c53 08016c71 00000000 38002bb8 08016c21 08016c3b
up_stackdump: 38002bc0: 08016c53 08016c71 00000000 00000000 00000000 00000000 00000000 0802e36c
up_stackdump: 38002be0: 38002c48 00000000 00000000 00000000 38002bf8 0800648d 00000000 380014b8
up_stackdump: 38002c00: 00000001 08015a8d 38002c10 08002cff 00000000 380010c0 00000001 00000001
up_stackdump: 38002c20: 00000000 00000000 00000060 80001790 08016c21 08016c3b 08016c53 08016c71

But honestly, it looks like it only really happens when the publisher is being created:

// create publisher
    RCCHECK(rclc_publisher_init_default(
        &publisher,
        &node,
        ROSIDL_GET_MSG_TYPE_SUPPORT(std_msgs, msg, Int32),
        "std_msgs_msg_Int32"));

eden-desta commented 3 years ago

Yeah, I definitely have to say there is something wrong with the publisher: if I run the app without calling the create and fini functions for the publisher, there isn't a stack dump and I can actually see the node using ros2 node list.

pablogs9 commented 3 years ago

Is there a way of checking how much stack is assigned to a thread?

eden-desta commented 3 years ago

From the menuconfig: Screenshot from 2021-07-23 08-19-36
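
The dumps earlier in this thread also report the stack allocation directly through the sp, stack base and stack size fields. The sketch below turns those three values into bytes in use and bytes free at the moment of the fault. It assumes that "stack base" is the lowest address of the stack and that the stack grows downward, which is inferred from the dump rows running from sp up to base + size; the type and function names are hypothetical.

```c
#include <stdint.h>

/* Result of interpreting the sp / stack base / stack size fields. */
typedef struct {
    int      sp_in_range; /* 1 if sp lies inside [base, base + size] */
    uint32_t used;        /* bytes between sp and the top of the stack */
    uint32_t free;        /* bytes between the base and sp */
} stack_usage_t;

/* Assumes base is the lowest address and the stack grows toward it. */
stack_usage_t stack_usage(uint32_t sp, uint32_t base, uint32_t size)
{
    stack_usage_t usage = { 0, 0, 0 };
    uint32_t top = base + size;

    if (sp >= base && sp <= top) {
        usage.sp_in_range = 1;
        usage.used = top - sp;
        usage.free = sp - base;
    }
    return usage;
}
```

With the first dump (sp 38002678, base 38001450, size 00001758) this gives 1328 bytes in use out of 5976, and with the UART4 dump (sp 38002808, base 380014d0) 1056 bytes. These are snapshots at the time of the fault, not high-water marks, so deeper calls made earlier may have used more.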

eden-desta commented 3 years ago

Config.txt

Attaching a copy of my .config file. I will also try to circle back with some NuttX folks to see if this is potentially caused by something else.

eden-desta commented 3 years ago

It looks like the dump is occurring because the option Dump stack on assertions is toggled on. I don't know if that helps explain what is happening, but I toggled it off, and although it no longer throws a stack dump, it doesn't publish messages either.

eden-desta commented 3 years ago

@pablogs9 is it possible this could be causing problems? The NuttX guys pointed me to find the location of the failure:

 rcl_jump_callback_info_t * callbacks = clock->allocator.reallocate(
      clock->jump_callbacks, sizeof(rcl_jump_callback_info_t) * clock->num_jump_callbacks,
      clock->allocator.state);

It is in this file: /home/eden/nuttx/apps/microros/micro_ros_src/src/rcl/rcl/src/rcl/time.c:45
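
The snippet calls reallocate through a function pointer stored in the allocator, with the argument order (pointer, size, state). One way a call like this can fault is an allocator whose function pointers were never filled in, although nothing in this thread confirms that this is what happens here. The sketch below shows the call shape with a defensive check; demo_allocator_t and the functions around it are stand-ins invented for illustration, not the rcl types.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for an allocator with a reallocate(pointer, size, state) member. */
typedef struct {
    void *(*reallocate)(void *pointer, size_t size, void *state);
    void *state;
} demo_allocator_t;

/* Default implementation backed by the C library's realloc. */
static void *demo_realloc(void *pointer, size_t size, void *state)
{
    (void)state; /* unused by this implementation */
    return realloc(pointer, size);
}

/* Build an allocator whose function pointer is valid. */
demo_allocator_t demo_default_allocator(void)
{
    demo_allocator_t allocator = { demo_realloc, NULL };
    return allocator;
}

/* Resize an array to count elements, refusing to jump through a NULL pointer.
 * Returns NULL on failure; the old block is left untouched in that case. */
void *demo_grow_array(const demo_allocator_t *allocator, void *old,
                      size_t elem_size, size_t count)
{
    if (allocator == NULL || allocator->reallocate == NULL) {
        return NULL;
    }
    return allocator->reallocate(old, elem_size * count, allocator->state);
}
```

Printing the allocator's function pointers just before the failing line would show whether they point into flash, which is a cheap check before digging further.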

pablogs9 commented 3 years ago

Hello @eden-desta, what's wrong with this reallocate?

eden-desta commented 3 years ago

This is one of the addresses in the stack dump, @pablogs9.
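
For anyone repeating this lookup, a small filter helps pick which words of a stack dump are worth resolving. The sketch below keeps words that fall inside flash and have bit 0 set, as Thumb return addresses do. It assumes flash is mapped at 0x08000000 and is 2 MiB long, which should be adjusted for the actual part; the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed flash window; adjust FLASH_SIZE for the actual device. */
#define FLASH_BASE 0x08000000u
#define FLASH_SIZE 0x00200000u

/* Return 1 if the word looks like a saved Thumb return address:
 * inside flash, with the Thumb bit (bit 0) set. */
int is_candidate_return_address(uint32_t word)
{
    if (word < FLASH_BASE || word >= FLASH_BASE + FLASH_SIZE) {
        return 0;
    }
    return (word & 1u) != 0;
}

/* Clear the Thumb bit to get the address to look up in the ELF file. */
uint32_t lookup_address(uint32_t word)
{
    return word & ~1u;
}
```

Each surviving address can then be resolved against the ELF image that was flashed, for example with arm-none-eabi-addr2line -e nuttx followed by the address. Words such as deadbeef or the 38xxxxxx values are filtered out because they are stack fill patterns and RAM pointers rather than code locations.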