v-kiniv / rws

WebSocket gateway for ROS2 topics and services
Apache License 2.0

Odd crashes #20

MoffKalast closed this 1 month ago

MoffKalast commented 8 months ago

While I was testing the Vizanti occupancy grid issues, I hit two odd crashes that might be a result of the recent patches. There's not much to go on, but here are the logs in case they help:

[rws_server-1] [INFO] [1709824461.473167570] [rws::translate]: Field 'sec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824461.473193070] [rws::translate]: Field 'nanosec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824489.367046726] [rws::translate]: Field 'sec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824489.367082067] [rws::translate]: Field 'nanosec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824500.429303172] [rws::translate]: Field 'sec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824500.429338840] [rws::translate]: Field 'nanosec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824544.056496212] [rws::translate]: Field 'sec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824544.056526996] [rws::translate]: Field 'nanosec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824558.377610503] [rws::translate]: Field 'sec' is not in json, default: (nil)
[rws_server-1] [INFO] [1709824558.377641409] [rws::translate]: Field 'nanosec' is not in json, default: (nil)
[ERROR] [rws_server-1]: process has died [pid 135962, exit code -11, cmd '/home/vid/colcon_ws/install/rws/lib/rws/rws_server --ros-args -r __node:=vizanti_rws_server --params-file /tmp/launch_params_ng8p7y2d --params-file /tmp/launch_params_t1imf5c3 --params-file /tmp/launch_params_hlq1qkk5'].
[rws_server-1] terminate called after throwing an instance of 'std::runtime_error'
[rws_server-1]   what():  string data is not null-terminated
[ERROR] [rws_server-1]: process has died [pid 9874, exit code -6, cmd '/home/osboxes/colcon_ws/install/rws/lib/rws/rws_server --ros-args -r __node:=vizanti_rws_server --params-file /tmp/launch_params_l861ncw7 --params-file /tmp/launch_params_jdx9glro --params-file /tmp/launch_params_vthzcqqw'].

I'm not sure what exactly I was doing while it happened, but they were both on cyclonedds in case that's relevant.

v-kiniv commented 8 months ago

How often does that happen? Can you try to reproduce it? I noticed that "Field 'sec' is not in json, default: (nil)" is caused by the 2D Nav Goal tool in Vizanti, but I can't reproduce the crash with it. As for "what(): string data is not null-terminated": can you point me to the part of Vizanti that sends messages containing strings? That could help narrow down the culprit.

MoffKalast commented 8 months ago

It doesn't happen super often and I haven't been able to reliably reproduce it yet, but it did happen again today:

[rws_server-1] [2024-03-09 05:33:22] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[rws_server-1] [2024-03-09 05:33:22] [info] asio async_write error: asio.system:32 (Broken pipe)
[rws_server-1] [2024-03-09 05:33:22] [fatal] handle_write_frame error: websocketpp.transport:2 (Underlying Transport Error)
[rws_server-1] [INFO] [1709980402.070906949] [vizanti_rws_server]: Closing connection with client_id 1
[rws_server-1] [2024-03-09 05:33:22] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[rws_server-1] [INFO] [1709980402.070970871] [client_handler_1]: Destroying client 1(16726299949654779313)
[ERROR] [rws_server-1]: process has died [pid 8522, exit code -11, cmd '/home/osboxes/colcon_ws/install/rws/lib/rws/rws_server --ros-args -r __node:=vizanti_rws_server --params-file /tmp/launch_params_8c6jimp0 --params-file /tmp/launch_params_x59tkrq7 --params-file /tmp/launch_params_ww4v8gu1'].

I'm not sure if the watchdog printouts were related at all, but it did seem to start happening after I turned it on recently. Could just be a correlation but I'll disable it again and see if I still get any.

> caused by 2D Nav Goal tool in Vizanti, but can't reproduce crash with it.

I think I've also seen that happen when the nav_msgs/Path is sent from the route planner, but that may be unrelated and I need to take some time to look into it properly if it's a genuine issue.

> can you point me what part of Vizanti is sending messages with string? Maybe it could help narrow down the culprit.

Afaik there aren't any topic publishers that send strings, it might be one of the services. Practically all of them are some kind of string or string array. Could also be something completely internal.

v-kiniv commented 8 months ago

This may also be specific to Humble; I tested mostly on Galactic and need to test more on Humble. Can you share your setup (OS, arch, network)? Does it crash only in a VM, only on real hardware, or on both?

If you are comfortable with C++ tools, it would be great if you could attach GDB to the RWS node and wait until it crashes. A stack trace would clear things up.

MoffKalast commented 8 months ago

Could be Humble-specific. I think all of these happened on x86_64, Kubuntu 22.04 with Humble: one in VirtualBox, one on a native install. I've also been testing on aarch64 on a Pi 4, but I don't think I've seen any crashes there yet.

> it would be great to attach GDB to RWS node and wait till it crash, stack trace would clear things up.

I'll see if I can set that up.

v-kiniv commented 8 months ago

I wrote a JS script to stress-test the server by creating many unresponsive clients that don't always respond to pings and that terminate without closing the TCP connection gracefully. With it I was able to reproduce the crash with the watchdog enabled. This may not cover all of the cases you've encountered, but it is definitely one of them. Basically, the server was trying to close the TCP connection and dispose of a client that hadn't responded to a ping in time, but whose TCP connection had already been closed just before the 'pong timeout' event.
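
To illustrate the race described above (two code paths trying to close and dispose of the same client), here is a minimal sketch of one way to make disposal idempotent. It is not the change made in the pull request linked below; `Client`, `ClientRegistry`, `add` and `dispose` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a per-connection handler (not the real rws class).
struct Client {
  explicit Client(std::uint16_t client_id) : id(client_id) {}
  std::uint16_t id;
};

// Hypothetical registry in which "close and dispose" can only succeed once.
class ClientRegistry {
 public:
  void add(std::uint16_t id) {
    std::lock_guard<std::mutex> lock(mutex_);
    clients_[id] = std::make_shared<Client>(id);
  }

  // Returns true only for the caller that actually removed the client.
  // A later caller (for example a pong-timeout handler that fires after the
  // close handler already ran) gets false and must not touch the client.
  bool dispose(std::uint16_t id) {
    std::shared_ptr<Client> doomed;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      auto it = clients_.find(id);
      if (it == clients_.end()) return false;
      doomed = std::move(it->second);
      clients_.erase(it);
    }
    // `doomed` is released when this function returns, after the lock is gone.
    assert(doomed->id == id);
    return true;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return clients_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::uint16_t, std::shared_ptr<Client>> clients_;
};
```

With this shape, both the close handler and the timeout handler can call `dispose(id)` unconditionally; whichever runs second sees `false` and returns without acting.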

Please check https://github.com/v-kiniv/rws/pull/23

MoffKalast commented 7 months ago

I haven't done much more testing with ROS 2 these recent weeks, but so far I haven't seen any more crashes. It might have genuinely been just the watchdog and fully fixed in #23. I'll reopen if I manage to reproduce it consistently again.

MoffKalast commented 1 month ago

Hey, sorry to reopen this, but while I was testing the #37 fix on Humble (which works perfectly btw, thanks for that) I ran into another crash. I set up GDB this time, and I think I've also pinpointed what reproduces it reliably: it happens when you either subscribe to the same topic several times in quick succession or change a socket throttle parameter. Take the laserscan widget in Vizanti, which has a throttle param: flipping through its values quickly causes a resubscribe every time (maybe I need to put a delay on that lol), and that seems to trigger a segfault very rapidly.
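
The "delay" mentioned above is commonly implemented as a debounce: only the last value requested within a quiet period is acted on. The sketch below shows the idea with an injected clock so it can be checked deterministically. The Vizanti widget itself is browser-side code, so this is an illustration in the language of the rws codebase rather than a patch, and `Debouncer`, `request` and `poll` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <utility>

// Hypothetical trailing-edge debouncer: callers pass the current time in,
// so no threads or timers are needed for the illustration. Requires C++17.
template <typename T>
class Debouncer {
 public:
  using Clock = std::chrono::steady_clock;

  explicit Debouncer(std::chrono::milliseconds quiet_period)
      : quiet_period_(quiet_period) {
    assert(quiet_period.count() >= 0);
  }

  // Remember the newest requested value; an older pending value is dropped.
  void request(T value, Clock::time_point now) {
    pending_ = std::move(value);
    last_request_ = now;
  }

  // Hand out the pending value once no request arrived for quiet_period_.
  std::optional<T> poll(Clock::time_point now) {
    if (!pending_ || now - last_request_ < quiet_period_) return std::nullopt;
    std::optional<T> ready = std::move(pending_);
    pending_.reset();
    return ready;
  }

 private:
  std::chrono::milliseconds quiet_period_;
  std::optional<T> pending_;
  Clock::time_point last_request_{};
};
```

A caller would feed every throttle change into `request` and resubscribe only when `poll` returns a value, so flipping through several values quickly results in a single resubscribe.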

Here's what I ran to get the trace:

ros2 run --prefix 'gdb --args' rws rws_server --ros-args -p rosbridge_compatible:=True -p port:=5001 -p watchdog:=True

First run:

GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/ubuntu/colcon_ws/install/rws/lib/rws/rws_server...
(gdb) run
Starting program: /home/ubuntu/colcon_ws/install/rws/lib/rws/rws_server --ros-args -p rosbridge_compatible:=True -p port:=5001 -p watchdog:=True
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff65ff640 (LWP 27897)]
[New Thread 0x7ffff5dfe640 (LWP 27898)]
[New Thread 0x7ffff55fd640 (LWP 27899)]
[New Thread 0x7ffff4dfc640 (LWP 27900)]
[New Thread 0x7ffff45fb640 (LWP 27901)]
[New Thread 0x7ffff3dfa640 (LWP 27902)]
[New Thread 0x7ffff35f9640 (LWP 27903)]
[New Thread 0x7ffff2df8640 (LWP 27904)]
2024-10-11 13:53:41.935 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port29161: open_and_lock_file failed -> Function open_port_internal
[New Thread 0x7ffff25f7640 (LWP 27905)]
[New Thread 0x7ffff1df6640 (LWP 27906)]
[New Thread 0x7ffff15f5640 (LWP 27907)]
[INFO] [1728647621.964115310] [rws_server]: RWS start listening on port 5001
[New Thread 0x7fffdbfff640 (LWP 27908)]
[New Thread 0x7fffdb7fe640 (LWP 27909)]
[New Thread 0x7fffdaffd640 (LWP 27910)]
[New Thread 0x7fffda7fc640 (LWP 27911)]
[New Thread 0x7fffd9ffb640 (LWP 27912)]
[New Thread 0x7fffd97fa640 (LWP 27913)]
[New Thread 0x7fffd8ff9640 (LWP 27914)]
[New Thread 0x7fffcffff640 (LWP 27915)]
[New Thread 0x7fffcf7fe640 (LWP 27916)]
[New Thread 0x7fffceffd640 (LWP 27917)]
[New Thread 0x7fffce7fc640 (LWP 27918)]
[New Thread 0x7fffcdffb640 (LWP 27919)]
[New Thread 0x7fffcd7fa640 (LWP 27920)]
[New Thread 0x7fffccff9640 (LWP 27921)]
[New Thread 0x7fffcc7f8640 (LWP 27922)]
[New Thread 0x7fffcbff7640 (LWP 27923)]
[New Thread 0x7fffcb7f6640 (LWP 27924)]
[New Thread 0x7fffcaff5640 (LWP 27925)]
[New Thread 0x7fffca7f4640 (LWP 27926)]
[New Thread 0x7fffc9ff3640 (LWP 27927)]
[New Thread 0x7fffc97f2640 (LWP 27928)]
[2024-10-11 13:53:43] [connect] WebSocket Connection [::1]:35570 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647623.168362652] [client_handler_0]: Constructing client 0(245633643150608920)
[2024-10-11 13:53:43] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:53:43] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647623.196325422] [rws_server]: Closing connection with client_id 0
[INFO] [1728647623.196442046] [client_handler_0]: Destroying client 0(245633643150608920)
[2024-10-11 13:53:46] [connect] WebSocket Connection [::1]:35586 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647626.580244186] [client_handler_1]: Constructing client 1(245633643150608920)
[ERROR] [1728647626.701746974] [client_handler_1]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:53:51] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:53:51] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647631.680322384] [rws_server]: Closing connection with client_id 1
[INFO] [1728647631.680436889] [client_handler_1]: Destroying client 1(245633643150608920)
[2024-10-11 13:53:52] [connect] WebSocket Connection [::1]:38198 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647632.890633996] [client_handler_2]: Constructing client 2(245633643150608920)
[ERROR] [1728647632.998889531] [client_handler_2]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:47] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:47] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647747.039830123] [rws_server]: Closing connection with client_id 2
[INFO] [1728647747.039927726] [client_handler_2]: Destroying client 2(245633643150608920)
[2024-10-11 13:55:47] [connect] WebSocket Connection [::1]:43862 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647747.110029832] [client_handler_3]: Constructing client 3(245633643150608920)
[ERROR] [1728647747.288974883] [client_handler_3]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:47] [error] handle_read_frame error: asio.system:104 (Connection reset by peer)
[2024-10-11 13:55:47] [disconnect] Disconnect close local:[1006,Connection reset by peer] remote:[1001]
[INFO] [1728647747.606127391] [rws_server]: Closing connection with client_id 3
[2024-10-11 13:55:47] [info] asio async_write error: asio.system:32 (Broken pipe)
[2024-10-11 13:55:47] [fatal] handle_write_frame error: websocketpp.transport:2 (Underlying Transport Error)
[INFO] [1728647747.606178227] [client_handler_3]: Destroying client 3(245633643150608920)
[2024-10-11 13:55:47] [connect] WebSocket Connection [::1]:43864 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647747.642712027] [client_handler_4]: Constructing client 4(245633643150608920)
[ERROR] [1728647747.816324882] [client_handler_4]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:47] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:47] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647747.950737680] [rws_server]: Closing connection with client_id 4
[INFO] [1728647747.950784993] [client_handler_4]: Destroying client 4(245633643150608920)
[2024-10-11 13:55:47] [connect] WebSocket Connection [::1]:43872 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647747.990810779] [client_handler_5]: Constructing client 5(245633643150608920)
[ERROR] [1728647748.153726127] [client_handler_5]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:48] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:48] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647748.206277431] [rws_server]: Closing connection with client_id 5
[INFO] [1728647748.206339663] [client_handler_5]: Destroying client 5(245633643150608920)
[2024-10-11 13:55:48] [connect] WebSocket Connection [::1]:43880 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647748.240234638] [client_handler_6]: Constructing client 6(245633643150608920)
[ERROR] [1728647748.383527416] [client_handler_6]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:48] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:48] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647748.481852910] [rws_server]: Closing connection with client_id 6
[INFO] [1728647748.481901440] [client_handler_6]: Destroying client 6(245633643150608920)
[2024-10-11 13:55:48] [connect] WebSocket Connection [::1]:43886 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647748.513813299] [client_handler_7]: Constructing client 7(245633643150608920)
[ERROR] [1728647748.656640536] [client_handler_7]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:48] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:48] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647748.740957664] [rws_server]: Closing connection with client_id 7
[INFO] [1728647748.741018340] [client_handler_7]: Destroying client 7(245633643150608920)
[2024-10-11 13:55:48] [connect] WebSocket Connection [::1]:43896 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647748.782356824] [client_handler_8]: Constructing client 8(245633643150608920)
[ERROR] [1728647748.933465674] [client_handler_8]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:48] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:48] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647748.991231741] [rws_server]: Closing connection with client_id 8
[INFO] [1728647748.991273144] [client_handler_8]: Destroying client 8(245633643150608920)
[2024-10-11 13:55:49] [connect] WebSocket Connection [::1]:43900 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647749.023820055] [client_handler_9]: Constructing client 9(245633643150608920)
[ERROR] [1728647749.190003945] [client_handler_9]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:49] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:49] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647749.268889313] [rws_server]: Closing connection with client_id 9
[INFO] [1728647749.268942619] [client_handler_9]: Destroying client 9(245633643150608920)
[2024-10-11 13:55:49] [connect] WebSocket Connection [::1]:39998 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647749.300612733] [client_handler_10]: Constructing client 10(245633643150608920)
[ERROR] [1728647749.435155143] [client_handler_10]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:49] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:49] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647749.569988239] [rws_server]: Closing connection with client_id 10
[INFO] [1728647749.570038931] [client_handler_10]: Destroying client 10(245633643150608920)
[2024-10-11 13:55:49] [connect] WebSocket Connection [::1]:40008 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647749.613171258] [client_handler_11]: Constructing client 11(245633643150608920)
[ERROR] [1728647749.755249645] [client_handler_11]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[2024-10-11 13:55:49] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 13:55:49] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728647749.874339592] [rws_server]: Closing connection with client_id 11
[INFO] [1728647749.874386578] [client_handler_11]: Destroying client 11(245633643150608920)
[2024-10-11 13:55:49] [connect] WebSocket Connection [::1]:40022 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728647749.913093698] [client_handler_12]: Constructing client 12(245633643150608920)
[ERROR] [1728647750.051906327] [client_handler_12]: Failed to subscribe to topic: "Topic /amcl_pose not found"
[ERROR] [1728647757.595207939] [client_handler_12]: Failed to subscribe to topic: "Topic /amcl_pose not found"

Thread 1 "rws_server" received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:416
416     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:416
#1  0x00005555555fbe67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (this=0x5555559d0550, 
    __beg=0xd3a1aeb1285636af <error: Cannot access memory at address 0xd3a1aeb1285636af>, 
    __end=0xd3a1aeb1285636bd <error: Cannot access memory at address 0xd3a1aeb1285636bd>) at /usr/include/c++/11/bits/basic_string.tcc:225
#2  0x00005555555a3157 in __gnu_cxx::new_allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (
    this=0x7fffffff9527, __p=0x5555559d0550) at /usr/include/c++/11/ext/new_allocator.h:162
#3  0x0000555555598ec9 in std::allocator_traits<std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (__a=..., __p=0x5555559d0550) at /usr/include/c++/11/bits/alloc_traits.h:516
#4  0x000055555559018b in nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > >::create<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> () at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/json.hpp:388
#5  0x0000555555587834 in nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > >::json_value::json_value (this=0x7fffffff95a0, value="") at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/json.hpp:525
#6  0x00005555555a81a1 in nlohmann::json_abi_v3_11_2::detail::external_constructor<(nlohmann::json_abi_v3_11_2::detail::value_t)3>::construct<nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > > > (j=..., s="")
    at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/detail/conversions/to_json.hpp:65
#7  0x00005555555a069f in nlohmann::json_abi_v3_11_2::detail::to_json<nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, 0> (j=..., s="")
    at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/detail/conversions/to_json.hpp:287
#8  0x0000555555597484 in nlohmann::json_abi_v3_11_2::detail::to_json_fn::operator()<nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (
    this=0x5555556b2199 <nlohmann::json_abi_v3_11_2::detail::static_const<nlohmann::json_abi_v3_11_2::detail::to_json_fn>::value>, j=..., val="")
    at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/detail/conversions/to_json.hpp:428
#9  0x0000555555589ef2 in nlohmann::json_abi_v3_11_2::adl_serializer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void>::to_json<nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (j=..., val="")
    at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/adl_serializer.hpp:51
#10 0x0000555555582541 in nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > >::basic_json<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, 0> (this=0x7fffffff97e8, val="") at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/json.hpp:829
#11 0x0000555555580430 in nlohmann::json_abi_v3_11_2::detail::json_ref<nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > > >::json_ref<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, 0> (
    this=0x7fffffff97e8) at /home/ubuntu/colcon_ws/build/rws/_deps/json-src/include/nlohmann/detail/json_ref.hpp:43
#12 0x0000555555570797 in operator() (__closure=0x7fffb41b62f0, message=std::shared_ptr<const rclcpp::SerializedMessage> (use count 3, weak count 0) = {...})
    at /home/ubuntu/colcon_ws/src/rws/src/client_handler.cpp:166
#13 0x00005555555782da in std::__invoke_impl<void, rws::ClientHandler::subscribe_to_topic(const json&, rws::json&)::<lambda(std::shared_ptr<const rclcpp::SerializedMessage>)>&, std::shared_ptr<const rclcpp::SerializedMessage> >(std::__invoke_other, struct {...} &) (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#14 0x0000555555577f4d in std::__invoke_r<void, rws::ClientHandler::subscribe_to_topic(const json&, rws::json&)::<lambda(std::shared_ptr<const rclcpp::SerializedMessage>)>&, std::shared_ptr<const rclcpp::SerializedMessage> >(struct {...} &) (__fn=...) at /usr/include/c++/11/bits/invoke.h:111
#15 0x0000555555577a8c in std::_Function_handler<void(std::shared_ptr<const rclcpp::SerializedMessage>), rws::ClientHandler::subscribe_to_topic(const json&, rws::json&)::<lambda(std::shared_ptr<const rclcpp::SerializedMessage>)> >::_M_invoke(const std::_Any_data &, std::shared_ptr<rclcpp::SerializedMessage const> &&) (
    __functor=..., __args#0=...) at /usr/include/c++/11/bits/std_function.h:290
#16 0x000055555558230b in std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>::operator()(std::shared_ptr<rclcpp::SerializedMessage const>) const (this=0x7fffb4075150, __args#0=std::shared_ptr<const rclcpp::SerializedMessage> (empty) = {...}) at /usr/include/c++/11/bits/std_function.h:590
#17 0x0000555555580c8f in rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}::operator()(std::shared_ptr<rclcpp::SerializedMessage const>) const (__closure=0x7fffb418cd60, message=std::shared_ptr<const rclcpp::SerializedMessage> (use count 3, weak count 0) = {...})
    at /home/ubuntu/colcon_ws/src/rws/include/rws/connector.hpp:69
#18 0x00005555555a7d84 in std::__invoke_impl<void, rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const> >(std::__invoke_other, rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const>&&) (
    __f=...) at /usr/include/c++/11/bits/invoke.h:61
#19 0x000055555559fe91 in std::__invoke_r<void, rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const> >(rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const>&&) (__fn=...)
    at /usr/include/c++/11/bits/invoke.h:111
#20 0x0000555555594c27 in std::_Function_handler<void (std::shared_ptr<rclcpp::SerializedMessage const>), rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<rclcpp::SerializedMessage const>&&) (__functor=..., __args#0=...)
    at /usr/include/c++/11/bits/std_function.h:290
#21 0x000055555558230b in std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>::operator()(std::shared_ptr<rclcpp::SerializedMessage const>) const (this=0x7fffb41a02a0, __args#0=std::shared_ptr<const rclcpp::SerializedMessage> (empty) = {...}) at /usr/include/c++/11/bits/std_function.h:590
#22 0x0000555555601ce0 in std::__invoke_impl<void, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage> >(std::__invoke_other, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage>&&) (__f=...)
    at /usr/include/c++/11/bits/invoke.h:61
#23 0x00005555555f3b3c in std::__invoke_r<void, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage> >(std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage>&&) (__fn=...)
    at /usr/include/c++/11/bits/invoke.h:111
#24 0x00005555555eb1a6 in std::_Function_handler<void (std::shared_ptr<rclcpp::SerializedMessage>), std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)> >::_M_invoke(std::_Any_data const&, std::shared_ptr<rclcpp::SerializedMessage>&&) (__functor=..., __args#0=...)
    at /usr/include/c++/11/bits/std_function.h:290
#25 0x00007ffff7eb5069 in rclcpp::GenericSubscription::handle_serialized_message(std::shared_ptr<rclcpp::SerializedMessage> const&, rclcpp::MessageInfo const&)
    () from /opt/ros/humble/lib/librclcpp.so
#26 0x00007ffff7ea96ea in rclcpp::Executor::execute_subscription(std::shared_ptr<rclcpp::SubscriptionBase>) () from /opt/ros/humble/lib/librclcpp.so
#27 0x00007ffff7eaa08f in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) () from /opt/ros/humble/lib/librclcpp.so
#28 0x00007ffff7eb134a in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) () from /opt/ros/humble/lib/librclcpp.so
#29 0x00007ffff7eb1785 in rclcpp::executors::MultiThreadedExecutor::spin() () from /opt/ros/humble/lib/librclcpp.so
#30 0x00005555555c039c in main (argc=8, argv=0x7fffffffaab8) at /home/ubuntu/colcon_ws/src/rws/src/server_node.cpp:364
(gdb) 
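
In the trace above, frame #1 copies a std::string from an unreadable address, and frame #12 places that copy inside the subscription lambda at client_handler.cpp:166. One common cause of such a trace is a callback reading data whose owner was destroyed on another thread, which would fit the resubscribe trigger under MultiThreadedExecutor. That is an assumption, not something the trace confirms. Under that assumption, a defensive pattern is to let the callback share ownership of the state it reads. `SubscriptionState`, `Callback` and `make_callback` below are hypothetical names, not rws types.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Hypothetical per-subscription state that a message callback needs to read.
struct SubscriptionState {
  std::string topic;
  std::string type;
};

using Callback = std::function<std::string()>;

// The lambda stores its own copy of the shared_ptr, so the strings stay
// alive for as long as any copy of the callback exists, even if the
// original owner drops its reference while resubscribing.
Callback make_callback(std::shared_ptr<const SubscriptionState> state) {
  assert(state != nullptr);
  return [state]() { return state->topic + " [" + state->type + "]"; };
}
```

The trade-off is that the old state lives slightly longer than the subscription that created it. In exchange, an in-flight callback never reads freed memory.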

Second run:

GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Reading symbols from /home/ubuntu/colcon_ws/install/rws/lib/rws/rws_server...
(gdb) run
Starting program: /home/ubuntu/colcon_ws/install/rws/lib/rws/rws_server --ros-args -p rosbridge_compatible:=True -p port:=5001 -p watchdog:=True
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff65ff640 (LWP 29336)]
[New Thread 0x7ffff5dfe640 (LWP 29337)]
[New Thread 0x7ffff55fd640 (LWP 29338)]
[New Thread 0x7ffff4dfc640 (LWP 29339)]
[New Thread 0x7ffff45fb640 (LWP 29340)]
[New Thread 0x7ffff3dfa640 (LWP 29341)]
[New Thread 0x7ffff35f9640 (LWP 29342)]
[New Thread 0x7ffff2df8640 (LWP 29343)]
2024-10-11 14:08:49.242 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port29161: open_and_lock_file failed -> Function open_port_internal
[New Thread 0x7ffff25f7640 (LWP 29344)]
[New Thread 0x7ffff1df6640 (LWP 29345)]
[New Thread 0x7ffff15f5640 (LWP 29346)]
[INFO] [1728648529.284943184] [rws_server]: RWS start listening on port 5001
[New Thread 0x7ffff0c5f640 (LWP 29347)]
[New Thread 0x7fffd3fff640 (LWP 29348)]
[New Thread 0x7fffd37fe640 (LWP 29349)]
[New Thread 0x7fffd2ffd640 (LWP 29350)]
[New Thread 0x7fffd27fc640 (LWP 29351)]
[New Thread 0x7fffd1ffb640 (LWP 29352)]
[New Thread 0x7fffd17fa640 (LWP 29353)]
[New Thread 0x7fffd0ff9640 (LWP 29354)]
[New Thread 0x7fffd07f8640 (LWP 29355)]
[New Thread 0x7fffcfff7640 (LWP 29356)]
[New Thread 0x7fffcf7f6640 (LWP 29357)]
[New Thread 0x7fffceff5640 (LWP 29358)]
[New Thread 0x7fffce7f4640 (LWP 29359)]
[New Thread 0x7fffcdff3640 (LWP 29360)]
[New Thread 0x7fffcd7f2640 (LWP 29361)]
[New Thread 0x7fffccff1640 (LWP 29362)]
[New Thread 0x7fffcc7f0640 (LWP 29363)]
[New Thread 0x7fffcbfef640 (LWP 29364)]
[New Thread 0x7fffcb7ee640 (LWP 29365)]
[New Thread 0x7fffcafed640 (LWP 29366)]
[New Thread 0x7fffca7ec640 (LWP 29367)]
[2024-10-11 14:08:49] [connect] WebSocket Connection [::1]:35566 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728648529.949089360] [client_handler_0]: Constructing client 0(8258181446547256802)
[2024-10-11 14:08:49] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2024-10-11 14:08:49] [disconnect] Disconnect close local:[1006,End of File] remote:[1001]
[INFO] [1728648529.979025622] [rws_server]: Closing connection with client_id 0
[INFO] [1728648529.979121822] [client_handler_0]: Destroying client 0(8258181446547256802)
[2024-10-11 14:08:50] [connect] WebSocket Connection [::1]:35568 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728648530.041019672] [client_handler_1]: Constructing client 1(8258181446547256802)
[ERROR] [1728648530.050238218] [client_handler_1]: Failed to subscribe to topic: "Topic /vizanti/tf_consolidated not found"
[ERROR] [1728648530.193896079] [client_handler_1]: Failed to subscribe to topic: "Topic /received_global_plan not found"
[ERROR] [1728648530.224401862] [client_handler_1]: Failed to subscribe to topic: "Topic /goal_pose not found"
[2024-10-11 14:08:52] [error] handle_read_frame error: asio.system:104 (Connection reset by peer)
[2024-10-11 14:08:52] [disconnect] Disconnect close local:[1006,Connection reset by peer] remote:[1001]
[2024-10-11 14:08:52] [info] asio async_write error: asio.system:32 (Broken pipe)
[INFO] [1728648532.968274037] [rws_server]: Closing connection with client_id 1
[2024-10-11 14:08:52] [fatal] handle_write_frame error: websocketpp.transport:2 (Underlying Transport Error)
[INFO] [1728648532.968341082] [client_handler_1]: Destroying client 1(8258181446547256802)
[2024-10-11 14:08:53] [connect] WebSocket Connection [::1]:35572 v13 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" / 101
[INFO] [1728648533.011597620] [client_handler_2]: Constructing client 2(8258181446547256802)

Thread 29 "rws_server" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffcc7f0640 (LWP 29363)]
0x0000555555578830 in std::_Function_base::_M_empty (this=0x7fffbc131) at /usr/include/c++/11/bits/std_function.h:247
247         bool _M_empty() const { return !_M_manager; }
(gdb) bt
#0  0x0000555555578830 in std::_Function_base::_M_empty (this=0x7fffbc131) at /usr/include/c++/11/bits/std_function.h:247
#1  0x000055555557f7ba in std::function<void (std::vector<unsigned char, std::allocator<unsigned char> >&)>::operator bool() const (this=0x7fffbc131) at /usr/include/c++/11/bits/std_function.h:573
#2  0x00005555555705ea in rws::ClientHandler::send_message (this=0x7fffbc0e1, msg=std::vector of length 5966, capacity 9600 = {...}) at /home/ubuntu/colcon_ws/src/rws/src/client_handler.cpp:120
#3  0x0000555555570d05 in operator() (__closure=0x7fffbc0e18b0, message=std::shared_ptr<const rclcpp::SerializedMessage> (empty) = {...}) at /home/ubuntu/colcon_ws/src/rws/src/client_handler.cpp:179
#4  0x00005555555782da in std::__invoke_impl<void, rws::ClientHandler::subscribe_to_topic(const json&, rws::json&)::<lambda(std::shared_ptr<const rclcpp::SerializedMessage>)>&, std::shared_ptr<const rclcpp::SerializedMessage> >(std::__invoke_other, struct {...} &) (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#5  0x0000555555577f4d in std::__invoke_r<void, rws::ClientHandler::subscribe_to_topic(const json&, rws::json&)::<lambda(std::shared_ptr<const rclcpp::SerializedMessage>)>&, std::shared_ptr<const rclcpp::SerializedMessage> >(struct {...} &) (__fn=...) at /usr/include/c++/11/bits/invoke.h:111
#6  0x0000555555577a8c in std::_Function_handler<void(std::shared_ptr<const rclcpp::SerializedMessage>), rws::ClientHandler::subscribe_to_topic(const json&, rws::json&)::<lambda(std::shared_ptr<const rclcpp::SerializedMessage>)> >::_M_invoke(const std::_Any_data &, std::shared_ptr<rclcpp::SerializedMessage const> &&) (__functor=..., __args#0=...) at /usr/include/c++/11/bits/std_function.h:290
#7  0x000055555558230b in std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>::operator()(std::shared_ptr<rclcpp::SerializedMessage const>) const (this=0x7fffbc097b70, 
    __args#0=std::shared_ptr<const rclcpp::SerializedMessage> (empty) = {...}) at /usr/include/c++/11/bits/std_function.h:590
#8  0x0000555555580c8f in rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}::operator()(std::shared_ptr<rclcpp::SerializedMessage const>) const (__closure=0x7fffbc044990, 
    message=std::shared_ptr<const rclcpp::SerializedMessage> (use count 2, weak count 0) = {...}) at /home/ubuntu/colcon_ws/src/rws/include/rws/connector.hpp:69
#9  0x00005555555a7d84 in std::__invoke_impl<void, rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const> >(std::__invoke_other, rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const>&&) (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#10 0x000055555559fe91 in std::__invoke_r<void, rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const> >(rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}&, std::shared_ptr<rclcpp::SerializedMessage const>&&) (
    __fn=...) at /usr/include/c++/11/bits/invoke.h:111
#11 0x0000555555594c27 in std::_Function_handler<void (std::shared_ptr<rclcpp::SerializedMessage const>), rws::Connector<rclcpp::GenericPublisher>::subscribe_to_topic(unsigned short, rws::topic_params&, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>)::{lambda(std::shared_ptr<rclcpp::SerializedMessage const>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<rclcpp::SerializedMessage const>&&) (__functor=..., __args#0=...) at /usr/include/c++/11/bits/std_function.h:290
#12 0x000055555558230b in std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>::operator()(std::shared_ptr<rclcpp::SerializedMessage const>) const (this=0x7fffbc0f64f0, 
    __args#0=std::shared_ptr<const rclcpp::SerializedMessage> (empty) = {...}) at /usr/include/c++/11/bits/std_function.h:590
#13 0x0000555555601ce0 in std::__invoke_impl<void, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage> >(std::__invoke_other, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage>&&) (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#14 0x00005555555f3b3c in std::__invoke_r<void, std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage> >(std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)>&, std::shared_ptr<rclcpp::SerializedMessage>&&) (__fn=...) at /usr/include/c++/11/bits/invoke.h:111
#15 0x00005555555eb1a6 in std::_Function_handler<void (std::shared_ptr<rclcpp::SerializedMessage>), std::function<void (std::shared_ptr<rclcpp::SerializedMessage const>)> >::_M_invoke(std::_Any_data const&, std::shared_ptr<rclcpp::SerializedMessage>&&) (__functor=..., __args#0=...) at /usr/include/c++/11/bits/std_function.h:290
#16 0x00007ffff7eb5069 in rclcpp::GenericSubscription::handle_serialized_message(std::shared_ptr<rclcpp::SerializedMessage> const&, rclcpp::MessageInfo const&) () from /opt/ros/humble/lib/librclcpp.so
#17 0x00007ffff7ea96ea in rclcpp::Executor::execute_subscription(std::shared_ptr<rclcpp::SubscriptionBase>) () from /opt/ros/humble/lib/librclcpp.so
#18 0x00007ffff7eaa08f in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) () from /opt/ros/humble/lib/librclcpp.so
#19 0x00007ffff7eb134a in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) () from /opt/ros/humble/lib/librclcpp.so
#20 0x00007ffff7adc253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#21 0x00007ffff7694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#22 0x00007ffff7726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) 

I'm pretty new to using GDB, so let me know if there's anything specific I ought to run to get the most useful data. I've also tried running with watchdog set to False, but the result seems to be the same, so it's probably not that.

v-kiniv commented 1 month ago

Thanks for finding a way to reliably reproduce the issue. It's a very weird one, and it was tricky to work out exactly what was going wrong. Going up the call stack, the segmentation fault is caused by ClientHandler::send_message trying to read the object's memory while its this pointer points to 0x0 or some random address, even though the ClientHandler object is still alive and not destructed, as expected, because the websocket client is still connected. What I found is that the lambda function inside ClientHandler::subscribe_to_topic corrupts the stack after a certain number of calls, and the variables captured when the lambda was created (among them this for ClientHandler) are lost. I'm not sure why this is happening, but I replaced the lambda with a regular function + std::bind and it doesn't crash; at least I can't reproduce it anymore. By the way, the crash also happens on Jazzy, so I created a PR https://github.com/v-kiniv/rws/pull/40 for the main branch, and will backport it to the other versions after it is merged.

MoffKalast commented 1 month ago

Ah wow, thanks for looking into this so quickly. I was able to test it today and can confirm it works: after cherry-picking the commit onto the humble branch I can no longer crash it.

I'm not sure why but I can't seem to reproduce the original issue on jazzy at all. Might be something to do with non-local latency since I'm testing on separate machines there.

But overall seems good, I think we can merge :+1:

> I replaced lambda function with a regular function + std::bind and it doesn't crash

They say the devil is in the details, and lambdas hide the details, ergo... :stuck_out_tongue:

v-kiniv commented 1 month ago

> They say the devil is in the details, and lambdas hide the details, ergo... 😛

Sounds reasonable)