gazebosim / gz-launch

Run and manage programs and plugins.
https://gazebosim.org
Apache License 2.0

Websocket segfault when quickly opening and closing connections #60

Open AlejoAsd opened 4 years ago

AlejoAsd commented 4 years ago

The Websocket server segfaults if a websocket connection is opened and closed in quick succession.

Steps to reproduce

  1. Start a simulation with the websocket server enabled. (My specific tests were performed using Cloudsim.)
  2. Quickly connect and disconnect from the simulation.
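
The reproduction steps above could be scripted roughly as follows. This is only a minimal sketch, not the reporter's actual client: it assumes the server listens on localhost port 9002 (the port shown in a later log in this thread) and that closing the socket right after sending the WebSocket upgrade request is enough to trigger the quick open/close pattern. The function names `build_handshake` and `cycle_connections` are hypothetical helpers introduced for illustration.

```python
import base64
import os
import socket


def build_handshake(host, port, path="/"):
    """Build a minimal RFC 6455 client upgrade request as bytes."""
    # Sec-WebSocket-Key is 16 random bytes, base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def cycle_connections(host="localhost", port=9002, count=100):
    """Open a connection, send the upgrade request, and close at once.

    Host, port, and count are assumptions; adjust them to your setup.
    Returns the number of connections that were opened and closed.
    """
    completed = 0
    for _ in range(count):
        # The `with` block closes the socket without reading the reply,
        # which is the "quick disconnect" being exercised here.
        with socket.create_connection((host, port), timeout=2.0) as sock:
            sock.sendall(build_handshake(host, port))
        completed += 1
    return completed
```

With the websocket server running, calling `cycle_connections()` from a Python shell attempts 100 rapid open/close cycles. Whether this hits the same code path as a browser client that disconnects quickly is not confirmed.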

Stack trace

Stack trace (most recent call last) in thread 65:
#13   Object "", at 0xffffffffffffffff, in 
#12   Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f326a152a3e, in clone
#11   Object "/lib/x86_64-linux-gnu/libpthread.so.0", at 0x7f3269e196da, in start_thread
#10   Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f32670b96de, in std::error_code::default_error_condition() const
#9    Object "/usr/lib/x86_64-linux-gnu/ign-launch-1/plugins/libignition-launch-websocket-server.so", at 0x7f3265faac87, in ignition::launch::WebsocketServer::Run()
#8    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d8b55a, in lws_SHA1
#7    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d80015, in lws_service_fd_tsi
#6    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d7cd41, in lws_read
#5    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d8ea29, in lws_serve_http_file
#4    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d83d9e, in lws_frame_is_binary
#3    Object "/usr/lib/x86_64-linux-gnu/libwebsockets.so.8", at 0x7f3265d7e7e6, in lws_close_reason
#2    Object "/usr/lib/x86_64-linux-gnu/ign-launch-1/plugins/libignition-launch-websocket-server.so", at 0x7f3265faefcc, in rootCallback(lws*, lws_callback_reasons, void*, void*, unsigned long)
#1    Object "/usr/lib/x86_64-linux-gnu/ign-launch-1/plugins/libignition-launch-websocket-server.so", at 0x7f3265fae6e1, in ignition::launch::WebsocketServer::OnMessage(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#0    Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f326a1bbfde, in __nss_passwd_lookup
Segmentation fault (Address not mapped to object [0xa6d3000])
./run_sim.bash: line 16:    54 Segmentation fault      (core dumped) ign launch -v 4 $@

ruffsl commented 1 year ago

I'm also seeing a similar segfault when attempting to open a connection at all, without even quickly cycling the connection:

$ gz launch --versions
6.0.0

/usr/share/gz/gz-launch6/configs$ gz launch websocket.gzlaunch -v 4
[Dbg] [Manager.cc:1164] Loading plugin. Name[gz::launch::WebsocketServer] File[gz-launch-websocket-server]
[Dbg] [WebsocketServer.cc:414] Using port[9002]
[Dbg] [WebsocketServer.cc:429] Using maximum connection count of -1
[Wrn] [WebsocketServer.cc:559] Partial SSL configuration specified. Please specify:     <ssl>
      <cert_file>PATH_TO_CERT_FILE</cert_file>
      <private_key_file>PATH_TO_KEY_FILE</private_key_file>
    </ssl>.
Continuing without SSL.
[Dbg] [WebsocketServer.cc:246] LWS_CALLBACK_ESTABLISHED
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
[Dbg] [WebsocketServer.cc:729] Protos request received
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
[Dbg] [WebsocketServer.cc:784] Topic and message type list request received
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
[Dbg] [WebsocketServer.cc:814] World info request received
[Dbg] [WebsocketServer.cc:301] LWS_CALLBACK_RECEIVE
Stack trace (most recent call last) in thread 50526:
#16   Object "", at 0xffffffffffffffff, in 
#15   Source "./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S", line 81, in __clone3 [0x7fc91c3269ff]
#14   Source "./nptl/pthread_create.c", line 442, in start_thread [0x7fc91c294b42]
#13   Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7fc91c6dc2b2, in std::error_code::default_error_condition() const
#12   Object "/usr/lib/x86_64-linux-gnu/gz-launch-6/plugins/libgz-launch-websocket-server.so", at 0x7fc91c9def4c, in gz::launch::WebsocketServer::Run()
#11   Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8cbfb6, in lws_service
#10   Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8ec979, in _lws_plat_file_open
#9    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8ec6ea, in _lws_plat_file_open
#8    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8c9808, in lws_service_fd_tsi
#7    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8d66bd, in lws_hdr_custom_copy
#6    Object "/lib/x86_64-linux-gnu/libwebsockets.so.16", at 0x7fc91b8d5b3c, in lws_hdr_custom_copy
#5    Object "/usr/lib/x86_64-linux-gnu/gz-launch-6/plugins/libgz-launch-websocket-server.so", at 0x7fc91c9ea44a, in rootCallback(lws*, lws_callback_reasons, void*, void*, unsigned long)
#4    Object "/usr/lib/x86_64-linux-gnu/gz-launch-6/plugins/libgz-launch-websocket-server.so", at 0x7fc91c9e763d, in gz::launch::WebsocketServer::OnMessage(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
#3    Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7fc91c73cb34, in std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
#2    Object "/lib/x86_64-linux-gnu/libgz-common5.so.5", at 0x7fc91c970955, in gz::common::Logger::Buffer::xsputn(char const*, long)
#1    Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7fc91c74a72d, in std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long)
#0    Source "./string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S", line 317, in __memcpy_avx_unaligned_erms [0x7fc91c3a094d]
Segmentation fault (Address not mapped to object [(nil)])
Segmentation fault (core dumped)

This happens simply when using the visualization app hosted on the gazebosim site.

Core dump with crash file:

_usr_lib_x86_64-linux-gnu_gz_launch6_gz-launch.1000.zip

System info:

> Ubuntu 22.04

```
$ apt info libgz-launch6-dev
Package: libgz-launch6-dev
Version: 6.0.0-1~jammy
Priority: optional
Section: libdevel
Source: gz-launch6
Maintainer: Jose Luis Rivero
Installed-Size: 97.3 kB
Depends: libgz-cmake3-dev, libgz-common5-dev, libgz-sim7-dev, libgz-gui7-dev, libgz-msgs9-dev, libgz-plugin2-dev, libgz-tools2-dev, libgz-transport12-dev, libsdformat13-dev, libtinyxml2-dev, libwebsockets-dev, qtquickcontrols2-5-dev, libqt5core5a, libgz-launch6 (= 6.0.0-1~jammy)
Breaks: libignition-launch6-dev (<< 5.999.999+nightly+git20220630+2rcec9c00a42bbd412815a3c9d64a3ce9b7dfd186d-2)
Replaces: libignition-launch6-dev (<< 5.999.999+nightly+git20220630+2rcec9c00a42bbd412815a3c9d64a3ce9b7dfd186d-2)
Homepage: https://github.com/gazebosim/gz-launch
Download-Size: 16.2 kB
APT-Manual-Installed: no
APT-Sources: http://packages.osrfoundation.org/gazebo/ubuntu-stable jammy/main amd64 Packages
Description: Gazebo Launch Library - Development files
 Gazebo Launch, a component of Gazebo, provides a command line interface to run and manager application and plugins.
 .
 Package contains the Gazebo launch development files and cli client
```

usedhondacivic commented 6 months ago

Found something interesting regarding this bug. I am able to reproduce it using docker with this Dockerfile:

ARG ROS_VERSION=humble

FROM ros:$ROS_VERSION

RUN apt-get update && apt-get install -y --no-install-recommends wget curl

ARG GAZEBO_VERSION=garden

RUN wget https://packages.osrfoundation.org/gazebo.gpg -O /usr/share/keyrings/pkgs-osrf-archive-keyring.gpg && \
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/pkgs-osrf-archive-keyring.gpg] http://packages.osrfoundation.org/gazebo/ubuntu-stable $(lsb_release -cs) main" | tee /etc/apt/sources.list.d/gazebo-stable.list > /dev/null && \
apt-get update && \
apt-get install -y --no-install-recommends gz-$GAZEBO_VERSION ros-$ROS_DISTRO-ros-gz$GAZEBO_VERSION

RUN curl -O https://raw.githubusercontent.com/gazebosim/gz-launch/main/examples/websocket.gzlaunch

CMD bash -c "gz sim -s -v 4 shapes.sdf & gz launch -v 4 websocket.gzlaunch"

And running this command: docker build -t gz_launch_bug . && docker run -it --network host gz_launch_bug

However, if I run with docker build -t gz_launch_bug . && docker run -it -p9002:9002 gz_launch_bug (i.e., I publish port 9002 instead of using --network host), I can connect just fine from the Gazebo website visualizer.

I'm curious if @ruffsl was also using Docker / network host when he encountered the bug.

This is not a root cause of course, but could point someone more knowledgeable in the right direction.

ruffsl commented 6 months ago

> I'm curious if @ruffsl was also using Docker / network host when he encountered the bug.

@usedhondacivic, I think I probably was using the host network interface, as I was mainly using dev containers for experimentation and semi-isolation for this project.

Perhaps this has something to do with unusual differences in process namespace isolation in containers versus matching host names with host network interfaces, throwing off ZeroMQ, similar to what I've experienced with DDS and shared memory transport?