Closed daisukes closed 4 years ago
Since it segfaults in malloc, could it be that you are running out of memory when this happens?
No. I tried it again. I have 24 GB of memory, and the system was using only 7 GB in total when I got the error again. I'm running ROS2 (Ubuntu 20.04) as a docker container on an Ubuntu 16.04 host. Do you think that could be a problem? I didn't set any memory limits for the docker containers.
@dirk-thomas Thanks to the hint, I may have fixed the issue by increasing the stack size. It has been running for over 20 minutes so far. Does that make sense to you?
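To double-check the "no memory limits" assumption from inside the container, a minimal sketch along these lines could read the cgroup limit directly. The file paths are the usual cgroup v2 and v1 locations on Linux and are assumed here, not confirmed for this setup; the function names are hypothetical.

```python
from pathlib import Path

# Usual locations of the memory limit inside a container (assumed, not verified here).
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),                     # cgroup v2
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),   # cgroup v1
)


def parse_limit(text):
    """Return the limit in bytes, or None when the value means 'unlimited'."""
    text = text.strip()
    if text == "max":  # cgroup v2 spelling of "no limit"
        return None
    value = int(text)
    # cgroup v1 reports a huge number (close to 2**63) when no limit is set.
    return None if value >= 1 << 60 else value


def container_memory_limit(files=CGROUP_LIMIT_FILES):
    """Return the first readable memory limit in bytes, or None if none is found."""
    for path in files:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable: try the next location
    return None


if __name__ == "__main__":
    limit = container_memory_limit()
    print("no cgroup memory limit found" if limit is None else f"{limit} bytes")
```

A result of `None` only means that no limit was found in these two files; it does not rule out other causes of a failing allocation.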
$ ulimit -s
8192
$ ulimit -s 65536
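The same limit can be inspected from Python with the standard `resource` module, which may help to confirm what value a launched process actually inherits. This is only a sketch: the helper names are hypothetical, and it assumes a Linux system where `ulimit -s` reports the soft `RLIMIT_STACK` in KiB.

```python
import resource


def stack_limit_kib():
    """Return (soft, hard) RLIMIT_STACK in KiB; None stands for 'unlimited'."""
    def to_kib(value):
        return None if value == resource.RLIM_INFINITY else value // 1024

    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return to_kib(soft), to_kib(hard)


def raise_stack_limit(kib):
    """Raise the soft stack limit to at least `kib` KiB, capped by the hard limit.

    Does nothing if the current soft limit is already large enough.
    Returns the resulting soft limit in KiB (None for unlimited).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    wanted = kib * 1024
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return stack_limit_kib()[0]  # already sufficient, leave it untouched
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)  # the soft limit cannot exceed the hard limit
    resource.setrlimit(resource.RLIMIT_STACK, (wanted, hard))
    return stack_limit_kib()[0]


if __name__ == "__main__":
    print(stack_limit_kib())
```

The new limit applies to the calling process and to child processes started afterwards, much like `ulimit -s` applies to the current shell and the commands launched from it.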
Since 8192 is the default on Ubuntu, and that configuration works for many developers and users as well as our CI infrastructure, I don't think increasing it should be necessary.
A debug build and using gdb to look into how much memory the failing invocation tries to allocate might help to narrow down the problem.
I tried a debug build. After that, for some reason, I could not reproduce the segmentation fault with either the debug or the release build.
While building, my system ran out of disk space and my docker environment broke, so I had to clean all images and build from scratch. Something might have been wrong with my docker images.
If I get the same error again, I will be back here, but I think this issue can be closed. Thank you for your help!
Bug report
It looks like something goes wrong when deserializing a LaserScan message, about once in several thousand messages. It happens at a random point between 30 seconds and several minutes after launch, and the /scan message is published at 10 Hz. I have been running my ROS1 system together with ROS2 navigation. ROS1 + Gazebo simulates a Velodyne lidar, converts its output to LaserScan, and transfers it to ROS2 via a bridge.
I'm not sure whether this is a bug here or a problem with my bridge configuration. Could you help me fix this?
Required Info:
Steps to reproduce issue
It is difficult because my project is not public yet, but here is the backtrace of the process. This can happen in a process that subscribes to the /scan message.