stereolabs / zed-ros-wrapper

ROS wrapper for the ZED SDK
https://www.stereolabs.com/docs/ros/
MIT License

decreasing grab framerate to 30 fps -> unstable multi GMSL cameras start #894

Closed: bmegli closed this issue 1 year ago

bmegli commented 1 year ago

Description

Related to GMSL cameras (ZED-X, ZED-XM)

https://github.com/stereolabs/zed-ros-wrapper/blob/50dda077a5ea72b7f8c00faad5746938331e46c2/zed_wrapper/params/zedx.yaml#L8

Decreasing grab_frame_rate to 30 fps makes starting multiple GMSL cameras at the same time unstable.

Steps to Reproduce

See the "Anything else?" section for now.

Expected Result

Changing grab_frame_rate should not affect camera startup stability.

Actual Result

Decreasing grab_frame_rate to 30 fps makes starting multiple GMSL cameras at the same time unstable.

Starting only 1 camera works as expected.

Waiting for the first camera to finish initializing before starting the second helps a bit, but it is still hit or miss.

Once both cameras have started, they work reliably; only the startup is affected.
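
A minimal sketch of the "start one camera, wait, then start the next" approach described above, with a retry added since the start is hit or miss. Everything here is hypothetical: `start_cameras_sequentially`, the `openers` callables, and the retry count and delay are illustrative names and values, not part of the ZED SDK or the ROS wrapper. Each opener is assumed to return True once its camera has finished initializing.

```python
import time
from typing import Callable, List, Sequence


def start_cameras_sequentially(
    openers: Sequence[Callable[[], bool]],
    retries: int = 3,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bool]:
    """Open cameras one at a time, retrying each before moving on.

    `openers` are hypothetical callables that return True when the
    camera opened successfully (for example, a wrapper around whatever
    launches a single camera node).
    """
    results = []
    for open_camera in openers:
        opened = False
        for attempt in range(retries):
            if open_camera():
                opened = True
                break
            # Pause before the next attempt; skip the pause after the last one.
            if attempt < retries - 1:
                sleep(delay_s)
        results.append(opened)
    return results
```

The `sleep` parameter is injectable only so the helper can be exercised without real delays. This serializes startup and retries; it does not address the underlying instability.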

Warnings

Camera 1 (eventually succeeds)

[ZED-Argus][Timeout] CAM 0 is frozen
[ZED-Argus][Timeout] CAM 0 is frozen
(Argus) Error FileOperationFailed: Failed socket read: Connection reset by peer (in src/rpc/socket/common/SocketUtils.cpp, function readSocket(), line 79)
(Argus) Error FileOperationFailed: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error FileOperationFailed: Receive worker failure, notifying 1 waiting threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 350)
(Argus) Error InvalidState: Argus client is exiting with 1 outstanding client threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 366)
(Argus) Error FileOperationFailed: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error FileOperationFailed: Client thread received an error from socket (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 145)
(Argus) Error FileOperationFailed:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)

Camera 2 (eventually fails)

(Argus) Error EndOfFile: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error EndOfFile: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)

ZED Camera model

ZED-X, ZED-XM


Note: ZED-X/ZED-XM could not be selected in the issue form; the model was edited in afterwards.


Environment

Both cameras are running with neural depth.

Jetson AGX Orin

apt-cache policy stereolabs-nvidia-l4t-kernel-35.2-dtbs
stereolabs-nvidia-l4t-kernel-35.2-dtbs:
  Installed: 5.10.104-tegra-35.2.1-20230124153320
  Candidate: 5.10.104-tegra-35.2.1-20230124153320

Resulting from

sudo apt install /usr/local/zed/drivers/L4T_35.2/stereolabs-zedx-L4T35.2-v0.4.7_max96712.deb 

Anything else?

Workaround

Keep grab_frame_rate at 60 fps

Other notes

ZED_Depth_Viewer

A similar condition can be triggered with two ZED_Depth_Viewer instances, each pointed at a different GMSL camera, with neural depth enabled, by changing the framerate (which restarts the cameras).

So the real problem is below the ROS layer.

First

ZED_Depth_Viewer 
[ZED-Argus][Timeout] CAM 1 is frozen
[ZED-Argus][Timeout] CAM 1 is frozen
(Argus) Error FileOperationFailed: Failed socket read: Connection reset by peer (in src/rpc/socket/common/SocketUtils.cpp, function readSocket(), line 79)
(Argus) Error FileOperationFailed: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error FileOperationFailed: Receive worker failure, notifying 1 waiting threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 350)
(Argus) Error InvalidState: Argus client is exiting with 1 outstanding client threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 366)
(Argus) Error FileOperationFailed: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error FileOperationFailed: Client thread received an error from socket (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 145)
(Argus) Error FileOperationFailed:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
[ZED-Argus][Timeout] CAM 1 is frozen
[ZED-Argus][Timeout] CAM 1 is frozen

Second

ZED_Depth_Viewer 
(Argus) Error EndOfFile: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error EndOfFile: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
Stack trace (most recent call last):
#26   Object "ZED_Depth_Viewer", at 0x41ed6f, in 
#25   Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffa021ae0f, in __libc_start_main
#24   Object "ZED_Depth_Viewer", at 0x41e2fb, in 
#23   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa0893a5b, in QCoreApplication::exec()
#22   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa088b3b7, in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>)
#21   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08e81cb, in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>)
#20   Object "/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0", at 0xffff9ed36c53, in g_main_context_iteration
#19   Object "/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0", at 0xffff9ed36bb3, in 
#18   Object "/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0", at 0xffff9ed36943, in g_main_context_dispatch
#17   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08e7e37, in 
#16   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08e7507, in QTimerInfoList::activateTimers()
#15   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa088cc0b, in QCoreApplication::notifyInternal2(QObject*, QEvent*)
#14   Object "/usr/lib/aarch64-linux-gnu/libQt5Widgets.so.5", at 0xffffa1245ad7, in QApplication::notify(QObject*, QEvent*)
#13   Object "/usr/lib/aarch64-linux-gnu/libQt5Widgets.so.5", at 0xffffa123c4ab, in QApplicationPrivate::notify_helper(QObject*, QEvent*)
#12   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08ba5b7, in QObject::event(QEvent*)
#11   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08c7557, in QTimer::timeout(QTimer::QPrivateSignal)
#10   Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08b9bff, in QMetaObject::activate(QObject*, int, int, void**)
#9    Object "ZED_Depth_Viewer", at 0x41f35b, in 
#8    Object "ZED_Depth_Viewer", at 0x43fe83, in 
#7    Object "ZED_Depth_Viewer", at 0x43fafb, in 
#6    Object "ZED_Depth_Viewer", at 0x437a03, in 
#5    Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa355f857, in sl::Camera::open(sl::InitParameters)
#4    Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa35cc56b, in 
#3    Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa2a47a53, in sl::GMSLInput::close(bool)
#2    Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa2a3d573, in ArgusCamera::close()
#1    Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa2a454bf, in 
#0    Object "/usr/lib/aarch64-linux-gnu/tegra/libnvargus_socketclient.so", at 0xffff9fad0810, in 
Segmentation fault (Address not mapped to object [(nil)])
Segmentation fault (core dumped)

bmegli commented 1 year ago

After installing ZED SDK 4.0.3 I can no longer reproduce this problem from the ROS side.

I am not sure whether it was the SDK, the GMSL grabber driver, or something else that fixed the problem.

If I don't see it again soon I will close the issue.

Myzhar commented 1 year ago

When one of the cameras is not reachable, it is possible that the Argus service is frozen for some reason. You can recover the cameras by restarting the service: sudo service nvargus-daemon restart
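
A minimal sketch of automating that recovery, assuming the "[ZED-Argus][Timeout] CAM N is frozen" lines quoted in this issue are a usable signal that the service is stuck. The function names are hypothetical, and treating any such line as a reason to restart is an assumption, not documented SDK behavior. The restart command is the one given above.

```python
import re
import subprocess
from typing import Callable, Iterable, Set

# Pattern taken from the log lines quoted in this issue.
FROZEN_RE = re.compile(r"\[ZED-Argus\]\[Timeout\] CAM (\d+) is frozen")

# Restart command as given in the comment above.
RESTART_CMD = ["sudo", "service", "nvargus-daemon", "restart"]


def frozen_cameras(log_lines: Iterable[str]) -> Set[int]:
    """Return the indices of cameras reported as frozen in the log."""
    found = set()
    for line in log_lines:
        match = FROZEN_RE.search(line)
        if match:
            found.add(int(match.group(1)))
    return found


def restart_argus_if_frozen(
    log_lines: Iterable[str],
    run: Callable[..., object] = subprocess.run,
) -> bool:
    """Restart nvargus-daemon if any camera is reported frozen.

    Returns True if a restart was issued. Assumption: a "frozen"
    timeout line means the service needs restarting.
    """
    if not frozen_cameras(log_lines):
        return False
    run(RESTART_CMD, check=True)
    return True
```

Restarting the daemon will interrupt any other process that is currently using a camera, so this is probably only suitable before (re)launching all cameras.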

bmegli commented 1 year ago

Thanks.

I can no longer reproduce the problem with ZED_Depth_Viewer either.

So installing ZED SDK 4.0.3 or the GMSL driver somehow fixed it.