IntelRealSense / realsense-ros

ROS Wrapper for Intel(R) RealSense(TM) Cameras
http://wiki.ros.org/RealSense
Apache License 2.0

[D457] Significant CPU usage when streaming D457 #2908

Closed ryank-cobot closed 12 months ago

ryank-cobot commented 1 year ago

Hello,

We are running into an issue when running the ROS 2 RealSense node for our D457 cameras connected over GMSL. Each camera seems to take up about one CPU core, which was unexpected for us. Our cameras are on firmware 5.15.1, our realsense-ros is version 4.54.1, and librealsense is version 2.54.2. The compute platform is a Connect Tech Anvil: https://www.wdlsystems.com/connect-tech-esg621-0qr.

Here is our top output:

user@devkit2-hat:~/apollo$ top

top - 18:41:36 up 21:00,  6 users,  load average: 12.83, 7.09, 3.80
Tasks: 392 total,   7 running, 385 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.5 us, 49.2 sy,  0.0 ni, 47.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  62796.3 total,  34166.9 free,   2724.0 used,  25905.4 buff/cache
MiB Swap:  31398.1 total,  31398.1 free,      0.0 used.  58769.9 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                       
 265876 root      20   0       0      0      0 R 100.0   0.0   1:36.89 vi-output, DS5                                                                                
 266060 root      20   0       0      0      0 R 100.0   0.0   1:38.59 vi-output, DS5                                                                                
 266116 root      20   0       0      0      0 R 100.0   0.0   1:31.09 vi-output, DS5                                                                                
 266124 root      20   0       0      0      0 R 100.0   0.0   1:27.12 vi-output, DS5                                                                                
 266131 root      20   0       0      0      0 R 100.0   0.0   1:27.87 vi-output, DS5                                                                                
 265745 root      20   0       0      0      0 R  93.8   0.0   1:42.51 vi-output, DS5                                                                                
 265618 user     20   0  980316  69892  40580 S  18.8   0.1   0:11.98 realsense2_came                                                                               
 265614 user     20   0  980316  67436  40728 S   6.2   0.1   0:13.56 realsense2_came                                                                               
 265616 user     20   0  980316  68664  40168 S   6.2   0.1   0:12.91 realsense2_came                                                                               
 265620 user     20   0  980316  69436  40680 S   6.2   0.1   0:11.55 realsense2_came                                                                               
 266638 user     20   0   11836   3372   2648 R   6.2   0.0   0:00.01 top       

Please let me know what other information you need on my end and thanks so much for the help in advance.

Best, Ryan
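
For readers who want to total the CPU share per process name from output like the one above, here is a minimal Python sketch. It assumes the default 12-column top layout shown in this thread (PID through COMMAND) and keeps the last column whole, because names such as "vi-output, DS5" contain a space. The names TopRow, parse_top_rows and cpu_by_command are hypothetical helpers written for illustration; they are not part of librealsense or realsense-ros.

```python
from typing import Dict, List, NamedTuple


class TopRow(NamedTuple):
    pid: int
    user: str
    cpu_percent: float
    command: str


def parse_top_rows(text: str) -> List[TopRow]:
    """Parse process rows from default-layout `top` output (assumed 12 columns)."""
    rows = []
    for line in text.splitlines():
        # Split into at most 12 fields so COMMAND keeps its internal spaces.
        fields = line.split(None, 11)
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # summary lines, the header row, shell prompts, blanks
        try:
            cpu = float(fields[8])  # %CPU is the 9th column
        except ValueError:
            continue
        rows.append(TopRow(int(fields[0]), fields[1], cpu, fields[11].strip()))
    return rows


def cpu_by_command(rows: List[TopRow]) -> Dict[str, float]:
    """Sum %CPU over all rows that share the same COMMAND string."""
    totals: Dict[str, float] = {}
    for row in rows:
        totals[row.command] = totals.get(row.command, 0.0) + row.cpu_percent
    return totals
```

One way to use it is to save a batch-mode snapshot (for example with top -b -n 1) to a file, pass the file contents to parse_top_rows, and print cpu_by_command for the result. A "vi-output, DS5" total close to 100 per streaming camera would match what is reported here.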

MartyG-RealSense commented 1 year ago

Hi @ryank-cobot By default the librealsense SDK only uses a single CPU core.

It can utilize multiple cores when performing depth-color alignment (align_depth.enable) if the SDK is built from source code with CMake with the build flag -DBUILD_WITH_OPENMP=TRUE included in the CMake build instruction. If you are not using depth-color alignment then using this flag likely would not benefit you though.

ryank-cobot commented 1 year ago

Hi @MartyG-RealSense, thank you for the response. The issue also occurs with a single sensor, which takes up an entire core. My main concern is that just running the sensor uses almost an entire core, as shown by the vi-output, DS5 process that spawns when I start the RealSense node. Is this amount of CPU usage expected when running a RealSense D457 over GMSL?

user@devkit2-hat:~$ top

top - 15:43:32 up 19:37,  5 users,  load average: 1.44, 1.34, 5.25
Tasks: 363 total,   2 running, 361 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.4 us,  8.6 sy,  0.0 ni, 89.7 id,  0.0 wa,  0.1 hi,  0.2 si,  0.0 st
MiB Mem :  62796.3 total,  57662.1 free,   2156.6 used,   2977.5 buff/cache
MiB Swap:  31398.1 total,  31398.1 free,      0.0 used.  59432.8 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                       
 117898 root      20   0       0      0      0 R  97.4   0.0   0:25.37 vi-output, DS5                                                                                
 117868 user     20   0  773868  58452  35308 S  17.5   0.1   0:04.91 realsense2_came   

MartyG-RealSense commented 1 year ago

Is your application using depth-color alignment, pointcloud or post-processing filters, please?

ryank-cobot commented 1 year ago

Hi Marty,

Our application is not using depth-color alignment, pointcloud, or post processing filters.

I attached a D455 over USB and ran rs_launch.py, and the CPU usage is normal. I believe there is an error in the way librealsense connects to the GMSL camera, which is causing this high CPU usage. Attached is a top output from using just the USB camera. You can see there is no vi-output process.

user@orin:~/apollo$ top

top - 14:02:00 up 7 days,  1:04,  2 users,  load average: 0.30, 0.51, 0.94
Tasks: 328 total,   1 running, 327 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  1.2 sy,  0.0 ni, 97.5 id,  0.1 wa,  0.4 hi,  0.2 si,  0.0 st
MiB Mem :  30587.4 total,  13839.4 free,   1535.0 used,  15213.0 buff/cache
MiB Swap:  15293.7 total,  15291.7 free,      2.0 used.  28625.0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                                                         
 331015 user     20   0   31.4g 112648  56428 S  11.3   0.4   0:06.08 realsense2_came
   330 root      39  19       0      0      0 S   4.3   0.0   0:02.31 nvmap-bz                                                                                                                                                                        
 331044 root       0 -20       0      0      0 I   1.7   0.0   0:00.22 kworker/u25:0-uvcvideo                                                                                                                                                          
   2197 root      20   0 1316424  90416  17732 S   1.3   0.3 105:48.63 tailscaled                                                                                                                                                                      
 331429 root       0 -20       0      0      0 I   1.3   0.0   0:00.29 kworker/u25:2-uvcvideo                                                                                                                                                          
    465 root      rt   0       0      0      0 S   0.7   0.0   0:34.99 sugov:0                         

MartyG-RealSense commented 1 year ago

I will highlight to my Intel RealSense colleagues on the ROS team your issue with maxed-out CPU usage on a GMSL connection, whilst usage on a USB connection is much lower. Thanks very much for your patience!

ryank-cobot commented 1 year ago

Awesome, thank you so much @MartyG-RealSense. Looking forward to resolving this.

dmipx commented 1 year ago

Hi. Can you try increasing the number of backend buffers the SDK uses? In src/platform/uvc-device.h, change

const uint8_t DEFAULT_V4L2_FRAME_BUFFERS = 4; to const uint8_t DEFAULT_V4L2_FRAME_BUFFERS = 8;

ryank-cobot commented 1 year ago

Hi @dmipx,

That file does not currently exist for us as we are using release v2.54.2. Should we try the development branch instead?

ryank-cobot commented 1 year ago

Hi @dmipx,

I was able to find that same variable in the src/backend.h file in the release we are using, and I updated the value from 4 to 8. However, it did not change the performance.

Thanks, Ryan
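
For anyone who wants to repeat the buffer-count experiment described above across several checkouts, here is a minimal Python sketch of the edit. It only rewrites the value of DEFAULT_V4L2_FRAME_BUFFERS in header text that the caller has already read. The function name set_frame_buffers is hypothetical, and the 1 to 255 bound is simply an assumption based on the constant being a uint8_t. The file to edit depends on the librealsense version: this thread mentions src/platform/uvc-device.h and, for release v2.54.2, src/backend.h.

```python
import re

# Matches: const uint8_t DEFAULT_V4L2_FRAME_BUFFERS = <number>;
_FRAME_BUFFERS = re.compile(
    r"(const\s+uint8_t\s+DEFAULT_V4L2_FRAME_BUFFERS\s*=\s*)(\d+)(\s*;)"
)


def set_frame_buffers(header_text: str, new_value: int) -> str:
    """Return header_text with DEFAULT_V4L2_FRAME_BUFFERS set to new_value."""
    if not 1 <= new_value <= 255:
        raise ValueError("new_value must fit in a uint8_t and be at least 1")
    patched, count = _FRAME_BUFFERS.subn(
        lambda m: f"{m.group(1)}{new_value}{m.group(3)}", header_text
    )
    if count != 1:
        # Refuse to guess if the constant is missing or defined more than once.
        raise ValueError(f"expected exactly one definition, found {count}")
    return patched
```

The SDK has to be rebuilt from source after the header is changed. As reported above, going from 4 to 8 did not change the CPU usage in this case.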

MartyG-RealSense commented 1 year ago

Hi @dmipx Could you continue to assist @ryank-cobot with their question above, please? Thanks!

ryank-cobot commented 1 year ago

Hi @MartyG-RealSense Thank you for the continued follow-up. I am currently receiving support through the RealSense support channels outside of GitHub and we are actively working through this issue. Thank you.

MartyG-RealSense commented 1 year ago

That's great to hear. Thanks very much!

MartyG-RealSense commented 12 months ago

Case closed due to receiving support through other RealSense channels.

mirolpe commented 9 months ago

I'm having the same issue. Did you find a fix for this?

Lorite commented 7 months ago

Hi @MartyG-RealSense @ryank-cobot. I am facing the same issue. How did you solve it? Thank you in advance :)

Therkelsen commented 6 months ago

For anyone else facing this issue, here are some findings from Intel support:

  1. There is a known error-handling issue with the NVIDIA JetPack OS on Orin. NVIDIA NULL vi patch: https://forums.developer.nvidia.com/t/orin-in-jp5-0-2-camera-vi-output-imx1-process-cpu-occupies-100/233703/26
  2. The issue appears to correlate with occurrences of tegra-capture-vi err_data 262144 (which possibly causes the CPU issue due to the error-handling bug in the OS).
  3. The CPU usage is high, but it may not affect system performance (possibly due to its lower priority).
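
To check whether a system shows the err_data messages mentioned in finding 2, a minimal Python sketch like the following can count them in kernel log text. It assumes only that a matching line contains the two tokens quoted above, tegra-capture-vi followed somewhere later by err_data and a decimal number; the exact log line format is not given in this thread and may differ between JetPack releases. The function name count_vi_err_data is hypothetical.

```python
import re
from collections import Counter

# Assumed shape: "... tegra-capture-vi ... err_data <decimal> ..."
_ERR_DATA = re.compile(r"tegra-capture-vi.*?\berr_data\s+(\d+)")


def count_vi_err_data(log_text: str) -> Counter:
    """Count err_data values reported by tegra-capture-vi in kernel log text."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = _ERR_DATA.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

The input can be the captured output of dmesg (which may need root). The value 262144 from finding 2 equals 1 << 18, or 0x40000; what that bit means is not stated here.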

ryank-cobot commented 2 months ago

I wanted to follow up on this and confirm, in case anyone else faces this, that our RealSense sensors no longer have this issue when running on JetPack 6 with librealsense version 2.55.1 and realsense-ros 4.55.1.

MartyG-RealSense commented 2 months ago

Thanks so much @ryank-cobot for the feedback about a resolved problem in your particular situation!