NVIDIA-ISAAC-ROS / isaac_ros_image_pipeline

NVIDIA-accelerated ROS2 packages for camera image processing.
https://developer.nvidia.com/isaac-ros-gems
Apache License 2.0

VPI_ERROR_OUT_OF_MEMORY #12

Closed by Finn2708 2 years ago

Finn2708 commented 2 years ago

Hello,

I ran into the issue already mentioned here. Full error message is:

ERROR: VPI_ERROR_OUT_OF_MEMORY: Not enough space for resource allocation

I tried on multiple x86_64 machines with different GPUs (RTX2060S, RTX3070) running Ubuntu 20.04 LTS. Docker container is pulled from NVIDIA-ISAAC-ROS/isaac_ros_common (Docker Version 20.10.15).

I started the default pipeline like this (webcam, 1280x960 @ 30 fps, calibrated):

ros2 run usb_cam usb_cam_node_exe --ros-args --params-file camera_params.yaml
ros2 run isaac_ros_image_proc isaac_ros_image_proc

Before subscribing to the image_rect topic, VRAM usage is pretty stable. nvidia-smi (on the RTX2060S system) returns:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:2B:00.0  On |                  N/A |
| 29%   41C    P2    38W / 175W |    826MiB /  7973MiB |     13%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1204      G   /usr/lib/xorg/Xorg                 65MiB |
|    0   N/A  N/A      1976      G   /usr/lib/xorg/Xorg                211MiB |
|    0   N/A  N/A      2104      G   /usr/bin/gnome-shell               35MiB |
|    0   N/A  N/A      2511      G   /usr/lib/firefox/firefox          232MiB |
|    0   N/A  N/A     12497      G   ...RendererForSitePerProcess       37MiB |
|    0   N/A  N/A     17581      C   ...proc/isaac_ros_image_proc      109MiB |
+-----------------------------------------------------------------------------+
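
The snapshot above is a single reading. A minimal sketch of polling the same figure once per second, so that any growth after subscribing shows up as a time series, might look like the following. It assumes nvidia-smi is on the PATH and reads only the first GPU it lists. The function names (parse_memory_line, sample_gpu_memory, monitor) are illustrative and not part of isaac_ros_image_pipeline.

```python
import subprocess
import time

# nvidia-smi query mode: prints e.g. "826, 7973" (MiB) for each GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Parse one 'used, total' CSV line (values in MiB) into two ints."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def sample_gpu_memory():
    """Run nvidia-smi once and return (used_mib, total_mib) for the first GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_line(result.stdout.splitlines()[0])


def monitor(interval_s=1.0, samples=90):
    """Print elapsed time and total GPU memory usage once per interval."""
    start = time.monotonic()
    for _ in range(samples):
        used, total = sample_gpu_memory()
        print(f"{time.monotonic() - start:6.1f}s  {used:5d} / {total} MiB")
        time.sleep(interval_s)
```

Calling monitor() in a second terminal before running ros2 topic echo /image_rect would record the whole run, instead of only a before and an after snapshot.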

Roughly 60 seconds after subscribing to image_rect with ros2 topic echo /image_rect, VPI fails to allocate the required memory:

[ERROR] [1652183129.777531821] [rectify_mono]: Error while rectifying image: /workspaces/isaac_ros-dev/colcon_ws/src/isaac_ros_image_pipeline/isaac_ros_image_proc/src/rectify_node.cpp:305: VPI_ERROR_OUT_OF_MEMORY: Not enough space for resource allocation
[ERROR] [1652183129.808515730] [rectify_mono]: Error while rectifying image: /workspaces/isaac_ros-dev/colcon_ws/src/isaac_ros_image_pipeline/isaac_ros_image_proc/src/rectify_node.cpp:305: VPI_ERROR_OUT_OF_MEMORY: Not enough space for resource allocation
[WARN] [1652183129.851698683] [image_format_mono]: Exception: /workspaces/isaac_ros-dev/colcon_ws/src/isaac_ros_image_pipeline/isaac_ros_image_proc/src/image_format_converter_node.cpp:67: VPI_ERROR_OUT_OF_MEMORY: Not enough space for resource allocation
[INFO] [1652183129.851795993] [image_format_mono]: Attempting conversion using OpenCV
[ERROR] [1652183129.858646485] [rectify_mono]: Error while rectifying image: /workspaces/isaac_ros-dev/colcon_ws/src/isaac_ros_image_pipeline/isaac_ros_image_proc/src/rectify_node.cpp:305: VPI_ERROR_OUT_OF_MEMORY: Not enough space for resource allocation
[ERROR] [1652183129.877362850] [rectify_mono]: Error while rectifying image: /workspaces/isaac_ros-dev/colcon_ws/src/isaac_ros_image_pipeline/isaac_ros_image_proc/src/rectify_node.cpp:305: VPI_ERROR_OUT_OF_MEMORY: Not enough space for resource allocation

At this point, nvidia-smi (same RTX2060S system) shows the GPU memory almost completely used:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:2B:00.0  On |                  N/A |
| 29%   41C    P2    39W / 175W |   7947MiB /  7973MiB |     31%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1204      G   /usr/lib/xorg/Xorg                 65MiB |
|    0   N/A  N/A      1976      G   /usr/lib/xorg/Xorg                223MiB |
|    0   N/A  N/A      2104      G   /usr/bin/gnome-shell               70MiB |
|    0   N/A  N/A      2511      G   /usr/lib/firefox/firefox          233MiB |
|    0   N/A  N/A     12497      G   ...RendererForSitePerProcess       37MiB |
|    0   N/A  N/A     17581      C   ...proc/isaac_ros_image_proc      119MiB |
+-----------------------------------------------------------------------------+
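
As a rough back-of-the-envelope check, the two snapshots can be turned into an approximate growth rate. The sketch below uses only figures from this report (826 MiB before, 7947 MiB after, roughly 60 seconds, 1280x960 @ 30 fps). The 60 seconds and the assumption that every frame contributes equally are approximations, and the bytes-per-pixel values are illustrative guesses about the image encoding, not something the logs confirm.

```python
# Figures taken from the two nvidia-smi snapshots and the camera settings above.
USED_BEFORE_MIB = 826
USED_AFTER_MIB = 7947
ELAPSED_S = 60          # "roughly 60 seconds", so results are approximate
FRAME_RATE_HZ = 30
WIDTH, HEIGHT = 1280, 960


def growth_per_second(before_mib, after_mib, elapsed_s):
    """Average memory growth in MiB per second between two snapshots."""
    return (after_mib - before_mib) / elapsed_s


def growth_per_frame(before_mib, after_mib, elapsed_s, fps):
    """Average memory growth in MiB per frame, assuming a constant frame rate."""
    return growth_per_second(before_mib, after_mib, elapsed_s) / fps


def frame_size_mib(width, height, bytes_per_pixel):
    """Size of one uncompressed image in MiB."""
    return width * height * bytes_per_pixel / 2**20


rate = growth_per_second(USED_BEFORE_MIB, USED_AFTER_MIB, ELAPSED_S)
per_frame = growth_per_frame(
    USED_BEFORE_MIB, USED_AFTER_MIB, ELAPSED_S, FRAME_RATE_HZ
)
print(f"growth: {rate:.1f} MiB/s, {per_frame:.2f} MiB per frame")

# Assumed encodings for comparison: 1 byte (mono), 3 bytes (RGB), 4 bytes (RGBA).
for bpp in (1, 3, 4):
    print(f"{bpp} byte(s)/pixel: {frame_size_mib(WIDTH, HEIGHT, bpp):.2f} MiB")
```

Under these assumptions the growth works out to roughly 4 MiB per frame, the same order of magnitude as one uncompressed 1280x960 colour image. That would be consistent with an image-sized allocation not being released for each frame, but the numbers alone do not prove it.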

On the RTX2060S, the pipeline stops rectifying most frames and publishes the non-rectified frames instead, while on the RTX3070 the pipeline crashes after a few VPI_ERROR_OUT_OF_MEMORY messages. I have attached the logs from the RTX2060S setup: isaac_ros_image_proc_95128_1652183865298.log and usb_cam_node_exe_95110_1652183858680.log
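
To see which node and source line produce the errors, logs like these can be tallied with a short script. This is a sketch that assumes the attached log files use the same line format as the console output quoted above, which has not been verified. The helper name tally_oom is made up for illustration.

```python
import re
from collections import Counter

# Matches console lines of the form:
#   [LEVEL] [seconds.nanoseconds] [node_name]: message
LINE_RE = re.compile(
    r"^\[(?P<level>\w+)\]\s+\[(?P<stamp>\d+\.\d+)\]\s+"
    r"\[(?P<node>[^\]]+)\]:\s+(?P<message>.*)$"
)
# Picks the source file name and line number reported just before the VPI error.
SOURCE_RE = re.compile(
    r"(?P<file>[^/\s:]+\.cpp):(?P<line>\d+): VPI_ERROR_OUT_OF_MEMORY"
)


def tally_oom(lines):
    """Count VPI_ERROR_OUT_OF_MEMORY messages per (node, source file, line)."""
    counts = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # not a ROS-style log line
        source = SOURCE_RE.search(match.group("message"))
        if source:
            key = (
                match.group("node"),
                source.group("file"),
                int(source.group("line")),
            )
            counts[key] += 1
    return counts
```

Passing an open log file to tally_oom(...) returns a Counter keyed by node name, source file and line number. For the six console lines quoted earlier, that gives four hits for rectify_mono at rectify_node.cpp:305 and one for image_format_mono at image_format_converter_node.cpp:67.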

After cancelling the pipeline, the memory is freed immediately.

After commenting out the nodes of the provided pipeline one by one, I believe the issue stems from the rectify_node. I tried to pinpoint the issue further, but unfortunately I'm not very familiar with the VPI framework.

I can provide more info if required, but I'm not sure what would be useful. I also have an old Quadro P2200 available that I haven't tested yet, but would be able to if that were of any help.

Kind regards, Finn

hemalshahNV commented 2 years ago

Thanks for the report! Our SQA has been able to reproduce the problem and we're qualifying a fix for this. We'll publish it as a hotfix as soon as trials are complete.

Finn2708 commented 2 years ago

Thanks for the quick fix, I will test this on Monday!

Finn2708 commented 2 years ago

I can confirm that this fixed my issue. Thanks!