NVIDIA-ISAAC-ROS / isaac_ros_apriltag

NVIDIA-accelerated Apriltag detection and pose estimation.
https://developer.nvidia.com/isaac-ros-gems
Apache License 2.0

Cuda Error after I change output size to 1280x720. #18

Closed lymnxn closed 1 year ago

lymnxn commented 1 year ago

Hello, I'm using an IMX219 camera that can output 1280x720, so I changed the rectify_node output size to 1280x720. Now I get this error:

[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:97 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_gradient_x2, sizeof(float) * gradient_w * gradient_h)"
[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:98 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_gradient_y2, sizeof(float) * gradient_w * gradient_h)"
[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:99 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_gradient_xy, sizeof(float) * gradient_w * gradient_h)"
[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:100 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_blur_gradient_x2, sizeof(float) * blur_gradient_w * blur_gradient_h)"
[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:101 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_blur_gradient_y2, sizeof(float) * blur_gradient_w * blur_gradient_h)"
[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:102 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_blur_gradient_xy, sizeof(float) * blur_gradient_w * blur_gradient_h)"
[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:104 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_maxima_suppression, sizeof(bool) * corners_w * corners_h)"
[component_container_mt-1] CUDA error at external/libapriltag/april_tagging/corner_detection/corner_detector.cpp:187 code=700(cudaErrorIllegalAddress) "cudaStreamWaitEvent(x2_stream, sync_event_x2, 0)"

hemalshahNV commented 1 year ago

What platform are you running this on (x86_64 w/ GPU or Jetson, in or outside of Isaac ROS Dev base container)? At first glance, it seems AprilTag ran out of GPU memory. Have you tried at smaller resolutions just to see if it works at all?
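
For a sense of scale, here is a back-of-the-envelope estimate of the buffers named in the failing cudaMalloc calls. It is only a sketch: it assumes the gradient, blurred-gradient and corner maps are about the same size as the input image (the library may trim a few border pixels), and the helper names `buffer_bytes` and `estimate_total_bytes` are made up for illustration.

```python
# Rough size of the buffers named in the cudaMalloc errors in the log above.
# Assumption: gradient_w/h, blur_gradient_w/h and corners_w/h are roughly the
# image width and height; the real library may use slightly smaller maps.

SIZEOF_FLOAT = 4  # bytes, sizeof(float) in CUDA
SIZEOF_BOOL = 1   # bytes, sizeof(bool) on typical CUDA platforms


def buffer_bytes(width: int, height: int, element_size: int) -> int:
    """Bytes needed for one dense width x height buffer."""
    return width * height * element_size


def estimate_total_bytes(width: int, height: int) -> int:
    """Sum of the buffers listed in the log, under the assumption above."""
    float_buffers = 6  # d_gradient_{x2,y2,xy} and d_blur_gradient_{x2,y2,xy}
    bool_buffers = 1   # d_maxima_suppression
    return (float_buffers * buffer_bytes(width, height, SIZEOF_FLOAT)
            + bool_buffers * buffer_bytes(width, height, SIZEOF_BOOL))


if __name__ == "__main__":
    w, h = 1280, 720
    one = buffer_bytes(w, h, SIZEOF_FLOAT)
    total = estimate_total_bytes(w, h)
    print(f"one float buffer:   {one / 2**20:.2f} MiB")
    print(f"all listed buffers: {total / 2**20:.2f} MiB")
```

Under that assumption each float buffer at 1280x720 is about 3.5 MiB and the listed buffers add up to roughly 22 MiB. If the estimate is in the right ballpark, the resolution alone would not exhaust memory, and the failures would point to memory already being in use elsewhere.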

lymnxn commented 1 year ago

I'm running on a Jetson Xavier NX (8 GB version) with a CSI IMX219 camera, inside the container.

hemalshahNV commented 1 year ago

Is there anything else running with the Apriltag node that could be eating up GPU memory perhaps?
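
One quick way to check is to look at available system memory just before launching the node. The sketch below assumes that on a Jetson the integrated GPU allocates from the same DRAM as the CPU, so `MemAvailable` in `/proc/meminfo` is only a rough proxy for what cudaMalloc can obtain. The function names `parse_meminfo` and `available_mib` are illustrative, not part of any Isaac ROS API.

```python
# Rough check of available memory on a Linux system such as a Jetson.
# Assumption: the integrated GPU shares DRAM with the CPU, so MemAvailable
# approximates (but does not equal) what CUDA allocations can use.
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field_name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # memory fields are in kB
    return fields


def available_mib(meminfo_path: str = "/proc/meminfo"):
    """Return MemAvailable in MiB, or None if it cannot be read."""
    path = Path(meminfo_path)
    if not path.is_file():
        return None
    kib = parse_meminfo(path.read_text()).get("MemAvailable")
    return None if kib is None else kib / 1024


if __name__ == "__main__":
    mib = available_mib()
    print("MemAvailable:", "unknown" if mib is None else f"{mib:.0f} MiB")
```

Running this once with only the camera and rectify nodes up, and again with the AprilTag node added, would give a rough picture of how much headroom is left.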

nakai-omer commented 1 year ago

@hemalshahNV We are also experiencing this issue with a similar camera:

[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:84 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_normalized_input_image, size_t(sizeof(float) * image_w * image_h))" 
[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:97 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_gradient_x2, sizeof(float) * gradient_w * gradient_h)" 
[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:98 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_gradient_y2, sizeof(float) * gradient_w * gradient_h)" 
[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:99 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_gradient_xy, sizeof(float) * gradient_w * gradient_h)" 
[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:100 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_blur_gradient_x2, sizeof(float) * blur_gradient_w * blur_gradient_h)" 
[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:101 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_blur_gradient_y2, sizeof(float) * blur_gradient_w * blur_gradient_h)" 
[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:102 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_blur_gradient_xy, sizeof(float) * blur_gradient_w * blur_gradient_h)" 
[component_container_mt-1] CUDA error at /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/corner_detection/corner_detector.cpp:104 code=2(cudaErrorMemoryAllocation) "cudaMalloc((void**)&d_maxima_suppression, sizeof(bool) * corners_w * corners_h)" 
[component_container_mt-1] component_container_mt: /workspaces/isaac_ros-dev/lib/src/nvapriltags/april_tagging/april_tag.cpp:20: void AprilTags::DetectAndDecodeAprilTags(const DeviceDataView<uchar3>&, cuAprilTagsID_t*, SizeT*, SizeT, cudaStream_t): Assertion `input_image.size_.x == w && input_image.size_.y == h' failed.

After a reboot we get:

[component_container_mt-1] 2023-07-17 14:30:22.939 ERROR /workspaces/isaac_ros-dev/src/isaac_ros_image_pipeline/isaac_ros_image_proc/gxf/tensorops/extensions/tensorops/components/ImageUtils.cpp@117: invalid distortion type.
[component_container_mt-1] 2023-07-17 14:30:22.939 ERROR /workspaces/isaac_ros-dev/src/isaac_ros_image_pipeline/isaac_ros_image_proc/gxf/tensorops/extensions/tensorops/components/TensorOperator.cpp@233: operation failed.
[component_container_mt-1] 2023-07-17 14:30:22.943 ERROR gxf/std/entity_executor.cpp@509: Failed to tick codelet undistort_algo in entity: INMNALGWED_rectifier code: GXF_FAILURE
[component_container_mt-1] 2023-07-17 14:30:22.944 ERROR gxf/std/entity_executor.cpp@540: Entity [INMNALGWED_rectifier] must be in Lifecycle::kStarted or Lifecycle::kIdle stage before stopping. Current state is Ticking
[component_container_mt-1] 2023-07-17 14:30:22.944 WARN  gxf/std/multi_thread_scheduler.cpp@235: Error while executing entity E74 named 'INMNALGWED_rectifier': GXF_FAILURE

Trying to run again yields:

[component_container_mt-1] 2023-07-17 14:34:06.645 ERROR extensions/hawk/argus_camera.cpp@320: Failed to get CaptureSession interface
[component_container_mt-1] 2023-07-17 14:34:06.645 ERROR extensions/hawk/argus_camera.cpp@651: Error setting up output streams
[component_container_mt-1] 2023-07-17 14:34:06.645 ERROR gxf/std/entity_executor.cpp@540: Entity [DVVNTFMRQP_argus_camera] must be in Lifecycle::kStarted or Lifecycle::kIdle stage before stopping. Current state is StartPending
[component_container_mt-1] 2023-07-17 14:34:06.645 WARN  gxf/std/greedy_scheduler.cpp@241: Error while executing entity 21 named 'DVVNTFMRQP_argus_camera': GXF_FAILURE
[component_container_mt-1] 2023-07-17 14:34:06.751 ERROR gxf/std/entity_executor.cpp@203: Entity with eid 44 not found!

Something doesn't seem stable here.