NVIDIA-ISAAC-ROS / isaac_ros_dnn_inference

NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
https://developer.nvidia.com/isaac-ros-gems
Apache License 2.0

error running examples #1

Closed FPSychotic closed 2 years ago

FPSychotic commented 2 years ago

Hi, I installed Isaac ROS from source, without Docker. When I run the examples (so far the TensorRT and U-Net launch.py files), I get GXF errors and errors about missing files.

ros2 launch ./isaac_ros_tensor_rt.py
[INFO] [launch]: All log files can be found below /home/imother/.ros/log/2021-11-14-20-14-44-133478-imother-799864
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [component_container-1]: process started with pid [800292]
[component_container-1] [INFO] [1636920885.332517312] [isaac_ros_tensor_rt.tensor_rt_container]: Load Library: /home/imother/ros2_isaac_ws/install/isaac_ros_tensor_rt/lib/libtensor_rt_node.so
[component_container-1] [INFO] [1636920885.337121942] [isaac_ros_tensor_rt.tensor_rt_container]: Found class: rclcpp_components::NodeFactoryTemplate<isaac_ros::dnn_inference::TensorRTNode>
[component_container-1] [INFO] [1636920885.337256022] [isaac_ros_tensor_rt.tensor_rt_container]: Instantiate class: rclcpp_components::NodeFactoryTemplate<isaac_ros::dnn_inference::TensorRTNode>
[component_container-1] [INFO] [1636920885.359633408] [tensor_rt]: /home/imother/ros2_isaac_ws/install/isaac_ros_tensor_rt/share/isaac_ros_tensor_rt
[component_container-1] [INFO] [1636920885.363188241] [tensor_rt]: Creating context
[component_container-1] 2021-11-14 20:14:45.476 ERROR gxf/std/extension_loader.cpp@109: librmw_cyclonedds_cpp.so: cannot open shared object file: No such file or directory
[component_container-1] [ERROR] [1636920885.476188550] [tensor_rt]: LoadExtensionManifest Error: GXF_EXTENSION_FILE_NOT_FOUND
[component_container-1] [ERROR] [1636920885.485840148] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486131765] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486270006] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486320694] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486366358] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486512823] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486570327] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486614040] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486653848] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486693592] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486787736] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486840569] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486925337] [tensor_rt]: GXF Entity find failed
[component_container-1] [ERROR] [1636920885.486972697] [tensor_rt]: GXF Entity find failed
[component_container-1] [INFO] [1636920885.487014041] [tensor_rt]: Initializing...
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/tensor_rt' in container '/isaac_ros_tensor_rt/tensor_rt_container'
[component_container-1] [INFO] [1636920885.487970462] [tensor_rt]: Running...

By the way, how do I choose the image topic to run inference on?

hemalshahNV commented 2 years ago

The Docker container helps ensure that library versions and file locations match the configuration we tested. Let's see if we can figure out what's happening here, though. The fault is on the line ERROR gxf/std/extension_loader.cpp@109: librmw_cyclonedds_cpp.so: cannot open shared object file: No such file or directory, which indicates that the internal Isaac ROS libraries could not find the shared libraries for CycloneDDS (they are not on the LD_LIBRARY_PATH). It is strange that GXF complains about not finding CycloneDDS while dynamically loading its own shared libraries, which should not be linked against any DDS implementation, and even stranger that it is looking for CycloneDDS rather than the Foxy default, FastRTPS.

Could you run ldd on the file /workspaces/isaac_ros-dev/ros_ws/install/isaac_ros_nvengine/share/isaac_ros_nvengine/gxf/libgxf_ros_bridge.so and grep for any mention of librmw_cyclonedds_cpp.so? There should be no dependency on a specific DDS implementation, whether CycloneDDS or FastRTPS. Are you running ROS 2 Foxy (where the default DDS is FastRTPS) or something newer? You could also try setting the environment variable RMW_IMPLEMENTATION=rmw_fastrtps_cpp, as in docker/Dockerfile.aarch64.base, and launching again to see if that fixes the issue.
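For anyone repeating the ldd check requested above, here is a minimal Python sketch that lists the dependencies ldd reports as unresolved. The helper names `missing_libraries` and `check_library` are made up for illustration. The sketch assumes ldd prints unresolved entries in its usual `name => not found` form.

```python
import subprocess
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved lines look like "libfoo.so => /path/libfoo.so (0x...)";
        # unresolved ones look like "libfoo.so => not found".
        name, arrow, target = line.partition("=>")
        if arrow and target.strip() == "not found":
            missing.append(name.strip())
    return missing


def check_library(path: str) -> List[str]:
    """Run ldd on a shared object and list its unresolved dependencies."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return missing_libraries(result.stdout)
```

Calling `check_library` with the path to libgxf_ros_bridge.so in your own install tree should return an empty list when every dependency resolves.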

For your other question, the isaac_ros_tensor_rt.launch.py file only brings up the TensorRT inference node and nothing else. You would have to send images to an encoder/pre-processor node, which then sends a TensorList to the TensorRT node; a decoder/post-processor node then interprets the inference result as something useful (bounding box, image mask, etc.). See the launch files in isaac_ros_unet or isaac_ros_dope for examples of bringing up the subgraph that makes this work.
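As a rough illustration of the subgraph described above, the following launch-file sketch places an encoder node in front of the TensorRT node and remaps the encoder input to a camera topic. Only the TensorRT package and plugin names come from the log in this thread. The encoder package and plugin names, the topic names `image`, `encoded_tensor` and `tensor_pub`, and the camera topic are assumptions to check against the isaac_ros_unet or isaac_ros_dope launch files. Model parameters and the decoder node are left out.

```python
# Hypothetical topic name: use whatever your camera driver actually publishes.
CAMERA_IMAGE_TOPIC = '/camera/color/image_raw'


def encoder_remappings(image_topic):
    """Build remappings that feed a camera topic into the encoder node."""
    return [
        ('image', image_topic),            # assumed encoder input topic name
        ('encoded_tensor', 'tensor_pub'),  # assumed encoder output / TensorRT input
    ]


def generate_launch_description():
    # ROS 2 imports live inside the function so the helper above can be
    # inspected on a machine without ROS installed.
    from launch import LaunchDescription
    from launch_ros.actions import ComposableNodeContainer
    from launch_ros.descriptions import ComposableNode

    encoder_node = ComposableNode(
        name='dnn_image_encoder',
        package='isaac_ros_dnn_encoders',                        # assumption
        plugin='isaac_ros::dnn_inference::DnnImageEncoderNode',  # assumption
        remappings=encoder_remappings(CAMERA_IMAGE_TOPIC),
    )
    tensor_rt_node = ComposableNode(
        name='tensor_rt',
        package='isaac_ros_tensor_rt',
        plugin='isaac_ros::dnn_inference::TensorRTNode',  # as shown in the log
        # Model path and tensor binding parameters are omitted in this sketch.
    )
    container = ComposableNodeContainer(
        name='tensor_rt_container',
        namespace='isaac_ros_tensor_rt',
        package='rclcpp_components',
        executable='component_container',
        composable_node_descriptions=[encoder_node, tensor_rt_node],
        output='screen',
    )
    return LaunchDescription([container])
```

Choosing the image source then comes down to changing `CAMERA_IMAGE_TOPIC`; running `ros2 topic list` while the camera driver is up shows the topic names that are actually available.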

FPSychotic commented 2 years ago

Could you check with ldd the file in /workspaces/isaac_ros-dev/ros_ws/install/isaac_ros_nvengine/share/isaac_ros_nvengine/gxf/libgxf_ros_bridge.so and grep for any mention of librmw_cyclonedds_cpp.so? There should not be any such dependency on a specific DDS implementation, CycloneDDS or FastRTPS. Are you running with ROS2 Foxy (DDS default is FastRTPS) or something newer? You could try setting the env variable

Yes, there is a reference to that library. I'm on Ubuntu 20.04 with Foxy on a Jetson Xavier, JetPack 4.6. I installed Foxy from apt and later added other workspaces on top of it with colcon build; the Isaac VO gem works. Setting the variable in the same terminal before running the TensorRT launch didn't work.

ldd libgxf_ros_bridge.so
linux-vdso.so.1 (0x0000007fa2f5e000)
libgxf_core.so => /home/imother/ros2_isaac_ws/install/isaac_ros_nvengine/share/isaac_ros_nvengine/gxf/./core/libgxf_core.so (0x0000007fa2d51000)
libcudart.so.10.2 => /usr/local/cuda/lib64/libcudart.so.10.2 (0x0000007fa2cc9000)
libisaac_ros_nvengine_interfaces__rosidl_typesupport_cpp.so => /home/imother/ros2_isaac_ws/install/isaac_ros_nvengine_interfaces/lib/libisaac_ros_nvengine_interfaces__rosidl_typesupport_cpp.so (0x0000007fa2cb6000)
liblibstatistics_collector.so => /opt/ros/foxy/lib/liblibstatistics_collector.so (0x0000007fa2ca0000)
librcl.so => /opt/ros/foxy/lib/librcl.so (0x0000007fa2c59000)
librclcpp.so => /opt/ros/foxy/lib/librclcpp.so (0x0000007fa2a8f000)
librcutils.so => /opt/ros/foxy/lib/librcutils.so (0x0000007fa2a6a000)
librmw_cyclonedds_cpp.so=> not found

The reference is on the last line above. Thanks!

FPSychotic commented 2 years ago

Well, I can give a small update. I solved the CycloneDDS library error. I found a somewhat similar error at https://github.com/ros2/rmw_cyclonedds/issues/182. I added this to my .bash:

export LD_LIBRARY_PATH=/opt/ros/foxy/lib/aarch_64-linux-gnu${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

and installed this:

sudo apt install ros-foxy-rmw-cyclonedds-cpp

This solved the previous error, but when I execute

r:~/ros2_isaac_ws/src/isaac_ros_dnn_inference-main/isaac_ros_tensor_rt/launch$ ros2 launch ./isaac_ros_tensor_rt.py

I get this new error, which I hope is easier to debug and more related to NVIDIA:

[INFO] [launch]: All log files can be found below /home/imother/.ros/log/2021-12-18-22-50-17-500483-imother-23265
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [component_container-1]: process started with pid [23278]
[component_container-1] [INFO] [1639867818.397675668] [isaac_ros_tensor_rt.tensor_rt_container]: Load Library: /home/imother/ros2_isaac_ws/install/isaac_ros_tensor_rt/lib/libtensor_rt_node.so
[component_container-1] [INFO] [1639867818.403629776] [isaac_ros_tensor_rt.tensor_rt_container]: Found class: rclcpp_components::NodeFactoryTemplate<isaac_ros::dnn_inference::TensorRTNode>
[component_container-1] [INFO] [1639867818.403761808] [isaac_ros_tensor_rt.tensor_rt_container]: Instantiate class: rclcpp_components::NodeFactoryTemplate<isaac_ros::dnn_inference::TensorRTNode>
[component_container-1] [INFO] [1639867818.421617378] [tensor_rt]: /home/imother/ros2_isaac_ws/install/isaac_ros_tensor_rt/share/isaac_ros_tensor_rt
[component_container-1] [INFO] [1639867818.421752098] [tensor_rt]: Creating context
[component_container-1] [INFO] [1639867818.542843495] [tensor_rt]: Loading app: '/home/imother/ros2_isaac_ws/install/isaac_ros_tensor_rt/share/isaac_ros_tensor_rt/config/tensor_rt_inference.yaml'
[component_container-1] [INFO] [1639867818.559092443] [tensor_rt]: Initializing...
[component_container-1] [INFO] [1639867818.782266291] [tensor_rt]: Running...
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/tensor_rt' in container '/isaac_ros_tensor_rt/tensor_rt_container'
[component_container-1] [libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:9: Message type "onnx2trt_onnx.ModelProto" has no field named "version".
[component_container-1] 2021-12-18 22:50:20.753 ERROR gxf/tensor_rt/tensor_rt_inference.cpp@143: TRT ERROR: ModelImporter.cpp:682: Failed to parse ONNX model from file: /home/imother/ros2_isaac_ws/src/isaac_ros_dnn_inference-main/isaac_ros_tensor_rt/launch/../../test/models/mobilenetv2-1.0.onnx
[component_container-1] 2021-12-18 22:50:20.753 ERROR gxf/tensor_rt/tensor_rt_inference.cpp@463: Failed to parse ONNX file /home/imother/ros2_isaac_ws/src/isaac_ros_dnn_inference-main/isaac_ros_tensor_rt/launch/../../test/models/mobilenetv2-1.0.onnx
[component_container-1] 2021-12-18 22:50:20.753 ERROR gxf/tensor_rt/tensor_rt_inference.cpp@276: Failed to create engine plan for model /home/imother/ros2_isaac_ws/src/isaac_ros_dnn_inference-main/isaac_ros_tensor_rt/launch/../../test/models/mobilenetv2-1.0.onnx.

To be honest, due to my general lack of knowledge, I don't really understand what I need to do to use a D435 with this node. It runs a model, but I cannot see how to choose the video source or topic. Thanks!
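One possible cause of the ONNX parse failure in the log above, not confirmed anywhere in this thread: the protobuf message about an unknown field named "version" at position 1:9 is consistent with the .onnx file being a Git LFS pointer (a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`) rather than the real model. This can happen when a repository is downloaded as an archive instead of cloned with Git LFS. The sketch below checks for that header; the helper name `is_git_lfs_pointer` is made up for illustration.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_git_lfs_pointer(path) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with Path(path).open("rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
```

If the check returns True for mobilenetv2-1.0.onnx, fetching the real file (for example by cloning the repository with Git LFS installed and running `git lfs pull`) would be the next thing to try.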

hemalshahNV commented 2 years ago

Please retry with the latest Isaac ROS release.