orbbec / OrbbecSDK_ROS1

OrbbecSDK ROS wrapper
Apache License 2.0

Poor performance of ROS1 node compared to OrbbecViewer #53

Open marcomasa opened 1 month ago

marcomasa commented 1 month ago

I am unsure how to describe this perfectly, but I am noticing incredibly large performance differences between the ROS1 node and the standalone OrbbecViewer.

In the OrbbecViewer, I am able to effortlessly stream 4K color and maximum-resolution depth images, merge them into a point cloud, and hold a fairly stable 30 fps throughout the session.

When I launch the ROS1 node in a minimal configuration (1280x720), I get very unstable recordings (on various machines) and even complete frame freezes for certain scenes. Is this a known issue?

jian-dong commented 1 month ago

Hi @marcomasa, the performance differences you're noticing between the ROS1 node and the standalone OrbbecViewer are primarily due to how data is handled and processed in each case.

In the OrbbecViewer, the software directly accesses and renders the data with minimal overhead, which allows it to handle high-resolution streams such as 4K color and depth efficiently while maintaining stable frame rates.

On the other hand, the ROS1 node introduces additional overhead related to data transmission, such as publishing and subscribing to topics. This includes serialization, deserialization, and the time it takes to transmit data between processes, which can contribute to performance issues, especially at higher resolutions. The frame freezes and instability you're observing could be due to these transmission overheads and potentially how ROS handles the processing pipeline in your specific setup.

If performance is critical, optimizing the ROS pipeline, reducing message sizes, and offloading heavy computations outside ROS nodes could help mitigate some of these issues.
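To put rough numbers on that transmission overhead, here is a back-of-envelope sketch of how much raw data an uncompressed color topic carries. The resolutions, the 3-byte RGB pixel size, and 30 fps are illustrative assumptions, not values confirmed in this issue:

```python
# Back-of-envelope calculation of the raw (uncompressed) bandwidth a ROS
# image topic must carry. Every extra copy or serialization step in the
# pipeline multiplies this cost.

def stream_bandwidth_mib_s(width: int, height: int, bytes_per_pixel: int, fps: int) -> float:
    """Return the raw stream bandwidth in MiB/s."""
    return width * height * bytes_per_pixel * fps / (1024 ** 2)

for name, (w, h) in {"720p": (1280, 720), "4K": (3840, 2160)}.items():
    bw = stream_bandwidth_mib_s(w, h, 3, 30)
    print(f"{name}: {bw:.1f} MiB/s uncompressed at 30 fps")
```

At 4K this is roughly 700 MiB/s per copy, which is why a single unnecessary serialization step between processes can already dominate the frame budget.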

marcomasa commented 1 month ago

Thank you for the details!

I noticed that the recording output format of the Windows Orbbec Viewer is still a ROS1 bag, and those recordings show no frame drops when I record on the same machine.

This would, for the time being, be a valid alternative for my use case, but unfortunately there is no option to record the image streams and IMU data at the same time. I can enable and stream both, but there is no option to add the IMU data to the bag.

Is it planned to add this option to the standalone viewer?

bmegli commented 1 month ago

@marcomasa

Just some hints, not all of which may apply to your case.

If you look at an example sensor workflow, say the Femto Bolt:

Data from sensor

Native data at higher resolutions is MJPEG.

ROS driver

This data is decompressed by the ROS driver.

If you are lucky (Nvidia or Rockchip hardware), you may try using a hardware decoder.

ROS driver workflow

If you try to record compressed data in ROS through image_transport (e.g. the compressed transport), you end up with the workflow:

  • compressed data from sensor
  • decompressed by Orbbec SDK ROS (+ maybe color conversion)
  • compressed again by ROS image_transport

unless care has been taken by the ROS driver writer.

On the other hand, if you try to record in uncompressed format, you transmit and store every raw frame.

What might be done

You might shortcut the ROS driver to publish MJPEG data directly (this is like the compressed image_transport in ROS).

You might also try to decouple compression from the driver with an image_transport republish node.
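A roslaunch fragment sketching that decoupling; the topic names here are assumptions and must be adapted to the actual camera namespace:

```xml
<launch>
  <!-- Republish the raw color stream through image_transport so that
       compression runs in a separate process from the camera driver.
       Topic names are placeholders; adapt them to your camera namespace. -->
  <node pkg="image_transport" type="republish" name="color_republish"
        args="raw in:=/camera/color/image_raw compressed out:=/camera/color/image_repub" />
</launch>
```

You would then record /camera/color/image_repub/compressed instead of the raw topic, keeping the compression load out of the driver process.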

Finally, using nodelets instead of nodes may eliminate some unnecessary data transmission, since nodelets in one manager pass messages by pointer instead of serializing them between processes.

At higher resolutions those are really huge amounts of data

Higher resolutions in video processing are generally not suitable for uncompressed transmission or storage. They may also be unsuitable for software processing at high frame rates.

Unless you are careful, it is very easy to end up with a bottleneck in the workflow.

marcomasa commented 1 month ago

Hi @bmegli !

First of all, thank you so much for your detailed input!


> If you are lucky (Nvidia or Rockchip) you may try using hardware decoder

Regarding the HW decoding flags: unfortunately I do not run the camera on a Jetson or Rockchip board, so I cannot use them. On the main system I even run in a WSL2 environment (see also #48), so I might additionally lack drivers. However, I also tested on a native Linux machine and still saw very high CPU usage from the Linux driver. I might add system info and running specs later.


> ROS driver workflow
>
> If you try to record in ROS compressed data through image_transport (like compressed) you end up with workflow
>
> • compressed data from sensor
> • decompressed by Orbbec SDK ROS (+ maybe color conversion)
> • compressed again by ROS image transport

I am actually recording uncompressed data at the moment. And yes, you are right, the amounts of data are huge, but enabling compression brought the actual framerate down even further (somewhat as expected).


> What might be done
>
> You might shortcut ROS driver to publish MJPEG data directly (this is like compressed image_transport) in ROS.
>
> • after checking with 2-3 lines of code how much time decompression/compression takes, it may not be worth the effort

Do you have those lines / the timings available somewhere? :)
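In the meantime, a portable stand-in for such a measurement. The real pipeline uses a JPEG codec (e.g. via OpenCV), which is an assumption not covered here; zlib from the Python standard library is used instead so the sketch runs anywhere, and the frame is synthetic:

```python
# Stand-in measurement of per-frame (de)compression cost, using zlib
# instead of the JPEG codec the actual driver pipeline would use.
import time
import zlib

WIDTH, HEIGHT = 1280, 720
# Synthetic RGB frame: a horizontal gradient, so it compresses somewhat.
row = bytes((x * 3 + c) % 256 for x in range(WIDTH) for c in range(3))
frame = row * HEIGHT  # WIDTH * HEIGHT * 3 bytes, like one raw 720p frame

def timed(fn, data):
    """Run fn(data), returning (result, elapsed milliseconds)."""
    t0 = time.perf_counter()
    out = fn(data)
    return out, (time.perf_counter() - t0) * 1000.0

compressed, t_comp = timed(zlib.compress, frame)
restored, t_decomp = timed(zlib.decompress, compressed)
assert restored == frame  # lossless round trip

budget_ms = 1000.0 / 30  # ~33.3 ms per frame at 30 fps
print(f"compress:      {t_comp:.1f} ms/frame")
print(f"decompress:    {t_decomp:.1f} ms/frame")
print(f"30 fps budget: {budget_ms:.1f} ms/frame")
```

If a single (de)compression step already eats a large share of the 33 ms budget, doing it twice per frame (decode in the driver, re-encode for recording) explains the dropped frames.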


> Finally using nodelets instead of nodes may eliminate some unnecessary data transmission

I will try setting up launch files for that. I think Orbbec already provides the nodelets in general, but I have not seen any launch configurations in the repo.
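Such a launch file could look roughly like the sketch below. The nodelet class name `orbbec_camera/OBCameraNodelet` is an assumption; check the nodelet plugin XML in the repository for the name the package actually registers:

```xml
<launch>
  <!-- Nodelet manager hosting the camera driver (and, ideally, any
       consumers), so images are passed by pointer inside one process
       instead of being serialized over TCP. -->
  <node pkg="nodelet" type="nodelet" name="camera_manager"
        args="manager" output="screen" />

  <!-- Load the camera driver as a nodelet into the manager.
       The class name below is an assumption, not verified. -->
  <node pkg="nodelet" type="nodelet" name="orbbec_camera"
        args="load orbbec_camera/OBCameraNodelet camera_manager"
        output="screen" />
</launch>
```

Any image consumers loaded into the same `camera_manager` would then receive frames zero-copy.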