How to decode h264 encoded images

umdlife / psdk_ros2

This repository is a ROS 2 wrapper for the DJI PSDK libraries.

https://umdlife.github.io/psdk_ros2/documentation/Introduction.html

Mozilla Public License 2.0

How to decode h264 encoded images #114

Open michele-colombo opened 3 days ago

michele-colombo commented 3 days ago

Hi, I see you added support for publishing h264-encoded images from a streaming camera of the drone. From what I see, by setting decoded_output to 0 when calling the camera setup streaming service, you should obtain a topic of type sensor_msgs/Image where the field encoding is "h264". My question is: how do I decode this stream of messages in a node subscribed to the topic in order to get raw images back? In particular, I'm interested in obtaining an OpenCV image for each published frame. h264 does not seem to be a standard encoding for sensor_msgs/Image, so I would be really glad if you could provide me with some reference or example code on the intended usage. Thank you! Michele

vicmassy commented 3 days ago

Hey @michele-colombo, you can check out DJI's official sample code, where they demonstrate how to decode a frame. From there you can pass it to a cv::Mat pretty easily.

Hope it helps :smile:
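For readers who want a starting point in Python, here is a minimal sketch of the decoding step. It is not the DJI sample itself: it assumes the PyAV package (`av`, Python bindings for FFmpeg) and NumPy are installed, and that the `data` field of each message carries a chunk of an H.264 Annex B byte stream. The class name `H264FrameDecoder` is made up for illustration.

```python
class H264FrameDecoder:
    """Turns chunks of an H.264 byte stream into BGR images.

    Hypothetical helper, not part of psdk_ros2 or the DJI PSDK.
    """

    def __init__(self, codec_context=None):
        if codec_context is None:
            # Imported lazily so the class can be exercised without PyAV.
            import av
            # "r" opens the codec for reading, i.e. as a decoder (CPU).
            codec_context = av.CodecContext.create("h264", "r")
        self._ctx = codec_context

    def decode(self, data):
        """Feed one chunk of bytes; return a list of decoded images.

        The list may be empty: the parser buffers input until it has a
        complete packet, and the decoder needs a keyframe before it can
        output anything.
        """
        images = []
        # parse() splits the raw bytes into complete packets.
        for packet in self._ctx.parse(data):
            # decode() returns zero or more frames per packet.
            for frame in self._ctx.decode(packet):
                # bgr24 is the channel order OpenCV expects; with PyAV the
                # result is an (height, width, 3) uint8 NumPy array.
                images.append(frame.to_ndarray(format="bgr24"))
        return images
```

In an rclpy subscriber callback this would be used roughly as `for img in decoder.decode(bytes(msg.data)): ...`, after checking that `msg.encoding == "h264"`. Keep one decoder instance per stream, since H.264 decoding depends on previously received frames.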

michele-colombo commented 3 days ago

Thank you so much! I will look into it. The aim, of course, is to perform the decoding with hardware acceleration, if available. I see there is a related issue on this topic: https://github.com/umdlife/psdk_ros2/issues/8. Do you have plans to implement it directly in the wrapper?

vicmassy commented 3 days ago

So natively DJI provides you with the hardware-encoded image from the payload camera (e.g. the H20). From there you can decode it with whatever resources you have. For instance, if you have an embedded computer with a GPU, you could try to decode it on that. Otherwise you can simply request the already-decoded stream: it will be decoded on the CPU, just as DJI shows in the sample, and published on a ROS 2 topic as raw images.

As of now, decoding on the GPU from the wrapper itself is not planned, if that's what you meant. But we welcome contributions if you would like to work on this.
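One way to experiment with "any resource you may have" is to try hardware-backed decoders first and fall back to the software one. The sketch below only shows that selection logic. The decoder names are FFmpeg decoder names (`h264_cuvid` for NVIDIA, `h264_v4l2m2m` for V4L2 devices) and exist only if the local FFmpeg build includes them. Whether PyAV can open them on a given board, and which pixel format they output, is an assumption to verify. The function names are hypothetical.

```python
# Candidate decoder names, most preferred first. "h264" is FFmpeg's
# software decoder and serves as the fallback.
CANDIDATES = ("h264_cuvid", "h264_v4l2m2m", "h264")


def pyav_factory(name):
    """Open a decoder context by name with PyAV (assumed installed)."""
    import av
    return av.CodecContext.create(name, "r")


def open_first_available(names, factory):
    """Return (name, context) for the first decoder that can be opened."""
    errors = []
    for name in names:
        try:
            return name, factory(name)
        except Exception as exc:
            # An unknown or unusable codec is reported as an exception;
            # remember the reason and try the next candidate.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no H.264 decoder available: " + "; ".join(errors))
```

Calling `open_first_available(CANDIDATES, pyav_factory)` returns the name that was actually opened, which is worth logging so you know whether the stream is being decoded in hardware or on the CPU.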

michele-colombo commented 3 days ago

Thanks again. Yeah, I tried requesting the decoded stream, but this immediately brings the CPU to almost full load. If I produce some working code in the next few weeks, I'll get back to you.