Added a duration parameter to convert_video_to_gxf_entities.py to convert only the first N seconds

dleshchev commented 1 month ago

Added a duration parameter to the convert_video_to_gxf_entities.py utility so that it parses only the first N seconds of the downloaded video. The parameter can save time when building new examples/applications: one no longer has to look for a short video. The default is to process the full video, so existing app builds are unchanged.
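
As a rough illustration of the idea (not the actual code in this PR), limiting the conversion to the first N seconds of a raw video stream amounts to reading a fixed number of frames. The sketch below assumes raw frames of width x height x channels bytes arriving at a constant frame rate; the helper names frames_for_duration and read_frames are hypothetical and do not come from the script.

```python
import io
import math

def frames_for_duration(duration_s, framerate):
    """Number of frames to keep; None means 'process the full video'."""
    if duration_s is None:
        return None
    return math.ceil(duration_s * framerate)

def read_frames(stream, width, height, channels, max_frames=None):
    """Yield raw frames from a binary stream, stopping after max_frames."""
    frame_size = width * height * channels
    count = 0
    while max_frames is None or count < max_frames:
        frame = stream.read(frame_size)
        if len(frame) < frame_size:  # end of stream or truncated last frame
            break
        yield frame
        count += 1

# Hypothetical demo: 10 tiny 4x2 RGB frames, keep the first 2 seconds at 3 fps.
demo_stream = io.BytesIO(bytes(4 * 2 * 3 * 10))
limit = frames_for_duration(2, 3)
kept = list(read_frames(demo_stream, 4, 2, 3, max_frames=limit))
print(len(kept))
```

With the default of None for the duration, the loop runs until the stream ends, which matches the "process the full video" default described above.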

gigony commented 1 month ago

Thanks @dleshchev for the suggestion!

The convert_video_to_gxf_entities.py script is intended to be used with external tools such as ffmpeg.

Please see the description in the Holoscan SDK and the code search results in the Holohub repo (image below).

Since ffmpeg provides a way to set the range of the video to decode (please see the information below that I excerpted from ChatGPT's query result, which can be copy-pasted into the user guide), wouldn't it be better to improve the documentation on using ffmpeg rather than adding a --duration parameter to convert_video_to_gxf_entities.py?

What do you think?

To set the start and end times (or frames) in an ffmpeg command, you can use the -ss option to specify the start time and the -t or -to option to specify the duration or end time.

Specify Start Time: Use the -ss option to set the start time.
Specify End Time or Duration: Use the -t option to set the duration or the -to option to set the end time.

Example Command

Using Start Time and Duration

ffmpeg -ss 00:00:10 -i video_1920x1080.avi -t 00:00:20 -pix_fmt rgb24 -f rawvideo pipe:1 | python3 scripts/convert_video_to_gxf_entities.py --width 1920 --height 1080 --channels 3 --framerate 30

This command starts processing the video at 10 seconds (-ss 00:00:10) and processes for 20 seconds (-t 00:00:20), converting the specified portion of the video to raw video format and piping it to the Python script.

Using Start Time and End Time

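ffmpeg -ss 00:00:10 -to 00:00:30 -i video_1920x1080.avi -pix_fmt rgb24 -f rawvideo pipe:1 | python3 scripts/convert_video_to_gxf_entities.py --width 1920 --height 1080 --channels 3 --framerate 30

This command starts processing the video at 10 seconds (-ss 00:00:10) and stops at 30 seconds (-to 00:00:30), converting the specified portion of the video to raw video format and piping it to the Python script. Note that -to is placed before -i here so that it applies to the input timestamps. When -ss is given before -i and -to after it, the output timestamps restart at zero, so -to 00:00:30 would produce 30 seconds of video (ending at the 40-second mark of the source) instead of stopping at 30 seconds.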

Explanation

-ss: Sets the start time of the input video. The format is HH:MM:SS or in seconds.
-t: Sets the duration of the output video. The format is HH:MM:SS or in seconds.
-to: Sets the end position at which ffmpeg stops reading the input or writing the output, depending on where the option is placed. The format is HH:MM:SS or in seconds.
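
For scripted use, the same options can be assembled programmatically. The following is a minimal sketch, assuming the rgb24 rawvideo pipeline from the example commands above; the function name ffmpeg_rawvideo_args is hypothetical and not part of ffmpeg, Holoscan, or Holohub.

```python
import shlex

def ffmpeg_rawvideo_args(input_path, start=None, duration=None):
    """Build an ffmpeg argument list that decodes a time range to raw rgb24."""
    args = ["ffmpeg"]
    if start is not None:
        args += ["-ss", str(start)]      # input option: seek before decoding
    args += ["-i", input_path]
    if duration is not None:
        args += ["-t", str(duration)]    # output option: limit the duration
    args += ["-pix_fmt", "rgb24", "-f", "rawvideo", "pipe:1"]
    return args

cmd = ffmpeg_rawvideo_args("video_1920x1080.avi", start="00:00:10", duration="00:00:20")
print(shlex.join(cmd))
```

The printed command matches the "Using Start Time and Duration" example, and the list form can be passed to subprocess.Popen with its stdout connected to the conversion script.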

dleshchev commented 4 weeks ago

@gigony @tbirdso Thanks a lot for looking into this. @gigony, you are right: ffmpeg is already capable of doing the right thing. I tried it and it works smoothly. I think the best place to update the documentation is the holoscan repo, so I will submit a PR there with the corresponding updates.

nvidia-holoscan / holohub