Open c3m3gyanesh opened 1 year ago
Yes, it's not optimal, see https://github.com/Telecommunication-Telemedia-Assessment/bitstream_mode3_p1204_3/issues/25 and https://github.com/Telecommunication-Telemedia-Assessment/bitstream_mode3_p1204_3/issues/16#issue-722268021. A good optimization would be to stream the feature parsing output to line-delimited JSON files instead of one big array, and then parse that output step by step. Right now there are no resources to rework this part though.
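To illustrate the line-delimited JSON idea, here is a minimal Python sketch. It is not the tool's current code: the function names (`write_jsonl`, `read_jsonl`, `mean_field`) and the per-frame field used in the usage note are hypothetical. It only shows how writing one record per line lets the reader hold a single record in memory at a time.

```python
import json
from typing import Iterable, Iterator


def write_jsonl(records: Iterable[dict], path: str) -> None:
    """Write one JSON object per line instead of one large JSON array."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield records one at a time, so the whole file is never loaded at once."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def mean_field(path: str, field: str) -> float:
    """Compute the mean of a numeric field in a single streaming pass."""
    total = 0.0
    count = 0
    for record in read_jsonl(path):
        total += record[field]
        count += 1
    if count == 0:
        raise ValueError("no records found in " + path)
    return total / count
```

For example, `mean_field("frames.jsonl", "size")` would average a hypothetical per-frame `size` field without building a list of all frames first.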
Note that this model is intended for really short videos of 4 to 10 seconds in length. Anything longer should definitely be split up. I think a poor man's solution would be to split the video manually with ffmpeg beforehand and then call the tool on each file individually.
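mkdir -p "$output_directory"
# -c copy avoids re-encoding, so cuts land on keyframes and segments may not be exactly 10 s long
ffmpeg -i "$input_video" -c copy -f segment -segment_time 10 -reset_timestamps 1 "$output_directory/segment_%03d.mkv"
for segment_file in "$output_directory"/*.mkv; do
  # call the P.1204.3 bitstream model on "$segment_file" and store the result as JSON
  # ... then parse each JSON file individually
  echo "Processing $segment_file"
done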
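Once each segment has its own JSON report, the reports could be merged roughly as in the Python sketch below. This is only an illustration under assumptions. It assumes each report has a `per_second` list and a `per_sequence` number; those key names should be checked against the tool's actual output. The `combine_reports` helper is hypothetical. The mean of the per-segment scores is a naive approximation, not the value the model would produce if it were run on the whole video.

```python
import json
from pathlib import Path


def combine_reports(report_paths):
    """Merge per-segment reports into one summary, in file-name order."""
    per_second = []   # per-second scores, concatenated across segments
    per_segment = []  # one per-sequence score for each segment
    # Zero-padded names such as segment_000.json sort in playback order.
    for path in sorted(Path(p) for p in report_paths):
        with open(path, encoding="utf-8") as f:
            report = json.load(f)
        per_second.extend(report["per_second"])    # assumed key name
        per_segment.append(report["per_sequence"])  # assumed key name
    if not per_segment:
        raise ValueError("no reports given")
    return {
        "per_second": per_second,
        "per_segment": per_segment,
        # Naive average; not equivalent to running the model on the full video.
        "per_sequence_naive_mean": sum(per_segment) / len(per_segment),
    }
```

Calling `combine_reports(Path("reports").glob("*.json"))` would then return one dictionary covering all segments, assuming the reports are stored in a `reports` directory.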
Due to the extreme memory requirements of the bitstream parser, it is almost impossible to use this code on longer videos, e.g. 5 or 10 minutes. One solution is to split the videos into segments of 8 to 10 seconds. Is it possible to add a feature to this code that runs on a list of 10-second video files and generates a combined output, showing 'per second' for all frames and 'per sequence' for all split videos combined (treating them as a single video)?