Open aiPenguin opened 1 month ago
This is because you're re-encoding the downloaded videos when splitting them into clips. Try adding:
subsampling:
    ClippingSubsampler:
        args:
            precision: keyframe_adjusted
to the config, and it should be orders of magnitude faster.
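For intuition on why this helps, here is a rough sketch of the two ways a clip can be cut with ffmpeg. This is not video2dataset's actual code: the function names, the `libx264` encoder choice, and the file paths are assumptions for illustration only. The ffmpeg flags themselves (`-ss`, `-to`, `-c copy`, `-c:v`, `-an`) are standard.

```python
from typing import List


def reencode_clip_cmd(src: str, dst: str, start: float, end: float) -> List[str]:
    """Build an ffmpeg command that cuts [start, end] by re-encoding.

    Every frame is decoded and encoded again, which is the CPU-heavy path.
    The libx264 encoder is an assumed example, not taken from the thread.
    """
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}", "-to", f"{end:.3f}",
        "-i", src,
        "-c:v", "libx264",  # re-encode the video stream
        "-an",              # drop audio
        dst,
    ]


def stream_copy_clip_cmd(src: str, dst: str, start: float, end: float) -> List[str]:
    """Build an ffmpeg command that cuts [start, end] with stream copy.

    Packets are copied without decoding, so the cut can only begin on a
    keyframe and the clip boundaries are approximate, but it is far cheaper.
    """
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}", "-to", f"{end:.3f}",
        "-i", src,
        "-c", "copy",  # copy packets, no re-encoding
        "-an",         # drop audio
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical paths; pass either list to subprocess.run to execute it.
    print(" ".join(stream_copy_clip_cmd("video.mp4", "clip.mp4", 12.5, 20.0)))
```

The trade-off is accuracy: with stream copy the clip starts at a nearby keyframe rather than at the exact requested timestamp, which is presumably what the `keyframe_adjusted` setting accounts for.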
Hi,
Can anyone share their resource usage and settings for downloading the dataset?
We tried to download the 10M sub-dataset in 720p without audio, but it requires more than 15K CPU hours.
Is there anything wrong?
Thx.