snap-research / Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
https://snap-research.github.io/Panda-70M/

Are there any suggestions to accelerate the process of splitting and captioning? #34

Open Kyfafyd opened 3 months ago

Kyfafyd commented 3 months ago

As the title says. Since there is a large number of videos to process, is there a way to accelerate this, such as multiprocessing?

tsaishien-chen commented 3 months ago

Hi @Kyfafyd, thanks for your interest in our method! To accelerate the captioning, you can increase the batch size. To accelerate the splitting, you can separate your videos into different groups and launch multiple parallel processes, either with background commands or with Python's built-in multiprocessing package. Hope this helps!
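For readers who want a starting point, the grouping idea above could be sketched roughly as follows. This is not the authors' code: the function names (`make_groups`, `process_group`, `split_in_parallel`) and the file names are hypothetical, and the `echo` command is only a placeholder (assuming a Unix-like system) that you would replace with your own splitting command.

```python
import subprocess
from multiprocessing import Pool


def make_groups(video_paths, num_groups):
    """Distribute the paths round-robin into num_groups lists."""
    return [video_paths[i::num_groups] for i in range(num_groups)]


def process_group(group):
    """Handle one group of videos sequentially inside a single process."""
    done = []
    for path in group:
        # Placeholder command (assumption): swap in your real splitting call.
        subprocess.run(["echo", "splitting", path], check=True)
        done.append(path)
    return done


def split_in_parallel(video_paths, num_procs):
    """Split the videos into groups and process each group in its own process."""
    groups = [g for g in make_groups(video_paths, num_procs) if g]
    if not groups:
        return []
    with Pool(processes=len(groups)) as pool:
        results = pool.map(process_group, groups)
    # Flatten the per-group results into one list of processed paths.
    return [path for group in results for path in group]


if __name__ == "__main__":
    videos = [f"video_{i}.mp4" for i in range(8)]  # hypothetical file names
    print(split_in_parallel(videos, num_procs=4))
```

Round-robin grouping is used here only because it is simple. If your videos vary a lot in length, handing them to the pool one at a time may balance the load better.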

Kyfafyd commented 3 months ago

Hi @tsaishien-chen, thanks for your reply! May I ask what exact approach you used for splitting the videos, and roughly how long it takes to obtain 2M sub-videos?

Could you please provide your code examples?

tsaishien-chen commented 3 months ago

Hi @Kyfafyd, sorry, but we didn't keep the original source code. The computation time depends on the number and specs of your machines. I suggest checking the CPU usage while splitting the videos and making sure you launch enough parallel processes to use all of your CPU cores.
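As a rough illustration of the sizing advice above, one way to pick a process count and check whether the CPU is already busy is sketched below. The helper names (`suggest_num_procs`, `cpu_looks_saturated`) and the default values are assumptions for illustration only. `os.getloadavg()` is available on Unix-like systems only, so the sketch falls back to `False` elsewhere.

```python
import os


def suggest_num_procs(threads_per_proc=1, reserve=1):
    """Suggest a process count from the number of logical CPUs.

    threads_per_proc: threads each worker is expected to use (assumption).
    reserve: cores left free for the OS and other jobs (assumption).
    """
    total = os.cpu_count() or 1
    usable = max(1, total - reserve)
    return max(1, usable // threads_per_proc)


def cpu_looks_saturated(threshold=0.9):
    """Return True if the 1-minute load average per core exceeds threshold."""
    try:
        one_minute_load = os.getloadavg()[0]
    except (AttributeError, OSError):
        # Load average is not available on this platform.
        return False
    return one_minute_load / (os.cpu_count() or 1) >= threshold


if __name__ == "__main__":
    print("suggested processes:", suggest_num_procs())
    print("CPU saturated:", cpu_looks_saturated())
```

If the load stays well below the core count while splitting, launching more processes is likely to help. If it is already at or above the core count, adding processes will mostly add contention.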