ShareGPT4Omni / ShareGPT4Video

[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
https://sharegpt4video.github.io/

Data Preprocessing #36

Open exiawsh opened 3 months ago

exiawsh commented 3 months ago

Hi, thanks for your great work! I noticed that some raw videos in your Hugging Face dataset are longer than the timestamps you record in the JSON file. For example, an Ego4D video may last 60 seconds, but the timestamps only cover 12 seconds of captions. Do we need to clip the videos based on the timestamps when training the model?
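If clipping does turn out to be needed, I imagine it would look roughly like the sketch below. This is only an illustration under my own assumptions: the timestamps have already been converted to seconds, `ffmpeg` is installed and on the PATH, and the helper names `build_clip_command` and `clip_video` are hypothetical and not part of this repo. It does not assume anything about the actual JSON schema.

```python
import subprocess


def build_clip_command(src, dst, start_s, end_s):
    """Build an ffmpeg command that keeps only the [start_s, end_s] span of src.

    Hypothetical helper; start_s and end_s are assumed to be in seconds.
    """
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    return [
        "ffmpeg", "-y",                    # overwrite dst if it already exists
        "-ss", f"{start_s:.3f}",           # seek to the start of the captioned span
        "-i", src,
        "-t", f"{end_s - start_s:.3f}",    # keep only the captioned duration
        "-c", "copy",                      # stream copy: fast, but cuts snap to keyframes
        dst,
    ]


def clip_video(src, dst, start_s, end_s):
    """Run ffmpeg to write the clipped video; raises if ffmpeg fails."""
    subprocess.run(build_clip_command(src, dst, start_s, end_s), check=True)
```

For the example above, that would be something like `clip_video("raw.mp4", "clip.mp4", 0.0, 12.0)`. Stream copy avoids re-encoding but is not frame-accurate; dropping `-c copy` would re-encode and give more precise cuts at a higher cost.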
