hche11 / Localizing-Visual-Sounds-the-Hard-Way

Localizing Visual Sounds the Hard Way
Apache License 2.0

Problem downloading the VGGSS dataset #10

Open Ehsan-Yaghoubi opened 2 years ago

Ehsan-Yaghoubi commented 2 years ago

Thanks for sharing the code.

Did you annotate the raw videos from YouTube, or the processed videos of the VGGSound dataset? I ask because the video names in the vggss.json file are not identical to those in the .csv files of the VGGSound dataset. If you processed the raw videos from YouTube, did you select the middle frame (as described in the paper) independently of the VGGSound dataset?

I am confused about how to find the middle frame. I would appreciate it if you could share the code for downloading the dataset.
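To make the question concrete, here is a minimal sketch of how I would guess the middle frame is located. It is not the authors' code. It assumes that each name in vggss.json has the form `<youtube_id>_<start_second>` with a zero-padded start time, and that each clip is 10 seconds long; both are my assumptions and may not match the actual annotation pipeline. The function names are hypothetical.

```python
# Sketch only: the naming scheme and the 10-second clip length are assumptions.

def split_clip_name(name):
    """Split '<youtube_id>_<start_second>' into (youtube_id, start_second).

    YouTube IDs may themselves contain '_' or '-', so split on the LAST underscore.
    """
    ytid, sep, start = name.rpartition("_")
    if not sep or not ytid or not start.isdigit():
        raise ValueError("unexpected clip name: %r" % name)
    return ytid, int(start)


def middle_timestamp(start_sec, clip_len=10.0):
    """Timestamp (in seconds, within the raw video) of the clip's midpoint."""
    return start_sec + clip_len / 2.0


def middle_frame_index(n_frames):
    """0-based index of the middle frame of an already extracted clip."""
    if n_frames <= 0:
        raise ValueError("clip has no frames")
    return n_frames // 2


if __name__ == "__main__":
    # Hypothetical clip name, used only for illustration.
    ytid, start = split_clip_name("a-b_C1d2E3f_000030")
    print(ytid, start, middle_timestamp(start))
```

If the annotations were instead made on the processed clips, `middle_frame_index` applied to the clip's frame count would be the relevant part, and the timestamp in the raw video would not matter.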