guyyariv / TempoTokens

This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
https://pages.cs.huji.ac.il/adiyoss-lab/TempoTokens/
MIT License

Filtered VGGSound file list? #2

Closed miraodasilva closed 11 months ago

miraodasilva commented 11 months ago

[image]

Hi,

First of all, I really like the paper :) I think it presents an elegant solution and impressive results. As seen above, the paper mentions a filtered version of VGGSound, which seems like a really good idea since the dataset is usually quite noisy. I was wondering if you could share the file list for this filtered version (as a .csv, for example)? I looked around in the repo, and the only list I could find was datasets/vggsound.csv, which seems to contain (almost?) the whole dataset rather than the filtered version. Any help would be greatly appreciated. Thank you.

guyyariv commented 11 months ago

Hello, we appreciate your comment! :) The "constants/unmatch_videos.pkl" pickle file contains a list of YouTube IDs where the audio doesn't match the video. Please note that these YouTube IDs come only from a subset of VGGSound's training set. Filtering out these videos gives you the training set that we used.
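
For anyone who wants the filtered list as a .csv, here is a minimal sketch of the filtering step described above. It rests on assumptions that are not confirmed in this thread: that the pickle unpickles to a plain iterable of YouTube ID strings, and that the YouTube ID is the first column of datasets/vggsound.csv. The function names and the output path are hypothetical, so adjust them to whatever the files actually contain.

```python
import csv
import pickle


def load_unmatched_ids(pkl_path):
    """Load the YouTube IDs to exclude.

    Assumption: the pickle holds an iterable (list/set) of ID strings.
    """
    with open(pkl_path, "rb") as f:
        return set(pickle.load(f))


def filter_rows(rows, unmatched_ids, id_column=0):
    """Keep only non-empty rows whose YouTube ID is not in unmatched_ids.

    Assumption: the YouTube ID sits in column `id_column` (0 by default).
    """
    return [row for row in rows if row and row[id_column] not in unmatched_ids]


def write_filtered_csv(src_csv, pkl_path, dst_csv, id_column=0):
    """Write a copy of src_csv without the unmatched videos.

    Returns (rows_read, rows_kept) so the effect of the filter is visible.
    """
    unmatched = load_unmatched_ids(pkl_path)
    with open(src_csv, newline="") as f:
        rows = list(csv.reader(f))
    kept = filter_rows(rows, unmatched, id_column)
    with open(dst_csv, "w", newline="") as f:
        csv.writer(f).writerows(kept)
    return len(rows), len(kept)


if __name__ == "__main__":
    # Paths as mentioned in this thread; the output name is made up.
    total, kept = write_filtered_csv(
        "datasets/vggsound.csv",
        "constants/unmatch_videos.pkl",
        "vggsound_filtered.csv",
    )
    print(f"kept {kept} of {total} rows")
```

Since the IDs in the pickle cover only a subset of the training set, the result should match the training set used in the paper only when applied to the training rows of the csv.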