andrewowens / multisensory

Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
http://andrewowens.com/multisensory/
Apache License 2.0
220 stars, 61 forks

Issue on datasets #35

Open LindaCY opened 5 years ago

LindaCY commented 5 years ago

Hello, thanks for your great work. I have some questions that came up while reading your paper and implementing the work. In the paper, you say that you trained your model on "a dataset of approximately 750,000 videos sampled from AudioSet." As we know, AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos.

ruizewang commented 5 years ago

I have the same question and am following this issue.