Liang-Qiu closed 7 years ago
Yes, good catch, that is a bug in vggish_train_demo.py, also spotted on the mailing list https://groups.google.com/forum/#!topic/audioset-users/akY6p378zyA
The bug isn't critical since the demo itself is intended to show how to train the model, and we don't particularly care about the inputs or outputs.
If you'd like to submit a pull request to fix this, we'll be happy to accept :) Else, we'll get around to fixing this sometime soon.
@plakal, thanks for your response. Glad to hear that you have noticed this.
Another issue for me: since I want to train my own classifier for 11 classes chosen from your ontology, is there a convenient way to download the YouTube audio as .wav files? I want to tune the parameters in the feature extractor, so I may not use your released features directly. You also released 3 .csv files, which include the YouTube video IDs you used, so did you provide a way to download the raw data?
Thanks a lot.
If you have other issues beyond features and bugs, let's discuss them on the audioset-users mailing list rather than here.
There was already a thread earlier this year about this topic https://groups.google.com/forum/#!topic/audioset-users/DbyaYxzwnDU.
@plakal Got it. Thank you! I will close this.
I'd like to leave this open to track the bug in the training demo.
Hi @plakal and @dpwe, for the function _get_examples_batch() in vggish_train_demo.py (line 117), do you think it should use np.concatenate() instead of adding them up as tuples?
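To make the distinction in the question concrete, here is a minimal sketch of how `+` and `np.concatenate()` behave on NumPy arrays. The array names and shapes are hypothetical stand-ins, not copied from vggish_train_demo.py, so treat this as an illustration of the general behavior rather than a patch for the demo.

```python
import numpy as np

# Hypothetical stand-ins for three groups of examples. The names and
# shapes are illustrative only and are not taken from the demo script.
group_a = np.ones((2, 4))
group_b = np.full((2, 4), 2.0)
group_c = np.full((2, 4), 3.0)

# On NumPy arrays, `+` is element-wise addition: the values are summed
# and the number of rows does not grow.
summed = group_a + group_b + group_c

# np.concatenate() joins the arrays along axis 0 (the default), so the
# rows of all three groups are kept as separate examples.
stacked = np.concatenate((group_a, group_b, group_c))

print(summed.shape)   # (2, 4)
print(stacked.shape)  # (6, 4)
```

Note that `+` only concatenates when the operands are plain Python lists or tuples; once they are NumPy arrays, it sums them, which is why `np.concatenate()` is the safer choice for building a batch.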