As the title says: added the Google Speech Commands dataset, with some design choices for preprocessing as described in the next section.
Proposed Changes
Added speech commands dataset
Dataset partitioning into train / test uses testing_list.txt from the dataset download
validation_list.txt is still provided so users can use a sampler in the DataLoader if a train/valid/test split is desired
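The train/test partitioning described above could be sketched as follows. This is a minimal illustration, not the actual implementation; the helper name `partition` and the toy paths are hypothetical.

```python
# Hypothetical sketch: split the full list of audio paths into train/test
# using the file list shipped as testing_list.txt in the dataset download.
def partition(all_paths, testing_list):
    test_set = set(testing_list)
    train = [p for p in all_paths if p not in test_set]
    test = [p for p in all_paths if p in test_set]
    return train, test

# Toy example (not real dataset files):
paths = ["yes/a.wav", "no/b.wav", "yes/c.wav"]
train, test = partition(paths, ["yes/c.wav"])
```

The same mechanism would apply to validation_list.txt if a three-way split is wanted.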
Using the include keyword to define a custom subset of the dataset. All other words in the dataset are marked as unknown (not sure if this adheres to the original split, but I couldn't find more information about this)
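The include behavior amounts to a catch-all label mapping; a minimal sketch (the helper name `label_for` is hypothetical, not part of the actual code):

```python
# Hypothetical sketch of the `include` behavior: any word not in the
# requested subset is collapsed into the catch-all "unknown" label.
def label_for(word, include):
    return word if word in include else "unknown"

include = ["yes", "no"]
```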
Using the silence keyword to include samples from the _background_noise_ folder
Discussion
silence: currently the Dataset simply points to samples in the _background_noise_ folder. These samples are 1 min long, whereas the speech commands are 1 sec long. My current workaround is to use RandomCrop with 16,000 samples, which handles this nicely. I don't think it would be efficient to chop up and store 1-second clips of the silence files.
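The RandomCrop workaround could look roughly like this: take a random 16,000-sample window (1 second at 16 kHz) from the much longer background-noise clip. A sketch with plain Python lists; the function name `random_crop` is illustrative, not the actual transform.

```python
import random

# Sketch of the RandomCrop workaround: pick a random 16,000-sample
# (1 second at 16 kHz) window from a longer background-noise waveform.
def random_crop(waveform, crop_len=16000):
    if len(waveform) <= crop_len:
        return waveform  # clip is already short enough
    start = random.randrange(len(waveform) - crop_len + 1)
    return waveform[start:start + crop_len]

# A 1-minute clip at 16 kHz has 960,000 samples.
minute_clip = [0.0] * 960000
one_second = random_crop(minute_clip)
```

This avoids materializing many 1-second files on disk: each epoch simply sees a different random slice of each noise clip.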