blackadar / sonus

Machine Learning voice classification using scikit-learn RandomForest model.

Data Collection #2

Closed: blackadar closed 4 years ago

blackadar commented 4 years ago

We need labeled data: audio recordings to use for training and testing.

We need to decide on a standard file format and sampling rate for these files.
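
One way to enforce whatever standard gets chosen is a small check run on every recording before it goes into the training set. The sketch below is only an illustration: the target values (16 kHz, mono, 16-bit PCM WAV) and the function name `matches_target_format` are assumptions, not something decided in this thread. It uses only Python's standard-library `wave` module, so it handles uncompressed WAV files only.

```python
import wave

# Assumed target format; these values are placeholders until a standard is agreed on.
TARGET_RATE_HZ = 16000          # samples per second
TARGET_CHANNELS = 1             # mono
TARGET_SAMPLE_WIDTH_BYTES = 2   # 16-bit PCM


def matches_target_format(path):
    """Return True if the WAV file at `path` matches the assumed target format."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == TARGET_RATE_HZ
            and wav.getnchannels() == TARGET_CHANNELS
            and wav.getsampwidth() == TARGET_SAMPLE_WIDTH_BYTES
        )
```

Files that fail the check could be resampled or rejected before feature extraction, so the model always sees consistently formatted input.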

Crichmond21 commented 4 years ago

I sent a request for access to the VoxCeleb dataset. There aren't many publicly available datasets for audio classification, but this looks like a really good one. If we get it, we will need to either store the data only locally or make this repo private.