Jakobovski / free-spoken-digit-dataset

A free audio dataset of spoken digits. An audio version of MNIST.

zero wav files too small to be read in python #26

Open ksasso1028 opened 5 years ago

ksasso1028 commented 5 years ago

I would suggest making all of the recordings a standard 1 second long, so we have access to all 8000 samples. The recordings vary greatly in sample length.

Jakobovski commented 5 years ago

What do you mean by "so we have access to all 8000 samples"? How does the length of the recordings affect accessibility?


ksasso1028 commented 5 years ago

The data is sampled at 8 kHz, which means 8000 samples (values) per second. Many of these recordings are shorter than one full second, so they contain fewer than 8000 samples. For classification it is important to have at least one full second of samples (8000 values) per recording.
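To make the arithmetic concrete, here is a minimal sketch of how sample count, duration, and the shortfall from one second relate. The function names are hypothetical, and the 8000 Hz rate is the one stated above.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate stated in this thread


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE_HZ) -> float:
    """Length of a clip in seconds, given how many samples it contains."""
    return num_samples / sample_rate


def missing_samples(num_samples: int, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """How many samples a clip is short of one full second (0 if long enough)."""
    return max(0, sample_rate - num_samples)


# A hypothetical clip with 3000 samples lasts 0.375 s and is 5000 samples short.
print(duration_seconds(3000), missing_samples(3000))
```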

ksasso1028 commented 5 years ago

@Jakobovski

Raghav-Bansal commented 4 years ago

> I would suggest making all of the recordings a standard 1 second long, so we have access to all 8000 samples. The recordings vary greatly in sample length.

Could you please explain, step by step, how to make each recording one second long?

ksasso1028 commented 4 years ago

> > I would suggest making all of the recordings a standard 1 second long, so we have access to all 8000 samples. The recordings vary greatly in sample length.
>
> Could you please explain, step by step, how to make each recording one second long?

You can take a look at my repo, where I do something similar: https://github.com/ksasso1028/KCN-AUDIO-CLASSIFICATION/blob/master/test.py Essentially, you use librosa to zero-pad the sample to one second if it is too short.
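As a rough illustration of the zero-padding idea (not the code from the linked repo), here is a minimal sketch in plain Python that works on a list of already-decoded sample values. The function name `fix_to_one_second` is hypothetical, and cutting clips longer than one second down to 8000 samples is an assumption of this sketch, not something stated above.

```python
from typing import List, Sequence

SAMPLE_RATE_HZ = 8000  # sampling rate stated in this thread


def fix_to_one_second(samples: Sequence[float],
                      sample_rate: int = SAMPLE_RATE_HZ) -> List[float]:
    """Return exactly one second of audio: zero-pad short clips at the end.

    Assumption: clips longer than one second are truncated to the first
    `sample_rate` values.
    """
    target = sample_rate  # one second of audio = sample_rate values
    clipped = list(samples[:target])
    # Append silence (zeros) until the clip is exactly one second long.
    return clipped + [0.0] * (target - len(clipped))


short_clip = [0.1] * 3000  # hypothetical 0.375 s recording
padded = fix_to_one_second(short_clip)
print(len(padded))
```

With NumPy arrays loaded through librosa, `librosa.util.fix_length(y, size=8000)` should give a comparable result, since it pads with zeros or trims to the requested size.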

Kevin