HPI-DeepLearning / crnn-lid

Code for the paper Language Identification Using Deep Convolutional Recurrent Neural Networks
GNU General Public License v3.0

Even data distribution #10

ibro45 closed this issue 5 years ago

ibro45 commented 5 years ago

Hi,

Just a quick question - are the YouTube audio samples evenly distributed between languages at some step? If they are, could you please tell me which script does it? If they aren't, could you explain why? Sorry, I've just read in the paper that they are.

I'd also like to thank you for writing the paper in a really comprehensive way!

Bartzi commented 5 years ago

Yes, the audio files are evenly distributed. This function, which is used for instance by this script, distributes the audio files.
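For readers who cannot follow the links above, here is a minimal sketch of what evenly distributing audio files between languages could look like. It is not the repository's function: the name `balance_by_language`, the dictionary input, and the choice to downsample every language to the size of the smallest one are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def balance_by_language(
    files_by_language: Dict[str, List[str]], seed: int = 0
) -> Dict[str, List[str]]:
    """Hypothetical helper: keep the same number of files for every language.

    Assumption: balancing is done by randomly downsampling each language
    to the size of the smallest one. The actual repository code may differ.
    """
    if not files_by_language:
        return {}

    # A seeded generator keeps the selection reproducible between runs.
    rng = random.Random(seed)

    # The smallest language determines how many files every language keeps.
    target = min(len(files) for files in files_by_language.values())

    return {
        language: rng.sample(files, target)
        for language, files in files_by_language.items()
    }


# Usage with made-up file names: every language ends up with 3 files.
example = {
    "english": [f"en_{i}.wav" for i in range(5)],
    "german": [f"de_{i}.wav" for i in range(3)],
    "french": [f"fr_{i}.wav" for i in range(4)],
}
balanced = balance_by_language(example)
```

Downsampling discards data from the larger languages. It is the simplest way to get equal class sizes, which is why it is used in this sketch.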

ibro45 commented 5 years ago

Perfect, thank you very much!