/libs/data_generator.py file has a bug

Aditya-shahh commented 3 years ago

I'm working on language identification using SpeechVGG. There's a bug on line 149 of /libs/data_generator.py.

The line label_tmp = h5f['word_idx'][0] is incorrect because there is no key named 'word_idx'.

The correct line would be: label_tmp = h5f['class'][0]

Do check it once and let me know.

Thank you :)

MKegler commented 3 years ago

Indeed, that seems to be it, thanks for spotting that one! The data generator was originally designed to generate words for the classification task in pre-training (hence the 'word_idx' key).

Replacing all 'word_idx' with 'class' seems like a good idea since the generator should cover all the cases. I will start updating the code and, for now, add a note in 'speaker identification' example about the above.

@Aditya-shahh For now, please edit the part of data_generator for your use case while I edit the code.
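As a rough illustration of such a local edit, the label lookup could try 'class' first and fall back to 'word_idx'. This is only a sketch: read_label and its keys argument are hypothetical names, not part of the repository, and the plain dicts below stand in for an open h5py file, which supports the same `in` check and indexing.

```python
def read_label(h5f, keys=("class", "word_idx")):
    """Return the first label stored under the first key that exists.

    `h5f` can be any mapping-like object (for example an open h5py.File).
    The function name and the key order are assumptions for illustration.
    """
    for key in keys:
        if key in h5f:
            # Same access pattern as the original line: h5f[key][0]
            return h5f[key][0]
    raise KeyError(
        f"none of the label keys {keys} found; available: {list(h5f.keys())}"
    )


# Plain dicts used as stand-ins for HDF5 files in this sketch.
new_style = {"class": [3]}
old_style = {"word_idx": [7]}

label_new = read_label(new_style)
label_old = read_label(old_style)
```

In the generator, the original line would then become label_tmp = read_label(h5f), which keeps older files readable until everything uses the 'class' key.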

@bepierre any further thoughts on that?

TODO:

[x] Update speaker ID readme (temporary fix).
[x] Replace 'word_idx' with 'class' across the code.

Aditya-shahh commented 3 years ago

Sure @MKegler

Let me know if I can submit a pull request. Since I have already gone through the code, I can make the edit myself.

MKegler commented 3 years ago

Just pushed the updated code, but thanks for offering! Everything should work now with one data generator. Closing this issue now, but please do feel free to reopen if the problem persists.

bepierre / SpeechVGG

/libs/data_generator.py file has a bug #3