bepierre / SpeechVGG

Feature extractor for DL speech processing.
GNU General Public License v3.0
65 stars, 13 forks

Bug in test_TIMIT.py #4

Closed Aditya-shahh closed 3 years ago

Aditya-shahh commented 3 years ago

Hello,

In examples/speaker_identification/test_TIMIT.py at line 84, we have:

data_tmp[:,:,0] = log_standardize(data_tmp[:,:,0])

Before log standardization is applied, the signal needs to be padded, as is done in data_generator.py at line 138.

So, before line 84 in test_TIMIT.py, please add: data_tmp = pad_spec(data_tmp)
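To make the proposed ordering concrete, here is a minimal sketch of "pad first, then log-standardize". The repository's own pad_spec and log_standardize are not shown in this thread, so `pad_to_min_frames` and `log_standardize_sketch` below are hypothetical stand-ins. The array layout (frequency bins, frames, channels), the zero padding, the minimum of 100 frames, and the mean/std values are all assumptions for illustration only.

```python
import numpy as np

def pad_to_min_frames(spec, min_frames):
    """Zero-pad the time axis (axis 1) so spec has at least min_frames frames.

    Hypothetical stand-in for the repository's pad_spec.
    """
    missing = min_frames - spec.shape[1]
    if missing <= 0:
        return spec  # already long enough, nothing to pad
    # Pad only at the end of axis 1; leave every other axis untouched.
    pad_width = [(0, 0), (0, missing)] + [(0, 0)] * (spec.ndim - 2)
    return np.pad(spec, pad_width, mode="constant")

def log_standardize_sketch(x, mean, std, eps=1e-8):
    """Log-compress magnitudes, then apply (x - mean) / std.

    Hypothetical stand-in for the repository's log_standardize.
    eps keeps log() finite on the zero-padded frames.
    """
    return (np.log(x + eps) - mean) / std

# Toy magnitude spectrogram: 128 frequency bins, 60 frames, 1 channel.
spec = np.abs(np.random.default_rng(0).normal(size=(128, 60, 1)))

# Pad first, then standardize the single channel in place.
padded = pad_to_min_frames(spec, min_frames=100)
padded[:, :, 0] = log_standardize_sketch(padded[:, :, 0], mean=-1.0, std=2.0)
```

If every clip already has at least the minimum number of frames, the padding step returns the input unchanged, so adding it costs nothing in that case.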

Let me know if I have identified this correctly.

Thank you

whmrtm commented 3 years ago

I also found that the mean-variance normalization does not seem very rigorous. The usual approach is to compute (x - mean)/std. In this repository, the variance appears to be estimated rather than calculated exactly.
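For reference, the (x - mean)/std form can be sketched as below. The comment does not say how the repository estimates its statistics, so the "estimated" variant here (statistics taken from the first 100 rows only) is purely an assumed illustration of how estimated and exact statistics can differ. The feature array is synthetic.

```python
import numpy as np

def standardize(x, mean, std):
    """Mean-variance normalization: (x - mean) / std."""
    return (x - mean) / std

rng = np.random.default_rng(0)
features = rng.normal(loc=3.0, scale=2.0, size=(1000, 40))  # synthetic data

# Exact statistics over the full array.
exact = standardize(features, features.mean(), features.std())

# Assumed illustration of "estimated" statistics: use only a subset.
subset = features[:100]
approx = standardize(features, subset.mean(), subset.std())

# With exact statistics the result has mean ~0 and std ~1 (up to rounding);
# with subset statistics a small residual offset and scale usually remain.
print("exact :", exact.mean(), exact.std())
print("approx:", approx.mean(), approx.std())
```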

bepierre commented 3 years ago

Hey, thank you for pointing out possible mistakes I might have made :)

@Aditya-shahh: I'm not sure it would make sense to pad here, as all test audio clips are longer than a second.

@whmrtm: I'm not sure what you mean here. We calculated the mean and variance, saved them in a separate file, and then load them to standardize the data.

Please let me know if I'm missing something.
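The workflow described above (compute the statistics once, save them to a file, load them later to standardize test data) might look roughly like the following sketch. The file name `train_stats.npz`, the use of `np.savez`, and the synthetic arrays are assumptions; the thread does not show the file format or code the repository actually uses.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=-4.0, scale=3.0, size=(500, 128))  # synthetic training features

# Step 1: compute the statistics once and save them to a separate file.
stats_path = os.path.join(tempfile.mkdtemp(), "train_stats.npz")  # hypothetical name
np.savez(stats_path, mean=train.mean(), std=train.std())

# Step 2 (later, at test time): load the saved statistics.
with np.load(stats_path) as stats:
    mean = float(stats["mean"])
    std = float(stats["std"])

# Step 3: standardize a test clip with the training statistics.
test_clip = rng.normal(loc=-4.0, scale=3.0, size=(128, 100))
test_norm = (test_clip - mean) / std
```

Reusing the training statistics at test time keeps both sets on the same scale, which is the usual reason for storing them separately.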

bepierre commented 3 years ago

Closing this for now.