clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition

MIT License

difference between wavfile and soundfile? #107

Closed by seacj 3 years ago

seacj commented 3 years ago

from scipy.io import wavfile
import soundfile as sf

audio_path = 'voxceleb1/id10270/8jEAjG6SegY/00005.wav'
sample_rate, audio0 = wavfile.read(audio_path)  # integer PCM samples
print(audio0) # [ 65  63 123 ... 162 169 119]
audio0, sample_rate = sf.read(audio_path)  # floating-point samples by default
print(audio0) # [0.00198364 0.00192261 0.00375366 ... 0.00494385 0.00515747 0.00363159]

Here the scale is different. Does it affect the performance of the model, since the input features change? (I also noticed that wavfile is faster than soundfile, but the latest code uses soundfile.)

ukemamaster commented 3 years ago

wavfile.read() does not apply any normalization and returns the raw integer samples, whereas soundfile.read() returns normalized floating-point values by default. For 16-bit audio, you need to normalize the data returned by wavfile.read() as: audio0 = audio0/32768. to get the same output as soundfile.read(). (Dividing by 32767. gives nearly the same values, but they differ in the last printed digits.)

import numpy as np
a = [65,  63, 123, 162, 169, 119]
np.array(a)/32768.

output: array([0.00198364, 0.00192261, 0.00375366, 0.00494385, 0.00515747, 0.00363159])

which is the output of soundfile.read()

I am not sure if it affects the performance. @joonson will be able to answer this better.
Yes, wavfile.read() is faster than soundfile.read().
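The conversion discussed above can be sketched as a small helper. This is only an illustration: `pcm16_to_float` is a hypothetical name, not part of scipy, soundfile, or voxceleb_trainer, and the 32768 divisor is inferred from the values printed in this thread (65/32768 rounds to 0.00198364), assuming the file holds 16-bit PCM.

```python
import numpy as np


def pcm16_to_float(samples):
    """Scale 16-bit PCM integers to floats in [-1.0, 1.0).

    Hypothetical helper for illustration; assumes the input is 16-bit PCM.
    """
    samples = np.asarray(samples, dtype=np.int16)
    # 32768 = 2**15 is the magnitude of the most negative int16 value.
    return samples.astype(np.float64) / 32768.0


# The integer samples that wavfile.read() printed in the question.
raw = np.array([65, 63, 123, 162, 169, 119], dtype=np.int16)
scaled = pcm16_to_float(raw)

# The float values that sf.read() printed in the question.
quoted = np.array([0.00198364, 0.00192261, 0.00375366,
                   0.00494385, 0.00515747, 0.00363159])
print(np.allclose(scaled, quoted, rtol=0.0, atol=1e-8))
```

Going the other way, soundfile.read() accepts a dtype argument, so sf.read(audio_path, dtype='int16') should return the unscaled integers instead of floats.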

ukemamaster commented 3 years ago

@seacj I have reproduced similar results using wavfile.read() without any normalization, so it seems that it doesn't affect the performance.
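One plausible explanation for this observation, assuming the front end computes log spectral features and then mean-normalizes them (an assumption, not something confirmed in this thread): a constant gain on the waveform becomes a constant offset in the log domain, and mean subtraction removes it. The sketch below uses hypothetical helpers (`log_power_frames`, `mean_normalize`) and random noise in place of real audio. It omits the small epsilon that real pipelines often add inside the log, which would make the invariance only approximate.

```python
import numpy as np


def log_power_frames(signal, frame_len=400, hop=160):
    """Log power spectrum per frame (hypothetical helper, no epsilon in the log)."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] for s in starts])
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    return np.log(power)


def mean_normalize(features):
    """Subtract the per-frequency mean over time (hypothetical helper)."""
    return features - features.mean(axis=0, keepdims=True)


rng = np.random.default_rng(0)
float_audio = rng.uniform(-1.0, 1.0, size=16000)  # soundfile-like scale
int_scale_audio = float_audio * 32768.0           # wavfile-like scale

# Before normalization the two feature sets differ by a constant offset.
raw_gap = log_power_frames(int_scale_audio) - log_power_frames(float_audio)

# After mean normalization the offset cancels.
norm_gap = (mean_normalize(log_power_frames(int_scale_audio))
            - mean_normalize(log_power_frames(float_audio)))

print(np.allclose(raw_gap, 2 * np.log(32768.0)))
print(np.allclose(norm_gap, 0.0, atol=1e-9))
```

If a model consumes raw waveforms or unnormalized features, the same argument does not apply and the input scale could matter.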

seacj commented 3 years ago

> @seacj I have reproduced similar results using wavfile.read() without any normalization, so it seems that it doesn't affect the performance.

Thank you. That's right.