clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

difference between wavfile and soundfile? #107

Closed seacj closed 3 years ago

seacj commented 3 years ago
import soundfile as sf
from scipy.io import wavfile

audio_path = 'voxceleb1/id10270/8jEAjG6SegY/00005.wav'
sample_rate, audio0 = wavfile.read(audio_path)  # raw integer PCM samples
print(audio0) # [ 65  63 123 ... 162 169 119]
audio0, sample_rate = sf.read(audio_path)  # float64 samples scaled to [-1.0, 1.0)
print(audio0) # [0.00198364 0.00192261 0.00375366 ... 0.00494385 0.00515747 0.00363159]

Here the scale is different. Does this affect the performance of the model, since the input features change? (I also noticed that wavfile is faster than soundfile, but the latest code adopts soundfile.)

ukemamaster commented 3 years ago
import numpy as np
a = [65, 63, 123, 162, 169, 119]
np.array(a) / 32768.  # 16-bit PCM full scale is 2**15

output: array([0.00198364, 0.00192261, 0.00375366, 0.00494385, 0.00515747, 0.00363159])

which matches the output of soundfile.read()
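
A quick check of which full-scale constant reproduces the quoted sf.read() values. This is only a sketch: it assumes the file is 16-bit PCM, and it uses the six sample values printed earlier in this thread rather than reading the file itself.

```python
import numpy as np

# Integer samples as printed by wavfile.read() in this thread.
int_samples = np.array([65, 63, 123, 162, 169, 119], dtype=np.int16)

# Float samples as printed by sf.read() in this thread (8 decimal places).
sf_values = np.array([0.00198364, 0.00192261, 0.00375366,
                      0.00494385, 0.00515747, 0.00363159])

# Candidate scale factors for 16-bit audio (assumption: 16-bit PCM input).
errors = {}
for full_scale in (32767.0, 32768.0):
    scaled = int_samples.astype(np.float64) / full_scale
    # Largest absolute deviation from the quoted soundfile values.
    errors[full_scale] = np.max(np.abs(scaled - sf_values))

best_scale = min(errors, key=errors.get)
print(errors)
print("closest match:", best_scale)
```

On these six samples, dividing by 32768 (2**15) agrees with the quoted values to within print rounding, while 32767 is off in the seventh decimal place. Either way, the two readers differ only by a constant scale factor.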

ukemamaster commented 3 years ago

@seacj I have reproduced similar results using wavfile.read() without any normalization, so it seems that it doesn't affect the performance.
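
One plausible reason the scale would not matter, sketched under assumptions the thread does not confirm: if the front end takes the log of a power spectrum and then subtracts the per-frequency mean over time, a constant gain on the waveform becomes an additive constant that the mean subtraction removes. The signal, frame sizes, and the function name `log_spectrum_mean_normalized` below are all hypothetical and are not taken from the repository.

```python
import numpy as np

def log_spectrum_mean_normalized(signal, frame_len=400, hop=160):
    """Log power spectrum with per-frequency mean removed (illustrative only)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    window = np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    log_power = np.log(power)  # no epsilon here, to keep the invariance exact
    # Subtract the mean over time for each frequency bin.
    return log_power - log_power.mean(axis=0, keepdims=True)

# Hypothetical one-second noise signal at 16 kHz in an int16-like range.
rng = np.random.default_rng(0)
int_audio = rng.integers(-2000, 2000, size=16000).astype(np.float64)
float_audio = int_audio / 32768.0  # same audio on the [-1, 1) scale

feat_int = log_spectrum_mean_normalized(int_audio)
feat_float = log_spectrum_mean_normalized(float_audio)

max_diff = np.max(np.abs(feat_int - feat_float))
print("max feature difference:", max_diff)
```

In this sketch the two feature matrices agree up to floating-point error. A pipeline that adds a small epsilon before the log, or that skips mean normalization, would be only approximately invariant or not invariant at all, so this should be checked against the actual model configuration.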

seacj commented 3 years ago

> @seacj I have reproduced similar results using wavfile.read() without any normalization, so it seems that it doesn't affect the performance.

Thank you. That's right.