clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

About the data augment #63

wuqiangch closed this issue 3 years ago

wuqiangch commented 3 years ago

@joonson Almost all of the files in rirs_noises are multi-channel. Did you first convert the multi-channel files to single-channel? I ask because signal.convolve requires its two input arrays to have the same number of dimensions. If I don't convert them, I get the error "ValueError: volume and kernel should have the same dimensionality" from the line "signal.convolve(audio, rir, mode='full')[:,:self.max_audio]".
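For anyone hitting the same error, here is a minimal sketch of the dimensionality mismatch using synthetic arrays instead of the dataset files. The array shapes, the `max_audio` value, and the `expand_dims` step are assumptions chosen to mimic a (1, samples) speech array and a 16-channel impulse response; this is not the repository's loader code.

```python
import numpy as np
from scipy import signal

max_audio = 16000  # hypothetical clip length in samples

audio = np.random.randn(1, max_audio)      # 2-D: (batch, samples)
rir_multi = np.random.randn(12000, 16)     # multi-channel RIR: (samples, channels)

# Adding a batch axis to a multi-channel RIR yields a 3-D kernel,
# which no longer matches the 2-D audio array.
kernel_bad = np.expand_dims(rir_multi, 0)  # (1, 12000, 16)
try:
    signal.convolve(audio, kernel_bad, mode='full')
    raised = False
except ValueError as err:
    raised = True
    print("convolve failed:", err)

# With a single-channel RIR the kernel is 2-D, like the audio.
rir_mono = rir_multi[:, 0]                 # (12000,)
kernel_ok = np.expand_dims(rir_mono, 0)    # (1, 12000)
out = signal.convolve(audio, kernel_ok, mode='full')[:, :max_audio]
print(out.shape)
```

Taking channel 0 is only one way to get a 1-D impulse response; whether that is appropriate for a given file is a separate question.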

joonson commented 3 years ago

Can you give me the name of the file that is multi-channel? They look like single channel to me.

wuqiangch commented 3 years ago

@joonson "RIRS_NOISES/real_rirs_isotropic_noises/RWCP_type1_rir_circle_e1c_imp040.wav"

audio.shape
(12000, 16)
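A quick way to spot such files after loading is to check the array's dimensions. The sketch below assumes the (samples, channels) layout shown in the shape above; the helper names `is_multichannel` and `first_channel` are hypothetical and not part of the repository.

```python
import numpy as np

def is_multichannel(audio):
    """Return True if the array has more than one channel on its second axis."""
    return audio.ndim > 1 and audio.shape[1] > 1

def first_channel(audio):
    """Return a 1-D signal: the input itself if already mono, else channel 0."""
    return audio if audio.ndim == 1 else audio[:, 0]

rir = np.zeros((12000, 16))          # same shape as the file reported above
print(is_multichannel(rir))          # True
print(first_channel(rir).shape)      # (12000,)
```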

joonson commented 3 years ago

Did you extract the files using ./dataprep.py? This file should not be included.

wuqiangch commented 3 years ago

Oh, I'm sorry. I only downloaded it. Thanks! It works.