Open DamRsn opened 2 years ago
The issue also occurs in part of the english/librivox speech. For example, english/librivox/enrol_read_speech_complete_reader_00381.wav contains multi-speaker conversations. The speakers switch frequently, which may affect the training of personalized noise suppression models.
Hello,
While informally listening to audio data from the PDNS track, I noticed that the Italian speech files contain multiple speakers. However, the ICASSP 2022 DEEP NOISE SUPPRESSION CHALLENGE paper states: "PDNS track has clean speech where each audio clip is concatenations of all audio clips belonging to a talker."
This issue makes the enrollment files and the "clean" files incompatible. I had to remove all Italian files from my training dataset.
I suggest either removing the Italian data from the PDNS dataset or providing the correct clean files, if that is possible.
I noticed the issue only with the Italian data, and I don't know whether it also occurs in other languages.
One example is
pdns_training_set/raw/clean/italian/complete_italian_novelle_per_un_anno_02.wav
where the speakers at the beginning and at 1:39:00 are clearly different. The beginning of that audio file is also exactly the same speech sample as the corresponding enrollment file.
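
For anyone who wants to reproduce the check above, here is a minimal sketch using only Python's standard `wave` module. It assumes the files are uncompressed PCM WAV. The helper names `extract_segment` and `shares_prefix` are hypothetical and are not part of the DNS Challenge tooling. The first helper cuts out a short excerpt for side-by-side listening, and the second tests whether two files start with byte-identical audio, as the clean file and its enrollment file appear to.

```python
import wave


def extract_segment(src_path, dst_path, start_s, duration_s):
    """Copy duration_s seconds starting at start_s from a PCM WAV into dst_path.

    Returns the number of frames actually written (fewer if the file ends early).
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Clamp the start position so that setpos() never goes past the end.
        src.setpos(min(int(start_s * rate), src.getnframes()))
        frames = src.readframes(int(duration_s * rate))
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames)  # the frame count in the header is patched on close
    return len(frames) // (params.sampwidth * params.nchannels)


def shares_prefix(path_a, path_b, seconds=5.0):
    """Return True if both files have the same format and identical leading audio."""
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        # Compare channel count, sample width, and sample rate first.
        if a.getparams()[:3] != b.getparams()[:3]:
            return False
        n = min(int(seconds * a.getframerate()), a.getnframes(), b.getnframes())
        return n > 0 and a.readframes(n) == b.readframes(n)
```

For the file above, calling `extract_segment` once with `start_s=0` and once with `start_s=5940` (1:39:00) gives two short clips that can be compared by ear. `shares_prefix` only detects exact sample-level copies, so it will miss audio that was resampled or rescaled.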