JiJiJiang / ASV-Anti-Spoofing-DADA

Dual-Adversarial Domain Adaptation for replay spoofing detection in automatic speaker verification.

Data Augmentation #5

Closed leonardltk closed 2 years ago

leonardltk commented 2 years ago

Hi,

I don't see any data augmentation being done here. What kind of data augmentation would be suitable for bonafide/spoof data?

Regards, Leonard

JiJiJiang commented 2 years ago

Hey, you can generate more spoof data by adding reverberation to bonafide data. The well-known room impulse response (RIR) dataset RIRS_NOISES may be a good choice.
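For illustration, here is a minimal sketch of adding reverberation by convolving an utterance with a room impulse response. The function name `add_reverb` and the synthetic, exponentially decaying RIR are assumptions made for this example; in practice the RIR would be loaded from a file in a dataset such as RIRS_NOISES, and this is not the code used in this repository.

```python
import numpy as np

def add_reverb(speech, rir):
    """Convolve speech with an RIR, keeping the original length and peak level."""
    rir = rir / (np.max(np.abs(rir)) + 1e-12)            # normalize the RIR peak to 1
    wet = np.convolve(speech, rir, mode="full")[: len(speech)]
    peak = np.max(np.abs(speech))
    return wet * (peak / (np.max(np.abs(wet)) + 1e-12))  # rescale to the input peak

# Stand-in data (assumption): 1 s of noise instead of a real utterance,
# and a synthetic RIR instead of a measured one.
rng = np.random.default_rng(0)
sr = 16000
speech = rng.standard_normal(sr)
t = np.arange(int(0.3 * sr)) / sr
rir = rng.standard_normal(t.size) * np.exp(-t / 0.05)    # decaying reverberant tail
rir[0] = 1.0                                             # direct path

reverbed = add_reverb(speech, rir)
```

Truncating the output to the input length drops the reverberant tail after the utterance ends, which keeps feature shapes unchanged; keeping the full convolution is equally valid.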

leonardltk commented 2 years ago

Hi, I am familiar with adding reverberation and with the dataset you shared. However, I don't think adding reverberation to bonafide data turns it into spoof data. In a real-world scenario where the input audio is not studio quality, it should still be bonafide, right? I have a proposal for the data augmentation procedure. What are your thoughts?

  1. Bonafide
    • Use original bonafide data
    • Add noise
    • Add reverb
    • Speed perturbation (90%, 100%, 110%)
  2. Spoof
    • Use respective spoof data
    • Add noise
    • Add reverb
    • Speed perturbation (90%, 100%, 110%)
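
The noise and speed-perturbation steps in the list above could be sketched roughly as follows. The helper names `add_noise` and `speed_perturb`, the SNR value, and the use of linear interpolation for resampling are assumptions for illustration only; a real pipeline would more likely use a proper resampler (for example the `speed` effect in sox). Each augmented copy keeps the label of its source utterance, as in the proposal.

```python
import numpy as np

def add_noise(speech, noise, snr_db):
    """Mix noise into speech at the requested signal-to-noise ratio in dB."""
    noise = np.resize(noise, speech.shape)          # repeat or truncate to match length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

def speed_perturb(speech, factor):
    """Change speed (and pitch) by resampling; factor 1.1 gives a shorter signal."""
    n_out = int(round(len(speech) / factor))
    src_pos = np.arange(n_out) * factor             # positions to read in the input
    return np.interp(src_pos, np.arange(len(speech)), speech)

# Stand-in signals (assumption): random noise instead of real audio.
rng = np.random.default_rng(0)
utt = rng.standard_normal(16000)
noise = rng.standard_normal(8000)

augmented = [speed_perturb(utt, f) for f in (0.9, 1.0, 1.1)]
augmented.append(add_noise(utt, noise, snr_db=15))
```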

JiJiJiang commented 2 years ago

Hmm... As you say, the speaker may talk from far away or close to the microphone of the SV system, and of course these recordings are all bonafide. By definition, spoofed data are generated by replaying and re-recording bonafide data (see the evaluation plans of ASVspoof 2017 and 2019). From my point of view, this process adds extra reverberation and device/channel effects to the audio, which could be the biggest difference between bonafide and spoofed data. So near-field noise and speed perturbation should be fine, but be careful about adding reverb to bonafide data. You could run some experiments and test on real data; the results may tell.
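
One way to run such a comparison is to train once with and once without reverb augmentation on bonafide data, score the same real evaluation set with both models, and compare the equal error rate (EER), a metric the ASVspoof evaluations report. The sketch below is a simple threshold-sweep approximation of the EER; the function name `compute_eer` and the example scores are hypothetical, and it assumes higher scores mean "more likely bonafide".

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Approximate EER by sweeping a threshold over all observed scores."""
    bonafide_scores = np.asarray(bonafide_scores, dtype=float)
    spoof_scores = np.asarray(spoof_scores, dtype=float)
    thresholds = np.sort(np.concatenate([bonafide_scores, spoof_scores]))
    # False acceptance: spoof trials scored at or above the threshold.
    far = np.array([(spoof_scores >= t).mean() for t in thresholds])
    # False rejection: bonafide trials scored below the threshold.
    frr = np.array([(bonafide_scores < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))              # point where the two rates meet
    return (far[idx] + frr[idx]) / 2

# Hypothetical scores from one system on one evaluation set.
eer = compute_eer([2.0, 3.0, 4.0], [-1.0, 0.0, 1.0])
```

Running this for each augmentation setting on the same trials gives directly comparable numbers.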