josebeo2016 / SCL-Deepfake-audio-detection

Synthetic speech detection based on vocoder signature extraction with Supervised Contrastive Learning
Apache License 2.0

SCL-Deepfake-audio-detection

This is the official implementation of our work "Balance, Multiple Augmentation, and Re-synthesis: A Triad Training Strategy for Enhanced Audio Deepfake Detection".

Preparing

Please download the datasets yourself.

Dataset

We do not redistribute the training data.

The DATA folder should look like this:

DATA
├── asvspoof_2019_supcon
│   ├── bonafide
│   │   └── leave_bonafide_wav_here
│   ├── eval
│   │   └── leave_eval_wav_here
│   ├── protocol.txt
│   ├── scp
│   │   ├── dev_bonafide.lst
│   │   ├── test.lst
│   │   └── train_bonafide.lst
│   └── vocoded
│       └── leave_vocoded_wav_here
├── asvspoof_2021_DF
│   ├── flac -> /datab/Dataset/ASVspoof/LA/ASVspoof2021_DF_eval/flac
│   ├── protocol.txt
│   └── trial_metadata.txt
└── in_the_wild
    ├── in_the_wild.txt
    ├── protocol.txt
    └── wav -> /datab/Dataset/release_in_the_wild
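Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the path list is copied from the tree above. The two symlinks (`flac` and `wav`) point to machine-specific locations in the tree, so point them at wherever the datasets live on your system.

```python
from pathlib import Path

# Paths taken from the tree above, relative to the DATA root.
EXPECTED = [
    "asvspoof_2019_supcon/bonafide",
    "asvspoof_2019_supcon/eval",
    "asvspoof_2019_supcon/protocol.txt",
    "asvspoof_2019_supcon/scp/dev_bonafide.lst",
    "asvspoof_2019_supcon/scp/test.lst",
    "asvspoof_2019_supcon/scp/train_bonafide.lst",
    "asvspoof_2019_supcon/vocoded",
    "asvspoof_2021_DF/flac",
    "asvspoof_2021_DF/protocol.txt",
    "asvspoof_2021_DF/trial_metadata.txt",
    "in_the_wild/in_the_wild.txt",
    "in_the_wild/protocol.txt",
    "in_the_wild/wav",
]


def missing_entries(root):
    """Return the expected paths that do not exist under root.

    Path.exists() follows symlinks, so a dangling `flac` or `wav`
    link is reported as missing.
    """
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries("DATA")` from the repository root returns an empty list when every entry is present.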

Configurations

Check and, if needed, modify the configuration files before training or evaluating. Please read them carefully.

By default, these configurations are set for training.

Training

CUDA_VISIBLE_DEVICES=0 bash 02_train.sh <seed> <config> <data_path> <comment>

For example:

CUDA_VISIBLE_DEVICES=0 bash 02_train.sh 1234 configs/conf-3-linear.yaml DATA/asvspoof_2019_supcon "conf-3-linear-1234"
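To repeat training over several seeds, the command above can be wrapped in a small driver. This is only a sketch under assumptions: the function names `train_command` and `run_seeds` are hypothetical, and the comment string simply follows the "<config name>-<seed>" pattern of the example. With `dry_run=True` (the default here) nothing is launched and the commands are only returned for inspection.

```python
import os
import subprocess


def train_command(seed, config, data_path, comment):
    """Argument list mirroring: bash 02_train.sh <seed> <config> <data_path> <comment>."""
    return ["bash", "02_train.sh", str(seed), config, data_path, comment]


def run_seeds(seeds, config, data_path, gpu="0", dry_run=True):
    """Build one training command per seed; run them unless dry_run is True."""
    # Same effect as prefixing the shell command with CUDA_VISIBLE_DEVICES=<gpu>.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    # Config file name without directory and extension, e.g. "conf-3-linear".
    name = os.path.splitext(os.path.basename(config))[0]
    commands = []
    for seed in seeds:
        cmd = train_command(seed, config, data_path, f"{name}-{seed}")
        commands.append(cmd)
        if not dry_run:
            # check=True stops the loop if a training run fails.
            subprocess.run(cmd, env=env, check=True)
    return commands
```

For instance, `run_seeds([1234], "configs/conf-3-linear.yaml", "DATA/asvspoof_2019_supcon")` reproduces the arguments of the example command.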

Evaluating

CUDA_VISIBLE_DEVICES=0 bash 03_eval.sh <config> <data_path> <batch_size> <model_path> <eval_output>

For example:

CUDA_VISIBLE_DEVICES=0 bash 03_eval.sh configs/conf-3-linear.yaml DATA/asvspoof_2019_supcon 128 out/model_80_1_1e-07_conf-3-linear/epoch_80.pth docs/la19.txt

Calculate EER

Please refer to Result.ipynb for calculating EER and other performance metrics.
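For readers who want a quick reference for what EER measures, here is a minimal pure-Python sketch. It is not the code from Result.ipynb. It assumes that higher scores mean "more bona fide" and that the scores have already been split into two lists; the format of the evaluation output file is not described here, so parsing it is left out. The function name `compute_eer` is hypothetical.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more bona fide'.

    Sweeps every observed score as a threshold and picks the one where the
    false acceptance and false rejection rates are closest. This is quadratic
    in the number of trials, so it suits small score lists only.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")
    best = None
    for t in sorted(set(bonafide_scores) | set(spoof_scores)):
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

On a full evaluation set, a sorted or vectorized variant would be preferable; the notebook remains the reference for the reported numbers.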

Custom training and evaluation datasets

To use other evaluation datasets, please refer to datautils/eval_only.py and datautils/asvspoof_2019.py. For augmentation strategies, please refer to datautils/asvspoof_2019_augall_3.py.

Reference