
# The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio

*ALM-based deepfake audio*

This is the official repo of our work titled *The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio*, which is available on arXiv at https://arxiv.org/abs/2405.04880.

## 📚 Codecfake Dataset

Due to platform restrictions on the size of Zenodo repositories, we have divided the Codecfake dataset into the subsets shown in the table below:

| Codecfake dataset | Description | Link |
|-------------------|-------------|------|
| training set (part 1 of 3) & label | train_split.zip & train_split.z01 - train_split.z05 | [https://zenodo.org/records/13838106](https://zenodo.org/records/13838106) |
| training set (part 2 of 3) | train_split.z06 - train_split.z10 | [https://zenodo.org/records/13841652](https://zenodo.org/records/13841652) |
| training set (part 3 of 3) | train_split.z11 - train_split.z16 | [https://zenodo.org/records/13853860](https://zenodo.org/records/13853860) |
| development set | dev_split.zip & dev_split.z01 - dev_split.z02 | [https://zenodo.org/records/13841216](https://zenodo.org/records/13841216) |
| test set (part 1 of 2) | Codec test: C1.zip - C6.zip & ALM test: A1.zip - A3.zip | [https://zenodo.org/records/13838823](https://zenodo.org/records/13838823) |
| test set (part 2 of 2) | Codec unseen test: C7.zip | [https://zenodo.org/records/11125029](https://zenodo.org/records/11125029) |

The Codecfake dataset is licensed under the CC BY-NC-ND 4.0 license.

## 🛡️ Countermeasure

### 1. Data preparing

Upon downloading the Codecfake dataset, please arrange the files in accordance with the directory structure outlined below:

```
├── Codecfake
│   ├── label
│   │   └── *.txt
│   ├── train
│   │   └── *.wav (740,747 samples)
│   ├── dev
│   │   └── *.wav (92,596 samples)
│   ├── test
│   │   └── C1
│   │   │   └── *.wav (26,456 samples)
│   │   └── C2
│   │   │   └── *.wav (26,456 samples)
│   │   └── C3
│   │   │   └── *.wav (26,456 samples)
│   │   └── C4
│   │   │   └── *.wav (26,456 samples)
│   │   └── C5
│   │   │   └── *.wav (26,456 samples)
│   │   └── C6
│   │   │   └── *.wav (26,456 samples)
│   │   └── C7
│   │   │   └── *.wav (145,505 samples)
│   │   └── A1
│   │   │   └── *.wav (8,902 samples)
│   │   └── A2
│   │   │   └── *.wav (8,902 samples)
│   │   └── A3
│   │   │   └── *.wav (99,112 samples)
```

If you want to co-train with ASVspoof2019, please download the training, development, and evaluation sets from the [ASVspoof2019 LA Database](https://datashare.ed.ac.uk/handle/10283/3336) first.

### 2. Offline Feature Extraction

```
python preprocess.py
```

Please ensure the data and label paths are correct; if you need to adjust them, modify `./preprocess.py`. After preprocessing, the hidden states of wav2vec2 will be saved in `/data2/xyk/codecfake/preprocess_xls-r-5`.
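For orientation, the extraction performed by `preprocess.py` looks roughly like the sketch below. This is not the actual script: the Hugging Face checkpoint name `facebook/wav2vec2-xls-r-300m`, the choice of the 5th transformer hidden state (suggested by the `xls-r-5` directory name), and the file paths are all assumptions.

```python
# Minimal sketch of offline XLS-R feature extraction; the real preprocess.py
# may differ. Assumes the HF XLS-R 300M checkpoint and the 5th hidden layer.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-xls-r-300m")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-xls-r-300m").eval()

wav, sr = torchaudio.load("Codecfake/train/example.wav")  # hypothetical file
if sr != 16000:  # XLS-R expects 16 kHz mono input
    wav = torchaudio.functional.resample(wav, sr, 16000)

inputs = extractor(wav.squeeze().numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states[5]

# Save the (frames, 1024) hidden-state matrix for the downstream classifier.
torch.save(hidden.squeeze(0), "preprocess_xls-r-5/example.pt")
```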
### 3. Train

Training on the different tasks:

```
python main_train.py -t 19LA
python main_train.py -t codecfake
python main_train.py -t co-train
```

Before running `main_train.py`, please change `path_to_features` according to the location of the pre-processed features on your machine. If training is slow, consider adjusting the `num_worker` parameter in conjunction with the number of CPU cores; the default is 8. If performance remains slow, you may explore multi-GPU training in args.

### 4. Test

Testing on the different datasets uses the best pre-trained model, which is saved in `./models/try/anti-spoofing_feat_model.pt`:

```
python generate_score.py -t 19LA
python generate_score.py -t ITW
python generate_score.py -t codecfake
```

The results will be saved in `./result`.

```
python evaluate_score.py
```

You will get the final test EER.
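For reference, the EER can be computed from a score file in the standard way. The sketch below is not the actual `evaluate_score.py`; the score-file path and its `utterance score label` column layout are assumptions.

```python
# Minimal sketch of EER computation from a score file; the real
# evaluate_score.py and its file format may differ. Assumes one
# "utterance score label" triple per line, label 0 = fake, 1 = real,
# and higher scores meaning more likely real.
import numpy as np

def compute_eer(scores: np.ndarray, labels: np.ndarray) -> float:
    """Equal error rate: the point where false-accept and false-reject rates meet."""
    order = np.argsort(scores)
    scores, labels = scores[order], labels[order]
    n_real, n_fake = (labels == 1).sum(), (labels == 0).sum()
    # Sweep a threshold over the sorted scores: real files at or below the
    # threshold are falsely rejected; fake files above it are falsely accepted.
    frr = np.cumsum(labels == 1) / n_real
    far = 1 - np.cumsum(labels == 0) / n_fake
    idx = np.argmin(np.abs(frr - far))
    return (frr[idx] + far[idx]) / 2

rows = [line.split() for line in open("./result/scores.txt")]  # hypothetical path
scores = np.array([float(r[1]) for r in rows])
labels = np.array([int(r[2]) for r in rows])
print(f"EER: {100 * compute_eer(scores, labels):.2f}%")
```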

### 5. Pre-trained model

We provide the pre-trained models and score results mentioned in our paper, so you can test our pre-trained models under other conditions. For the inference steps, refer to Section 4.
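A minimal sketch of loading one of the checkpoints listed below for inference is given here; the class name `W2VAASIST`, its import path, and the checkpoint layout are assumptions for illustration, so substitute the model class actually defined in this repo's training code.

```python
# Minimal loading sketch; whether the checkpoint stores a full pickled module
# or a state_dict, and the model class name, are assumptions -- check this
# repo's model definition and saving code for the real names.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ckpt = torch.load("./pretrained_model/codec_w2v2aasist/anti-spoofing_feat_model.pt",
                  map_location=device)

if isinstance(ckpt, torch.nn.Module):   # checkpoint is a pickled full model
    model = ckpt
else:                                   # checkpoint is a state_dict
    from model import W2VAASIST         # hypothetical import path and class
    model = W2VAASIST()
    model.load_state_dict(ckpt)

model.to(device).eval()
with torch.no_grad():
    feat = torch.randn(1, 208, 1024).to(device)  # dummy XLS-R feature batch
    out = model(feat)  # output format depends on the model definition
```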

#### Vocoder-trained ADD models:

```
./pretrained_model/vocoder_mellcnn/anti-spoofing_feat_model.pt
./pretrained_model/vocoder_w2v2lcnn/anti-spoofing_feat_model.pt
./pretrained_model/vocoder_w2v2aasist/anti-spoofing_feat_model.pt
```

#### Codec-trained ADD models:

```
./pretrained_model/codec_mellcnn/anti-spoofing_feat_model.pt
./pretrained_model/codec_w2v2lcnn/anti-spoofing_feat_model.pt
./pretrained_model/codec_w2v2aasist/anti-spoofing_feat_model.pt
```

#### Co-trained ADD model:

```
./pretrained_model/cotrain_w2v2aasist/anti-spoofing_feat_model.pt
```

## 📝 Citation

If you find our dataset or countermeasures useful for your research, please cite them as follows:

```
@article{xie2024codecfake,
  title={The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio},
  author={Xie, Yuankun and Lu, Yi and Fu, Ruibo and Wen, Zhengqi and Wang, Zhiyong and Tao, Jianhua and Qi, Xin and Wang, Xiaopeng and Liu, Yukun and Cheng, Haonan and others},
  journal={arXiv preprint arXiv:2405.04880},
  year={2024}
}
```