LetterLiGo / SafeEar

SafeEar: Content Privacy-Preserving Audio Deepfake Detection (Accepted by CCS 2024)
https://safeearweb.github.io/Project/

SafeEar: Content Privacy-Preserving Audio Deepfake Detection

arXiv | PRs Welcome | CC BY 4.0 | Website

By [1] Zhejiang University, [2] Tsinghua University.

This repository is the official implementation of SafeEar, accepted to ACM CCS 2024 (Core-A*, CCF-A, Big4).

Please also visit our website.

✨Key Highlights:

In this paper, we propose SafeEar, a novel framework that detects deepfake audio without accessing the speech content within it. Our key idea is to turn a neural audio codec into a decoupling model that separates the semantic and acoustic information in audio samples, and to use only the acoustic information (e.g., prosody and timbre) for deepfake detection. In this way, no semantic content is exposed to the detector. To overcome the challenge of identifying diverse deepfake audio without semantic clues, we enhance our deepfake detector with multi-head self-attention and codec augmentation. Extensive experiments on four benchmark datasets demonstrate SafeEar's effectiveness in detecting various deepfake techniques, with an equal error rate (EER) as low as 2.02%. At the same time, it shields speech content in five languages from being deciphered by both machine and human auditory analysis, as demonstrated by word error rates (WERs) all above 93.93% and by our user study. Furthermore, the benchmark we constructed for anti-deepfake and anti-content-recovery evaluation provides a basis for future research on audio privacy preservation and deepfake detection.
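The equal error rate quoted above is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch of how an EER can be estimated from detector scores. It is not the evaluation code of this repository: the function name `compute_eer` and the convention that a higher score means "more likely bona fide" are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def compute_eer(bonafide_scores: Sequence[float],
                spoof_scores: Sequence[float]) -> Tuple[float, float]:
    """Estimate the equal error rate and the threshold where it occurs.

    Assumption: a higher score means "more likely bona fide".
    """
    if len(bonafide_scores) == 0 or len(spoof_scores) == 0:
        raise ValueError("both score lists must be non-empty")

    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, best_eer, best_thr = float("inf"), 1.0, thresholds[0]

    for thr in thresholds:
        # False rejection: bona fide audio scored below the threshold.
        frr = sum(s < thr for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed audio scored at or above the threshold.
        far = sum(s >= thr for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr

    return best_eer, best_thr
```

With a finite set of scores the two error rates rarely cross exactly, so this sketch reports the midpoint at the threshold where they are closest. Interpolating along the ROC curve is a common alternative.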

🚀Overall Pipeline

[Figure: overall pipeline of SafeEar]

🔧Installation

  1. Clone the repository:
git clone git@github.com:LetterLiGo/SafeEar.git
cd SafeEar/
  2. Create and activate the conda environment:
conda create -n safeear python=3.9
conda activate safeear
  3. Install PyTorch, torchvision, and torchaudio following the official instructions. The code requires python=3.9, pytorch=1.13, torchvision=0.14.
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
  4. Install the other dependencies:
# pip < 24.0 is required
pip install -r requirements.txt
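After installation, it can help to confirm that the active environment matches the versions listed above. The script below is a small sketch of such a check and is not part of the repository. The helper names `version_matches` and `check_environment` are hypothetical, and the torchaudio requirement (0.13) is inferred from the pip command rather than stated separately.

```python
import sys
from importlib import metadata

# Major.minor versions taken from the installation steps above.
REQUIRED = {"torch": "1.13", "torchvision": "0.14", "torchaudio": "0.13"}


def version_matches(installed: str, required_prefix: str) -> bool:
    """Compare major.minor only, ignoring local tags such as '+cu116'."""
    public_version = installed.split("+")[0]
    return public_version.split(".")[:2] == required_prefix.split(".")[:2]


def check_environment() -> list:
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if sys.version_info[:2] != (3, 9):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.9 expected, found {found}")
    for package, prefix in REQUIRED.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed")
            continue
        if not version_matches(installed, prefix):
            problems.append(f"{package} {prefix}.x expected, found {installed}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it inside the activated `safeear` environment. No output means that the Python and PyTorch package versions match the ones listed above.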

📊Model Performance

ASVspoof 2019 & 2021

Speech Recognition Performance
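Content protection is reported as word error rate (WER): the higher the WER of a speech recognizer applied to the protected audio, the less content it recovers. As a reminder of how the metric is defined, here is a minimal sketch that computes WER as the word-level edit distance divided by the number of reference words. The function `word_error_rate` is illustrative only and is not the scoring tool used for the reported results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100%. A WER close to or above 100% means that almost none of the reference words were recovered.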

📚Training

Before starting training, please modify the parameter configurations in the configuration files (e.g., config/train19.yaml).

Use the following commands to start training:

python train.py --conf_dir config/train19.yaml
python train.py --conf_dir config/train21.yaml

📈Testing/Inference

To evaluate a model on one or more GPUs, set CUDA_VISIBLE_DEVICES and specify the dataset, model, and checkpoint:

python test.py --conf_dir Exps/ASVspoof19/conf.yml
python test.py --conf_dir Exps/ASVspoof21/conf.yml
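The commands above do not show how CUDA_VISIBLE_DEVICES is set. One option is to export it in the shell before calling `test.py`. Another is a small launcher such as the sketch below, which sets the variable only for the evaluation process. The helper names `build_test_command`, `build_env`, and `run_evaluation` are hypothetical and not part of the repository. Only the `--conf_dir` flag shown above is used.

```python
import os
import subprocess
import sys
from typing import Dict, Iterable, List, Optional


def build_test_command(conf_path: str) -> List[str]:
    """Build the evaluation command using the current Python interpreter."""
    return [sys.executable, "test.py", "--conf_dir", conf_path]


def build_env(gpu_ids: Iterable[int],
              base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and restrict the GPUs visible to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    return env


def run_evaluation(conf_path: str,
                   gpu_ids: Iterable[int]) -> subprocess.CompletedProcess:
    """Run test.py from the repository root; raises if it exits non-zero."""
    return subprocess.run(
        build_test_command(conf_path),
        env=build_env(gpu_ids),
        check=True,
    )
```

For example, `run_evaluation("Exps/ASVspoof19/conf.yml", [0, 1])` corresponds to the shell line `CUDA_VISIBLE_DEVICES=0,1 python test.py --conf_dir Exps/ASVspoof19/conf.yml`. Whether more than one GPU is actually used depends on the settings in the configuration file.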

📜Citation

If you find our work helpful, please consider citing:

@inproceedings{li2024safeear,
  author       = {Li, Xinfeng and Li, Kai and Zheng, Yifan and Yan, Chen and Ji, Xiaoyu and Xu, Wenyuan},
  title        = {{SafeEar: Content Privacy-Preserving Audio Deepfake Detection}},
  booktitle    = {Proceedings of the 2024 {ACM} {SIGSAC} Conference on Computer and Communications Security (CCS)},
  year         = {2024},
}