By [1] Zhejiang University, [2] Tsinghua University.
This repository is the official implementation of SafeEar, accepted to ACM CCS 2024 (Core-A*, CCF-A, Big4).
Please also visit our website.
In this paper, we propose SafeEar, a novel framework that detects deepfake audio without accessing the speech content within. Our key idea is to build a neural audio codec into a novel decoupling model that separates the semantic and acoustic information in audio samples, and to use only the acoustic information (e.g., prosody and timbre) for deepfake detection. In this way, no semantic content is exposed to the detector. To overcome the challenge of identifying diverse deepfake audio without semantic clues, we enhance our deepfake detector with multi-head self-attention and codec augmentation. Extensive experiments on four benchmark datasets demonstrate SafeEar’s effectiveness in detecting various deepfake techniques, with an equal error rate (EER) down to 2.02%. At the same time, it shields five-language speech content from being deciphered by both machine and human auditory analysis, as demonstrated by word error rates (WERs) all above 93.93% and by our user study. Furthermore, the benchmark we constructed for anti-deepfake and anti-content-recovery evaluation provides a basis for future research on audio privacy preservation and deepfake detection.
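For readers unfamiliar with the EER metric quoted above, the following is a minimal sketch of how an equal error rate can be approximated from detector scores. It is not the evaluation code of this repository: the function name `equal_error_rate` and the convention that a higher score means "more likely bona fide" are assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping thresholds over all observed scores.

    Assumed convention: a higher score means "more likely bona fide".
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a bona fide sample scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: a spoofed sample scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores: one bona fide sample (0.4) and one spoof (0.6) overlap.
print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6]))  # 0.25
```

In the toy example, a threshold of 0.6 rejects one of four bona fide samples and accepts one of four spoofed samples, so both error rates are 25%.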
git clone git@github.com:LetterLiGo/SafeEar.git
cd SafeEar/
conda create -n safeear python=3.9
conda activate safeear
Requirements: python=3.9, pytorch=1.13, torchvision=0.14.
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
# pip < 24.0 is needed
pip install -r requirements.txt
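After installation, it can help to confirm that the environment matches the versions pinned in the pip command above. The sketch below is not part of the repository; the helper name `check_versions` and the prefix-matching rule are assumptions. It uses only the standard library, so it runs even when a package is missing.

```python
from importlib import metadata

# Versions copied from the pip install command above; matching is by prefix,
# so a local build tag such as "+cu116" is accepted.
EXPECTED = {"torch": "1.13.1", "torchvision": "0.14.1", "torchaudio": "0.13.1"}


def check_versions(expected=EXPECTED, lookup=metadata.version):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        ok = installed is not None and installed.startswith(wanted)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={EXPECTED[name]} [{status}]")
```

Run it inside the activated safeear environment; any line marked MISMATCH points to a package that is missing or has a different version.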
Before starting training, please modify the parameter configurations in configs.
Use the following commands to start training:
python train.py --conf_dir config/train19.yaml
python train.py --conf_dir config/train21.yaml
To evaluate a model on one or more GPUs, specify the CUDA_VISIBLE_DEVICES, dataset, model, and checkpoint:
python test.py --conf_dir Exps/ASVspoof19/conf.yml
python test.py --conf_dir Exps/ASVspoof21/conf.yml
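As one way to select GPUs for the evaluation commands above, the sketch below sets CUDA_VISIBLE_DEVICES from Python and launches test.py. The helper names `build_eval_command` and `run_eval` are hypothetical and not part of the repository. The sketch only covers GPU selection and assumes that the dataset, model, and checkpoint are specified elsewhere, which this README does not detail.

```python
import os
import subprocess
import sys


def build_eval_command(conf_path, gpu_ids):
    """Return (argv, env) for running test.py on the given GPU indices."""
    env = dict(os.environ)
    # CUDA_VISIBLE_DEVICES is a comma-separated list of device indices.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    argv = [sys.executable, "test.py", "--conf_dir", conf_path]
    return argv, env


def run_eval(conf_path, gpu_ids):
    """Run the evaluation from the repository root and wait for it to finish."""
    argv, env = build_eval_command(conf_path, gpu_ids)
    # check=True raises CalledProcessError if test.py exits with an error.
    return subprocess.run(argv, env=env, check=True)
```

For example, `run_eval("Exps/ASVspoof19/conf.yml", [0, 1])` would make only GPUs 0 and 1 visible to the evaluation process.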
If you find our work helpful, please consider citing:
@inproceedings{li2024safeear,
author = {Li, Xinfeng and Li, Kai and Zheng, Yifan and Yan, Chen and Ji, Xiaoyu and Xu, Wenyuan},
title = {{SafeEar: Content Privacy-Preserving Audio Deepfake Detection}},
booktitle = {Proceedings of the 2024 {ACM} {SIGSAC} Conference on Computer and Communications Security (CCS)},
year = {2024},
}