josebeo2016 / SCL-Deepfake-audio-detection

Synthesis speech detection based on vocoder signature extraction with Supervised Contrastive Learning
Apache License 2.0

Questions about the Output Results and EER Calculation Methods #1

Closed hgyuhaoyang closed 2 weeks ago

hgyuhaoyang commented 2 weeks ago

"Thank you for your project, which has been very helpful to newcomers with ADD. I have two questions: Question 1: Using the default settings (CUDA_VISIBLE_DEVICES=0 bash 03_eval.sh configs/conf-3-linear.yaml DATA/asvspoof_2019_supcon 128 out/model_weighted_CCE_100_128_1e-08/conf-3-linear.pth docs/la191.txt), I obtained the following results: LA_E_1000147.flac -2.9200141429901123 -0.05544184893369675 LA_E_1000273.flac -0.17481780052185059 -1.8301467895507812 LA_E_1000791.flac -0.021144447848200798 -3.866931438446045 LA_E_1000841.flac -0.0006036128615960479 -7.41294002532959 However, the results you provided in asvspoof2019_conf-3.txt are: LA_E_1000147.flac -0.08371932059526443 1 LA_E_1000273.flac -1.7473167181015015 0 LA_E_1000791.flac -3.5226402282714844 0 LA_E_1000841.flac -7.278572082519531 0 I would like to ask how to adjust the settings to achieve the results like yours.

Question 2: Regarding the second and third columns, e.g. `LA_E_1000147.flac -2.9200141429901123 -0.05544184893369675` in my output versus `LA_E_1000147.flac -0.08371932059526443 1` in yours, could you please explain what these columns represent? I look forward to your response.

josebeo2016 commented 2 weeks ago

Sorry about the mismatch. I kept developing after writing the paper, so the output format has changed compared to the files I uploaded to this repo. In the old format (the `docs/*.txt` files), each line is `<utterance> <score> <0/1 decision>`. In the new format (produced by the current code), each line is `<utterance> <spoof score> <bonafide score>`.
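To make the difference concrete, here is a minimal sketch that parses a couple of lines of each layout with pandas. The sample strings are copied from the question above; the new-format column names follow the `Result.ipynb` snippet in this thread, while the old-format column meanings are my assumption:

```python
import io
import pandas as pd

# Old format (docs/*.txt), assumed: <utterance> <score> <0/1 decision>
old_txt = (
    "LA_E_1000147.flac -0.08371932059526443 1\n"
    "LA_E_1000273.flac -1.7473167181015015 0\n"
)
old_df = pd.read_csv(io.StringIO(old_txt), sep=" ", header=None,
                     names=["utt", "score", "decision"])

# New format (current code): <utterance> <spoof score> <bonafide score>
new_txt = (
    "LA_E_1000147.flac -2.9200141429901123 -0.05544184893369675\n"
    "LA_E_1000273.flac -0.17481780052185059 -1.8301467895507812\n"
)
new_df = pd.read_csv(io.StringIO(new_txt), sep=" ", header=None,
                     names=["utt", "spoof", "score"])

print(old_df.shape, new_df.shape)  # (2, 3) (2, 3)
```

In both layouts the EER is computed from a single per-utterance score; in the new layout that is the third (bonafide) column, which is why the snippet below names it `score`.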

Then, to calculate the evaluation metrics with Result.ipynb, you may need to change the code as follows:

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
# eval_df (ground-truth labels) and compute_eer are defined earlier in Result.ipynb

# new format: <utt> <spoof score> <bonafide score>
pred_df = pd.read_csv("docs/your_new_result.txt", sep=" ", header=None)
pred_df.columns = ["utt", "spoof", "score"]

# drop the file extension so utterance ids match eval_df
pred_df['utt'] = pred_df['utt'].apply(lambda x: x.split('.')[0])
pred_df.head()

# merge eval_df and pred_df on utt
res_df = pd.merge(eval_df, pred_df, on='utt')

# compute EER on the bonafide-score column
spoof_scores = res_df[res_df['label'] == 'spoof']['score']
bonafide_scores = res_df[res_df['label'] == 'bonafide']['score']
eer, threshold = compute_eer(bonafide_scores, spoof_scores)
print("EER: {:.4f}%, threshold: {:.4f}".format(eer * 100, threshold))

# hard decisions at the EER threshold
res_df['pred'] = res_df['score'].apply(lambda x: 'spoof' if x < threshold else 'bonafide')

# confusion matrix
cm = confusion_matrix(res_df["label"], res_df["pred"], labels=["spoof", "bonafide"])
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["spoof", "bonafide"])
disp.plot(cmap='Greens', values_format='g')
plt.title("docs/asvspoof2019_conf-3.txt")
# plt.savefig("figs/ori_assl_cm.png", dpi=300)
plt.show()
```
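For completeness, `compute_eer` is defined elsewhere in Result.ipynb; the standard equal-error-rate computation can be sketched as below. This is a hypothetical stand-in, not the repo's exact implementation, and it assumes higher scores mean "more likely bonafide":

```python
import numpy as np

def compute_eer_sketch(bonafide_scores, spoof_scores):
    """Equal error rate via a threshold sweep over all observed scores.

    Hypothetical stand-in for the repo's compute_eer; assumes higher
    scores mean "more likely bonafide".
    """
    scores = np.concatenate([bonafide_scores, spoof_scores])
    labels = np.concatenate([np.ones(len(bonafide_scores)),
                             np.zeros(len(spoof_scores))])
    order = np.argsort(scores)
    scores, labels = scores[order], labels[order]
    # At threshold scores[i], everything <= scores[i] is rejected as spoof:
    # FRR = bonafide rejected / all bonafide, FAR = spoof accepted / all spoof.
    frr = np.cumsum(labels) / labels.sum()
    far = 1.0 - np.cumsum(1 - labels) / (1 - labels).sum()
    idx = np.argmin(np.abs(frr - far))
    return (frr[idx] + far[idx]) / 2, scores[idx]

eer, thr = compute_eer_sketch(np.array([1.0, 2.0, 3.0]),
                              np.array([-3.0, -2.0, -1.0]))
print(eer, thr)  # perfectly separated scores -> EER 0.0 at threshold -1.0
```

The EER is the operating point where the false-rejection and false-acceptance rates cross, which is why the notebook then reuses the returned threshold to make hard spoof/bonafide decisions.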
hgyuhaoyang commented 2 weeks ago

Thank you for providing the code. I can now output the EER and the threshold.