Closed — chandlerbing65nm closed this issue 6 months ago
Hi Chandler, thank you very much for the questions. For your first question: yes, we did split the WaveFake dataset into train and test sets. For your second question: that is an excellent idea. We are currently working on it and, in line with your suggestion, we are trying to expose the model to a more diverse set of vocoders during training.
I have some questions regarding the evaluation metrics and results presented in Sections 4.4 and 4.5.
Intra-dataset Evaluation (Section 4.4)
The paper reports a very low EER of 0.19% on the WaveFake dataset using the RawNet2 model.
Cross-dataset Evaluation (Section 4.5)
On the other hand, the EER increased significantly to 26.95% when the model trained on LibriSeVoc was tested on WaveFake. This suggests poor generalization to data and vocoders unseen during training.
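For readers unfamiliar with the metric: the Equal Error Rate (EER) is the operating point where the false-acceptance rate equals the false-rejection rate, so a lower EER means better detection. Below is a minimal sketch of how EER can be computed from detection scores; the function name and the convention (higher score = more likely fake, label 1 = fake) are my assumptions for illustration, not code from the paper.

```python
import numpy as np

def compute_eer(scores, labels):
    """Sketch of Equal Error Rate computation (hypothetical helper,
    not the paper's implementation).

    scores: detection scores, higher = more likely fake (assumption)
    labels: 1 for fake trials, 0 for real trials (assumption)
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Sort trials by score, descending, and sweep the decision threshold.
    order = np.argsort(-scores)
    labels = labels[order]
    n_pos = labels.sum()            # number of fake trials
    n_neg = len(labels) - n_pos     # number of real trials
    tp = np.cumsum(labels)          # fakes accepted at each threshold
    fp = np.cumsum(1 - labels)      # reals wrongly accepted (false acceptances)
    far = fp / n_neg                # false-acceptance rate
    frr = (n_pos - tp) / n_pos      # false-rejection rate
    # EER is where the two error curves cross.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0

# Perfectly separated scores give EER = 0.0
print(compute_eer([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # -> 0.0
```

With this convention, an EER of 0.19% on WaveFake means the intra-dataset error curves cross very near zero, while 26.95% in the cross-dataset setting is close to chance-like confusion for many trials, which is why the generalization gap is the interesting result here.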