piotrkawa / deepfake-whisper-features

Implementation of the paper "Improved DeepFake Detection Using Whisper Features"
MIT License
72 stars 4 forks

About Learning Rate and Training Data #15

Open ndisci opened 1 month ago

ndisci commented 1 month ago

Hello,

Thanks for this nice work. I have some questions. First, when I used TensorBoard to monitor the training curves, I noticed that the learning rate didn't change. Why do you use a constant learning rate instead of learning rate decay? Is there any advantage to using a constant learning rate? When I look at your paper, I can't find any explanation of this. I am training the SpecRNet model.

My second question is about the spoof and bonafide data. How much data, or how many hours, of spoof and bonafide speech do you actually use?

Thanks for your time.

piotrkawa commented 1 week ago

Hi, yes - we did not use any LR scheduling technique. In the experiments, we focused on the front-ends and the differences between them. This way, we showed that a simple change of the front-end from algorithmic ones (like MFCC or LFCC) to Whisper features can improve generalization.
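
For anyone reading along, here is a minimal sketch (not the repository's exact code) of what swapping the front-end means in practice, contrasting torchaudio's LFCC transform with a frozen Whisper encoder from the openai-whisper package. The file name and model size are placeholders:

```python
import torch
import torchaudio
import whisper

# Hypothetical input file; any 16 kHz-resampled speech clip works.
waveform, sr = torchaudio.load("sample.wav")
waveform = torchaudio.functional.resample(waveform, sr, 16000)

# Algorithmic front-end: LFCC features.
lfcc = torchaudio.transforms.LFCC(sample_rate=16000, n_lfcc=80)(waveform)

# Learned front-end: features from a frozen Whisper encoder.
model = whisper.load_model("tiny", device="cpu")   # larger variants exist
audio = whisper.pad_or_trim(waveform[0])           # pad/trim to 30 s
mel = whisper.log_mel_spectrogram(audio)           # (80, 3000)
with torch.no_grad():
    feats = model.encoder(mel.unsqueeze(0))        # (1, 1500, d_model)
```

Either feature tensor can then be fed to a downstream classifier such as SpecRNet.
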

The results can be enhanced further by using scheduling techniques, data augmentation (e.g., RawBoost), or a larger dataset (we wanted the training procedure to complete in less than 24 hours, so we used only ~100k samples).
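
As an illustration of the scheduling point, a scheduler can be attached to an existing optimizer in a few lines. This is a generic PyTorch sketch, not the repository's training code; the optimizer, learning rate, and epoch count are assumptions:

```python
import torch

# Stand-in model; in practice this would be e.g. SpecRNet.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed LR
# Cosine decay over an assumed 10 epochs instead of a constant LR.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

for epoch in range(10):
    # ... run the usual training batches with optimizer.step() here ...
    scheduler.step()  # decay the LR once per epoch
    print(epoch, scheduler.get_last_lr())
```

With this in place, the learning rate curve in TensorBoard would decay over training rather than staying flat.
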

To improve the model's results, I would use larger Whisper models and larger (more diverse) datasets.

Best, Piotr

ndisci commented 1 week ago

Thank you so much :) For each class, how many hours of data do you actually use?