Vincent-ZHQ / MRDF

Code for Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection
Apache License 2.0

The performance of the model on the entire dataset #5

Open Liang-bk opened 7 months ago

Liang-bk commented 7 months ago

Hello, I noticed that the code uses only 10% of the whole dataset. Does that mean the reported accuracy is also based on this subset? I also want to test the model on the whole dataset, but loading the checkpoint fails with: Missing key(s) in state_dict: "video_classifier.0.weight", "audio_classifier.0.weight". Does the model structure in the code conflict with the one in the checkpoint?

Vincent-ZHQ commented 7 months ago

Yes, we use only part of the original dataset. FakeAVCeleb is very imbalanced, so we randomly sample a subset of it and evaluate with five-fold cross-validation. For inference, make sure the checkpoint you load matches the model you are testing.
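For readers who want to reproduce this kind of protocol, here is a minimal sketch of random subsampling followed by a five-fold split. It is not the repository's code: the per-class cap, the seeds, the toy labels, and the function names `balanced_subsample` and `k_fold_splits` are all assumptions made for illustration, and the actual sampling ratio used in MRDF may differ.

```python
import random
from collections import defaultdict


def balanced_subsample(labels, per_class, seed=0):
    """Return sorted indices with at most `per_class` random items per class.

    Hypothetical helper: capping every class at the same size is one simple
    way to reduce imbalance; the repository may sample differently.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        idxs = by_class[label]
        chosen.extend(rng.sample(idxs, min(per_class, len(idxs))))
    return sorted(chosen)


def k_fold_splits(indices, k=5, seed=0):
    """Yield (train, test) index lists; each index lands in exactly one test fold."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Toy, made-up label list standing in for an imbalanced dataset.
labels = ["fake"] * 95 + ["real"] * 5
subset = balanced_subsample(labels, per_class=5)
splits = list(k_fold_splits(subset, k=5))
```

Accuracy under such a scheme would be the average over the five test folds of the sampled subset, not a score on the full dataset.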

Liang-bk commented 7 months ago

Thanks for the answer. What I left out of my last question is that I downloaded the checkpoint from the Google Drive link in readme.md, but I can't load it into the network in this code :(

Vincent-ZHQ commented 7 months ago

The provided checkpoints correspond to the CE-based and margin-based methods, respectively. The margin-based method has no audio and video classifiers. I don't know the details of the errors you reported. I remember making sure the code ran before I uploaded it.
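One way to check which checkpoint matches which model is to compare parameter names before loading. The sketch below is a framework-free illustration: `diff_state_dict_keys` is a hypothetical helper, and the key `fusion.0.weight` is made up for the example. With PyTorch, the two key collections would typically come from `model.state_dict().keys()` and from the dictionary returned by `torch.load(...)`; some training frameworks nest the weights under an entry such as `"state_dict"`, which is an assumption to verify for these files.

```python
def diff_state_dict_keys(model_keys, ckpt_keys):
    """Compare parameter names expected by a model with those stored in a checkpoint.

    Returns (missing, unexpected):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names the checkpoint holds but the model does not define
    """
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)
    unexpected = sorted(ckpt_keys - model_keys)
    return missing, unexpected


# Illustrative key names only; "fusion.0.weight" is hypothetical.
model_keys = [
    "fusion.0.weight",
    "video_classifier.0.weight",
    "audio_classifier.0.weight",
]
ckpt_keys = ["fusion.0.weight"]  # e.g. a checkpoint saved without the two heads

missing, unexpected = diff_state_dict_keys(model_keys, ckpt_keys)
for name in missing:
    print("missing from checkpoint:", name)
```

If the missing names are exactly the audio and video classifier heads, the checkpoint was most likely saved from the variant without them, and the model should be built with the matching configuration rather than loading the weights non-strictly.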

Liang-bk commented 7 months ago

Thank you very much! I will try again.