Open yunlong10 opened 1 month ago
Sorry for the late reply.
The DHF1k dataset is only used for pre-training, so we did not focus on its performance: we trained for about 20 epochs and then stopped.
Training then continues on the audio-visual dataset. Remember to load the DHF1k pre-trained weights at that stage; otherwise the reported performance will not be reached.
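For anyone unsure whether the DHF1k weights were actually picked up before the audio-visual stage, a check along these lines may help. This is only a minimal sketch and not the repository's own loading code: the helper name `match_pretrained_weights`, the `module.` prefix handling, and the placeholder path in the comments are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple


def match_pretrained_weights(
    checkpoint_state: Dict[str, Any],
    model_state: Dict[str, Any],
    prefix: str = "module.",
) -> Tuple[Dict[str, Any], List[str], List[str]]:
    """Pick the checkpoint entries that fit the target model (hypothetical helper).

    Returns (loadable, missing, skipped):
      loadable: checkpoint entries whose name and shape match the model
      missing:  model parameters the checkpoint does not provide
      skipped:  checkpoint entries with no matching model parameter
    """
    loadable: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, value in checkpoint_state.items():
        # Checkpoints saved from nn.DataParallel usually carry a "module." prefix.
        key = name[len(prefix):] if prefix and name.startswith(prefix) else name
        target = model_state.get(key)
        same_shape = target is not None and (
            getattr(target, "shape", None) == getattr(value, "shape", None)
        )
        if same_shape:
            loadable[key] = value
        else:
            skipped.append(name)
    missing = [k for k in model_state if k not in loadable]
    return loadable, missing, skipped


# Hypothetical usage with PyTorch (path and model are placeholders, and the
# checkpoint is assumed to be a plain state dict rather than a nested one):
#   ckpt = torch.load("path/to/dhf1k_checkpoint.pth", map_location="cpu")
#   loadable, missing, skipped = match_pretrained_weights(ckpt, model.state_dict())
#   model.load_state_dict(loadable, strict=False)
#   print(len(loadable), "loaded;", len(missing), "missing;", len(skipped), "skipped")
```

If `loadable` comes back nearly empty, the pre-trained weights were most likely not applied. Parameters belonging only to the audio branch would be expected to show up under `missing`, since the DHF1k checkpoint is visual-only.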
Thank you for the excellent work! However, I'm having difficulty reproducing the DHF1k results with diff-sal.
I’ve downloaded the pre-trained checkpoint on DHF1k provided in this repository, but I’m unable to achieve the scores reported in the paper. I’ve tried training from scratch using the provided configurations but still haven't succeeded.
Could you kindly offer some guidance on how to proceed?