danier97 / LDMVFI

[AAAI'2024] "LDMVFI: Video Frame Interpolation with Latent Diffusion Models", Duolikun Danier, Fan Zhang, David Bull
MIT License

Performance cannot be reproduced #22

Open hbp001 opened 3 months ago

hbp001 commented 3 months ago

I trained the model with your code and the config file you provided, keeping every setting the same as in the paper and the GitHub repository, but I can't reproduce the reported performance. When I evaluate the pre-trained model you provided, the performance is reproduced properly, so I don't think the problem is in my data. If possible, could you share the training logs you have? Or, if I'm missing something, any advice would be appreciated.

Thank you

danier97 commented 3 months ago

Hi there,

Thank you for your interest in the work, and sorry to hear that the training didn't go well.

I checked the experiment files I could find, and there might be two things that I did but forgot to update/specify in the paper. First, when training the autoencoder, I seem to have started with a batch size of 4 for 18 epochs, then switched to a batch size of 10 and trained until epoch 38, as opposed to epoch 70 in the paper. Second, I'm not 100% sure, but I might have trained the UNet for longer than 60 epochs (likely around 70). This was probably because a 3090 GPU became available and I switched to it from a lower-end GPU to allow a bigger batch size, then forgot to update the paper. My apologies for the inconvenience caused, and I hope this helps!
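
For anyone retrying the run, the two-phase autoencoder schedule recalled above can be written down as a small lookup. This is only a sketch: `AE_PHASES` and `autoencoder_batch_size` are hypothetical names that are not part of the LDMVFI codebase, and the 0-based epoch indexing is an assumption, since the comment does not say whether epochs are counted from 0 or 1.

```python
from typing import Optional

# (end_epoch_exclusive, batch_size) pairs, taken from the comment above.
# ASSUMPTION: epochs are 0-based, so "18 epochs" means epochs 0..17.
AE_PHASES = [
    (18, 4),   # first 18 epochs at batch size 4
    (38, 10),  # then batch size 10 until epoch 38
]


def autoencoder_batch_size(epoch: int) -> Optional[int]:
    """Return the batch size for a 0-based epoch, or None past the last phase."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    for end_epoch, batch_size in AE_PHASES:
        if epoch < end_epoch:
            return batch_size
    return None  # training is assumed to stop at epoch 38


if __name__ == "__main__":
    for e in (0, 17, 18, 37, 38):
        print(e, autoencoder_batch_size(e))
```

How the batch size is actually switched mid-run (for example by stopping and resuming from a checkpoint with an edited config) is not stated in the thread, so the sketch deliberately leaves that part out.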

Many thanks.