moonbow721 / DPoser

Official implementation of the paper "DPoser: Diffusion Model as Robust 3D Human Pose Prior"
MIT License

Cannot reproduce your paper results #2

Closed hihiok closed 1 month ago

hihiok commented 4 months ago

Hi! Thanks for your excellent work! I have tried to reproduce your code, but my results are around 2% worse than your reported results, even though I used the same GPU and parameters as you did. I know some variation is common, but I would like to ask your advice on how to tune the parameters to match your results more closely. I tuned the lr up and down, but it didn't help. Thank you!

moonbow721 commented 4 months ago

Hi! Thank you for reaching out and for your efforts in replicating the results. It's quite common to see slight variations in performance due to the stochastic nature of training. Based on what you've described, here are a couple of suggestions:

  1. Early Convergence: If you've observed through your TensorBoard logs that the training converges quicker than expected, it might be beneficial to use checkpoints from the middle of your training process rather than the final one. Sometimes, models achieve better performance at these mid-training checkpoints before potentially overfitting or diverging.

  2. Adjust Testing Hyperparameters: Since optimization methods can be quite sensitive to hyperparameters, consider adjusting not just the learning rate but also the number of iteration steps during testing. These tweaks can sometimes lead to considerable improvements as they might better align with the model's training dynamics.
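The two suggestions above amount to a small grid search over checkpoints and test-time hyperparameters. A minimal sketch of that search is below; the function and parameter names (`sweep`, `evaluate`, `lr`, `steps`) are illustrative placeholders, not the actual DPoser scripts or flags, and the toy evaluation stands in for a real metric such as MPJPE on a validation split:

```python
from itertools import product

def sweep(checkpoints, lrs, step_counts, evaluate):
    """Evaluate every (checkpoint, lr, steps) combination and keep the best.

    `evaluate` is assumed to return an error metric where lower is better,
    e.g. MPJPE on a held-out validation set.
    """
    best = None
    for ckpt, lr, steps in product(checkpoints, lrs, step_counts):
        err = evaluate(ckpt, lr=lr, steps=steps)
        if best is None or err < best[0]:
            best = (err, ckpt, lr, steps)
    return best

# Toy stand-in for a real evaluation routine, just to make the sketch runnable:
# error is minimized at lr=0.01 and steps=100, regardless of checkpoint.
def toy_eval(ckpt, lr, steps):
    return abs(lr - 0.01) * 100 + abs(steps - 100) * 0.01

best = sweep(checkpoints=[0, 1],
             lrs=[0.005, 0.01, 0.02],
             step_counts=[50, 100],
             evaluate=toy_eval)
print(best)  # (error, checkpoint, lr, steps) of the best combination
```

In practice you would replace `toy_eval` with a call into your own evaluation pipeline and pass the paths of your mid-training checkpoints.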

I hope these suggestions help. Please feel free to reach out if you have any more questions or need further assistance!

hihiok commented 4 months ago

Hi! Thanks for your prompt reply. However,

  1. I saved the trained model every 5000 epochs and tested all of the checkpoints; none of them achieves better performance.
  2. I have tested the pretrained model you provided, and it produces almost the same (very slightly worse) performance as your paper reports. Doesn't this mean that my test hyperparameters are not the issue?

I would appreciate your advice on any other way to deal with the problem. Thank you!

moonbow721 commented 4 months ago

Thank you for your patience and the detailed feedback!

I recommend comparing your training process with our training logs to ensure consistency in parameters and procedures. These logs outline the parameters used during our training sessions, which should align with those set in our repository, albeit with a greater number of iterations (120k vs 40k).

Regarding testing, it's good to hear the pretrained model aligns closely with our reported results. The slight variation in performance is likely due to the inherent randomness of DPoser's noise sampling process. Such minor differences are common and usually do not significantly affect the model's performance.

If there are any further questions or if you require additional clarification, do not hesitate to contact me.