SalesforceAIResearch / DiffusionDPO

Code for "Diffusion Model Alignment Using Direct Preference Optimization"
https://arxiv.org/abs/2311.12908
Apache License 2.0

The loss becomes almost 0 after 1 epoch of training. #2

Open ChenDRAG opened 4 months ago

ChenDRAG commented 4 months ago

Hi, while reproducing your experiments, I ran into a problem after 1 epoch of training: the loss becomes almost 0.

[attached image]

Could you give any thoughts on why this might happen?

ShristiDasBiswas commented 2 months ago

Were you able to solve this issue?

Mowenyii commented 1 month ago

Can updating the accelerator to the latest version resolve this issue?