SalesforceAIResearch / DiffusionDPO

Code for "Diffusion Model Alignment Using Direct Preference Optimization"
https://arxiv.org/abs/2311.12908
Apache License 2.0

The loss becomes almost 0 after 1 epoch of training. #2

Open ChenDRAG opened 8 months ago

ChenDRAG commented 8 months ago

Hi, when reproducing your experiments, I found that the loss drops to almost 0 after 1 epoch of training.

[image]

Could you give any thoughts on why this might happen?

ShristiDasBiswas commented 6 months ago

Were you able to solve this issue?

Mowenyii commented 6 months ago

Can updating the accelerator to the latest version resolve this issue?