Thank you for your impressive work.
I can't reproduce the results of DPO-SD 1.5.
We train on 8 NVIDIA A100 GPUs with a local batch size of 1 pair per GPU and 256 gradient accumulation steps; all other experimental settings are the same as in the paper.
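For reference, this is how I work out the effective batch size from that setup (the numbers below just restate the configuration above):

```python
# Effective batch size per optimizer step for the setup described above.
num_gpus = 8            # NVIDIA A100 GPUs
pairs_per_gpu = 1       # local batch size (preference pairs per GPU)
grad_accum_steps = 256  # gradient accumulation steps

effective_batch_pairs = num_gpus * pairs_per_gpu * grad_accum_steps
print(effective_batch_pairs)  # -> 2048 preference pairs per optimizer update
```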
Here are some of the results I sampled during training.
There is also something strange about the loss curve during training. The whole training run took about nine hours.
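To make the loss discussion concrete, here is a minimal sketch of the pairwise DPO-style objective as I understand it from the paper; the function and argument names (`dpo_diffusion_loss`, `model_err_w`, etc.) are my own placeholders rather than the repository's code, and I have left out any timestep weighting:

```python
import torch
import torch.nn.functional as F

def dpo_diffusion_loss(model_err_w: torch.Tensor,
                       model_err_l: torch.Tensor,
                       ref_err_w: torch.Tensor,
                       ref_err_l: torch.Tensor,
                       beta: float) -> torch.Tensor:
    """Sketch of a pairwise DPO-style loss on per-sample denoising errors.

    model_err_* / ref_err_* are the per-sample MSEs between predicted and true
    noise for the preferred (w) and dispreferred (l) images; beta should match
    the paper's setting. Timestep weighting is omitted here.
    """
    model_diff = model_err_w - model_err_l   # trained model: preferred vs. dispreferred
    ref_diff = ref_err_w - ref_err_l         # frozen reference model: same comparison
    inside = -beta * (model_diff - ref_diff)
    # At initialization the trained model equals the reference, so inside == 0
    # and the loss should start near -log(sigmoid(0)) = log(2) ≈ 0.693.
    return -F.logsigmoid(inside).mean()
```

Based on this, I would expect the logged loss to start around log(2) ≈ 0.693, which is part of why the curve I see looks strange to me.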
Can you give me some advice?