RockeyCoss SPO issues - Githubissues

RockeyCoss / SPO

Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step

https://arxiv.org/abs/2406.04314

137 stars, 3 forks

issues

About the training of step-aware preference model

#21 casiatao opened 1 week ago
0
About the training speed

#20 casiatao closed 1 week ago
1
About the training ratio win

#19 HongzhengYang opened 2 weeks ago
0
The issue of the model crashing after training for one epoch.

#18 BarretBa opened 3 weeks ago
0
The reproduction results of the released model.

#17 EvaFlower closed 3 weeks ago
0
Good job!! Can you provide the preference model training code?

#16 moclimb opened 1 month ago
1
Question about the validation result

#15 DwanZhang-AI opened 1 month ago
0
Training step-aware scorer

#14 L-Justice1998 closed 1 month ago
1
Questions on Baseline

#13 Mowenyii opened 1 month ago
1
Questions on validation

#12 G-U-N closed 1 month ago
2
Do you have plans to open-source the weights of the D3PO and DDPO models fine-tuned on your 4,000 prompts from Pick-a-Pic V1?

#11 Mowenyii closed 2 months ago
1
How to understand the training loss?

#10 EvaFlower closed 2 months ago
5
Will the evaluation benchmark be released?

#9 Mowenyii closed 2 months ago
1
When will the Step-aware Preference Model training code be released?

#8 fhlt closed 2 months ago
3
Lora or full parameter

#7 jiashenggu closed 2 months ago
1
About training loss

#6 kjzju closed 2 months ago
4
Out of memory using default config

#5 kjzju closed 2 months ago
2
[Question] Why does the SDXL LoRA have no effect in Stable Diffusion WebUI?

#4 tristanwqy closed 2 months ago
2
About reward model dataset or reward model

#3 jiashenggu closed 3 months ago
3
Launching inference_spo_sdxl.py does not finish. "1Torch was not compiled with flash attention."

#2 MNeMoNiCuZ closed 3 months ago
1
About 1.5

#1 ethanfel closed 3 months ago
1