# RockeyCoss / SPO

Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step
https://arxiv.org/abs/2406.04314

137 stars · 3 forks
## Issues
| # | Title | Author | State | Age | Comments |
|---|-------|--------|-------|-----|----------|
| #21 | About the training of the step-aware preference model | casiatao | open | 1 week ago | 0 |
| #20 | About the training speed | casiatao | closed | 1 week ago | 1 |
| #19 | About the training win ratio | HongzhengYang | open | 2 weeks ago | 0 |
| #18 | Model crashes after training for one epoch | BarretBa | open | 3 weeks ago | 0 |
| #17 | Reproduction results of the released model | EvaFlower | closed | 3 weeks ago | 0 |
| #16 | Good job! Can you provide the preference model training code? | moclimb | open | 1 month ago | 1 |
| #15 | Question about the validation result | DwanZhang-AI | open | 1 month ago | 0 |
| #14 | Training the step-aware scorer | L-Justice1998 | closed | 1 month ago | 1 |
| #13 | Questions on baselines | Mowenyii | open | 1 month ago | 1 |
| #12 | Questions on validation | G-U-N | closed | 1 month ago | 2 |
| #11 | Plans to open-source the D3PO and DDPO weights fine-tuned on the 4,000 Pick-a-Pic V1 prompts? | Mowenyii | closed | 2 months ago | 1 |
| #10 | How to understand the training loss? | EvaFlower | closed | 2 months ago | 5 |
| #9 | Will the evaluation benchmark be released? | Mowenyii | closed | 2 months ago | 1 |
| #8 | When will the Step-aware Preference Model training code be released? | fhlt | closed | 2 months ago | 3 |
| #7 | LoRA or full-parameter training? | jiashenggu | closed | 2 months ago | 1 |
| #6 | About the training loss | kjzju | closed | 2 months ago | 4 |
| #5 | Out of memory with the default config | kjzju | closed | 2 months ago | 2 |
| #4 | [Question] Why does the SDXL LoRA have no effect in Stable Diffusion WebUI? | tristanwqy | closed | 2 months ago | 2 |
| #3 | About the reward model or its dataset | jiashenggu | closed | 3 months ago | 3 |
| #2 | Launching inference_spo_sdxl.py does not finish: "1Torch was not compiled with flash attention." | MNeMoNiCuZ | closed | 3 months ago | 1 |
| #1 | About 1.5 | ethanfel | closed | 3 months ago | 1 |