uclaml/SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
https://uclaml.github.io/SPIN/
Apache License 2.0
1.05k stars
92 forks
issues
Question about potential overfitting
#38
kang-0909
opened
1 month ago
0
Hello, I would like to ask, when you are training the model, do you only use the first round of dialogue from the ultrachat_200k?
#37
jackwwy
opened
4 months ago
1
Cannot reproduce the result
#36
Joyyang158
opened
4 months ago
4
numpy version warnings
#35
jiangjun0105
opened
4 months ago
1
OOM with 8 A800
#34
647sherry
opened
4 months ago
8
Difference between training / generation input format.
#33
hbin0701
opened
5 months ago
0
Cannot reproduce generated samples in UCLA-AGI/SPIN_iter0
#32
StarDewXXX
opened
6 months ago
3
Generate Result
#31
lss11005
opened
6 months ago
0
Theoretical Analysis and Idea of SPIN are quite weird (may not make senses)??
#30
AGTSAAA
opened
7 months ago
0
Question about using peft (LoRA)
#29
JasonJiaxiangLi
closed
7 months ago
1
Question about the checkpoint provided in this repo
#28
StarDewXXX
opened
8 months ago
3
support TULU-2 70B SPIN
#27
xujinlai
closed
7 months ago
0
SPIN == DPO in self-iteration?
#26
onebula
opened
8 months ago
6
Confused about iterations
#25
junkangwu
opened
8 months ago
4
Significant Performance Drop in GSM8k Evaluation with Updated SFT ckpt
#24
yinyueqin
opened
8 months ago
3
Have you tried combination of SPIN, SFT, DPO
#23
penolove
opened
8 months ago
0
Thesis discussion: Why can the end-to-end algorithm work properly?
#22
nomadlx
opened
8 months ago
5
GPU Memory question
#21
fangyuan-ksgk
opened
8 months ago
1
What changes should I make to apply the SPIN method on Llama2?
#20
Labmem009
opened
9 months ago
1
Potential reason of the significant improvement on the TruthfulQA and GSM8k
#19
NirViaje
opened
9 months ago
0
use_peft Not working?
#18
srn-source
closed
9 months ago
1
Clarify use of revision in SFT checkpoint
#17
lewtun
closed
9 months ago
0
the four reward metrics
#16
THBUer-yw
closed
7 months ago
2
Generate multiple samples with sampling?
#15
kmn1024
closed
7 months ago
1
Update setup.py
#14
amulil
closed
9 months ago
0
the logps decrease
#13
zhanghaoie
opened
9 months ago
2
Unable to reproduce performance
#12
guozhiyao
opened
9 months ago
10
Question about which datasets are used for each iteration
#11
lewtun
closed
9 months ago
4
The data num is wrong
#10
guozhiyao
closed
9 months ago
0
Some detailed questions regarding SPIN
#9
peterjc123
closed
9 months ago
4
Token indices sequence length is longer than the specified maximum sequence length
#8
yurunsheng1
closed
9 months ago
1
feat: implement vllm generate in MP GPU pool to parallelize the generating
#7
xujinlai
closed
9 months ago
3
vllm version
#6
zhaochenyang20
closed
9 months ago
2
Evaluation results on MT Bench and BBH
#5
ftmtk
closed
9 months ago
1
small bugfixes for vllm
#4
sumo43
closed
9 months ago
0
vllm generation issue.
#3
tdolan21
closed
9 months ago
2
use vllm for generation (10-20x speedup)
#2
sumo43
closed
9 months ago
0
Update trainer.py
#1
eltociear
closed
9 months ago
0