uclaml SPIN issues - Githubissues

uclaml / SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

https://uclaml.github.io/SPIN/

Apache License 2.0

1.05k stars, 92 forks

issues


Question about potential overfitting

#38 kang-0909 opened 1 month ago
0
Hello, I would like to ask, when you are training the model, do you only use the first round of dialogue from the ultrachat_200k?

#37 jackwwy opened 4 months ago
1
Cannot reproduce the result

#36 Joyyang158 opened 4 months ago
4
numpy version warnings

#35 jiangjun0105 opened 4 months ago
1
OOM with 8 A800

#34 647sherry opened 4 months ago
8
Difference between training / generation input format.

#33 hbin0701 opened 5 months ago
0
Cannot reproduce generated samples in UCLA-AGI/SPIN_iter0

#32 StarDewXXX opened 6 months ago
3
Generate Result

#31 lss11005 opened 6 months ago
0
Theoretical Analysis and Idea of SPIN are quite weird (may not make senses)??

#30 AGTSAAA opened 7 months ago
0
Question about using peft (LoRA)

#29 JasonJiaxiangLi closed 7 months ago
1
Question about the checkpoint provided in this repo

#28 StarDewXXX opened 8 months ago
3
support TULU-2 70B SPIN

#27 xujinlai closed 7 months ago
0
SPIN == DPO in self-iteration?

#26 onebula opened 8 months ago
6
Confused about iterations

#25 junkangwu opened 8 months ago
4
Significant Performance Drop in GSM8k Evaluation with Updated SFT ckpt

#24 yinyueqin opened 8 months ago
3
Have you tried combination of SPIN, SFT, DPO

#23 penolove opened 8 months ago
0
Thesis discussion: Why can the end-to-end algorithm work properly?

#22 nomadlx opened 8 months ago
5
GPU Memory question

#21 fangyuan-ksgk opened 8 months ago
1
What changes should I make to apply the SPIN method on Llama2?

#20 Labmem009 opened 9 months ago
1
Potential reason of the significant improvement on the TruthfulQA and GSM8k

#19 NirViaje opened 9 months ago
0
use_peft Not working?

#18 srn-source closed 9 months ago
1
Clarify use of revision in SFT checkpoint

#17 lewtun closed 9 months ago
0
the four reward metrics

#16 THBUer-yw closed 7 months ago
2
Generate multiple samples with sampling?

#15 kmn1024 closed 7 months ago
1
Update setup.py

#14 amulil closed 9 months ago
0
the logps decrease

#13 zhanghaoie opened 9 months ago
2
Unable to reproduce performance

#12 guozhiyao opened 9 months ago
10
Question about which datasets are used for each iteration

#11 lewtun closed 9 months ago
4
The data num is wrong

#10 guozhiyao closed 9 months ago
0
Some detailed questions regarding SPIN

#9 peterjc123 closed 9 months ago
4
Token indices sequence length is longer than the specified maximum sequence length

#8 yurunsheng1 closed 9 months ago
1
feat: implement vllm generate in MP GPU pool to parallelize the generating

#7 xujinlai closed 9 months ago
3
vllm version

#6 zhaochenyang20 closed 9 months ago
2
Evaluation results on MT Bench and BBH

#5 ftmtk closed 9 months ago
1
small bugfixes for vllm

#4 sumo43 closed 9 months ago
0
vllm generation issue.

#3 tdolan21 closed 9 months ago
2
use vllm for generation (10-20x speedup)

#2 sumo43 closed 9 months ago
0
Update trainer.py

#1 eltociear closed 9 months ago
0