jackaduma Vicuna-LoRA-RLHF-PyTorch issues

jackaduma / Vicuna-LoRA-RLHF-PyTorch

A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.

MIT License

208 stars 18 forks

issues

changes and setup

#17 Ekanshjain55 opened 6 months ago
0
Any plans for adding a repo using Stable Vicuna for conversation (human: assistant)?

#16 andysingal opened 1 year ago
0
how to evaluate?

#15 XuanRen4470 opened 1 year ago
0
unable to merge reward adapter into model

#14 XuanRen4470 opened 1 year ago
0
Can we use a format other than alpaca-instruct, such as the alpaca-chat instruct format? If yes, how?

#12 Tejaswi-kashyap-006 opened 1 year ago
0
SFT with large loss {'loss': 388082722196684.8, 'learning_rate': 0.0, 'epoch': 0.02}

#11 LeiShenVictoria opened 1 year ago
0
supervised_finetune.py failed with a workaround

#10 SeekPoint opened 1 year ago
1
python train_reward_model.py failed

#9 SeekPoint opened 1 year ago
0
How do I use a custom dataset when training the reward model?

#8 authurlord opened 1 year ago
0
I can't understand why this line of code is commented out?

#7 apachemycat opened 1 year ago
0
CUDA out of memory

#6 integrum-aiktuck opened 1 year ago
0
Unable to merge reward adapter into model

#5 DavidFarago opened 1 year ago
2
What is the data format to LoRA-fine-tune Vicuna?

#4 DavidFarago opened 1 year ago
0
Have you compared the results with the original Vicuna repository?

#3 magneter opened 1 year ago
1
Running the last step reports this warning; how should I change the hyperparameters?

#2 greatheart1000 opened 1 year ago
3
Does it really work on RTX2080Ti?

#1 GuofaHuang opened 1 year ago
1