How exactly was the 13B model fine-tuned? What parameters and data were used? Or was finetune.sh used directly?

Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese llama+lora approach, with its structure modeled on alpaca.

https://github.com/Facico/Chinese-Vicuna

Apache License 2.0

Closed: jzsbioinfo closed this issue 1 year ago

jzsbioinfo commented 1 year ago

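I fine-tuned the 13B model with LoRA myself, but the results are not as good as the Q&A examples you showed. So I would like to ask what specific configuration you used to fine-tune 13B.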

LZY-the-boys commented 1 year ago

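We used finetune.py directly and fine-tuned on a single 3090 for 200 h (3 epochs). The Q&A quality depends heavily on the generation parameters, so you can tune them yourself, including the beam number.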
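
As a rough illustration of the tuning suggested above, the sketch below builds a small grid of decoding settings and runs the same prompt through each one so the outputs can be compared side by side. It is not the repository's own script. The helper names `generation_grid` and `sweep` are hypothetical, the specific values are arbitrary starting points rather than the authors' settings, and `generate_fn` is assumed to be a wrapper you write around your own model. The keys (`num_beams`, `repetition_penalty`, `max_new_tokens`, `do_sample`) follow the keyword names accepted by Hugging Face `generate`.

```python
from itertools import product


def generation_grid(num_beams=(1, 2, 4), repetition_penalty=(1.0, 1.3),
                    max_new_tokens=256):
    """Build one kwargs dict per combination of decoding settings.

    The default values are illustrative assumptions, not the authors' settings.
    """
    grid = []
    for beams, penalty in product(num_beams, repetition_penalty):
        grid.append({
            "num_beams": beams,             # beam width ("beam number")
            "repetition_penalty": penalty,  # values > 1.0 discourage repeats
            "max_new_tokens": max_new_tokens,
            "do_sample": False,             # deterministic beam/greedy decoding
        })
    return grid


def sweep(generate_fn, prompt, grid):
    """Call generate_fn(prompt, **kwargs) for each setting.

    generate_fn is a hypothetical user-supplied wrapper around the model.
    Returns a list of (kwargs, output) pairs for manual comparison.
    """
    return [(kwargs, generate_fn(prompt, **kwargs)) for kwargs in grid]


if __name__ == "__main__":
    # Stand-in for a real model call, used only so the sketch runs on its own.
    def fake_generate(prompt, **kwargs):
        return f"{prompt} [beams={kwargs['num_beams']}]"

    for kwargs, output in sweep(fake_generate, "Hello", generation_grid()):
        print(kwargs, "->", output)
```

With a real model, `generate_fn` would tokenize the prompt, pass the keyword arguments through to the model's generation call, and decode the result. Comparing outputs across the grid makes it easier to tell whether a quality gap comes from the fine-tuned weights or from the decoding settings.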