Was Chinese-Vicuna-lora-7b-belle-and-guanaco trained on the data in merge.json?

Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese LLaMA + LoRA approach whose structure follows Alpaca.

https://github.com/Facico/Chinese-Vicuna

Apache License 2.0

4.14k stars, 422 forks

Closed greatewei closed 1 year ago

greatewei commented 1 year ago

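I analyzed the data and found that merge.json contains very few SQL-related instruction examples, yet Chinese-Vicuna-lora-7b-belle-and-guanaco is still quite good at writing SQL. What explains this? Also, the data in merge.json is only 400M, which seems rather small, so why does the model still perform well?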
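For anyone who wants to reproduce this kind of count, here is a minimal sketch. It assumes merge.json is a JSON array of Alpaca-style records with `instruction`, `input`, and `output` fields; that layout is an assumption, not something confirmed in this thread. The helper names `count_keyword_records` and `load_records` are hypothetical, and a plain substring match is only a rough proxy for "SQL-related".

```python
import json


def count_keyword_records(records, keyword="sql"):
    """Return (matching, total) for records whose text mentions the keyword.

    Assumes each record is a dict with Alpaca-style fields; missing fields
    are treated as empty strings.
    """
    keyword = keyword.lower()
    matching = 0
    for record in records:
        # Join all three fields so a match in any of them counts once.
        text = " ".join(
            str(record.get(field, "")) for field in ("instruction", "input", "output")
        )
        if keyword in text.lower():
            matching += 1
    return matching, len(records)


def load_records(path):
    """Load a JSON array of records from disk (assumed file layout)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Tiny in-memory sample so the sketch runs without the real file.
    sample = [
        {"instruction": "编写一个SQL查询", "input": "", "output": "SELECT * FROM users;"},
        {"instruction": "把这句话翻译成英文", "input": "你好", "output": "Hello"},
    ]
    matching, total = count_keyword_records(sample)
    print(f"{matching} of {total} records mention the keyword")
```

To run it against the real file, pass the result of `load_records("merge.json")` to `count_keyword_records`, adjusting the path to wherever the dataset was downloaded.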

Facico commented 1 year ago

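That is because LLaMA itself has already been trained on a large amount of data; we are only trying, as far as we can, to align its abilities to Chinese. There is now a lot of Chinese instruction data available, for example BELLE's 2M set (we previously used 0.5M of it). You can try it with a data interface similar to ours.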