Why do LoRA model sizes differ so dramatically?

Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese llama+lora approach, with a structure modeled on alpaca.

https://github.com/Facico/Chinese-Vicuna

Apache License 2.0

4.14k stars, 422 forks

Open lucasjinreal opened 1 year ago

lucasjinreal commented 1 year ago

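An unrelated question:

The LoRA model I saved with this repo's configuration is very small: pytorch_model.bin is only about 17 MB.

But the LoRA I get with the Chinese-llama repo is very large.

The parameter configuration differs as follows: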


```
lora_rank 8
lora_alpha 32
modules_to_save "embed_tokens,lm_head"
```

Is the size difference caused by saving these extra parameters? If so, is this setting actually necessary?

xiaochus commented 1 year ago

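Take a look at the LoRA parameters. This repo fine-tunes only two layer types, "q_proj" and "v_proj", so the saved model is small. The more trainable layers you add, the larger the resulting model.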
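A rough back-of-the-envelope check of the sizes discussed in this thread. This is only a sketch under stated assumptions: it assumes a LLaMA-7B-shaped base model (hidden size 4096, 32 decoder layers, vocabulary of 32000 tokens) and fp32 storage, none of which is stated in the thread. It also relies on the fact that PEFT's `modules_to_save` stores full trainable copies of the listed modules rather than low-rank adapters. The helper name `lora_params` is made up for illustration.

```python
# Assumed LLaMA-7B shapes (not stated in the thread).
HIDDEN = 4096      # hidden size of q_proj / v_proj
N_LAYERS = 32      # number of decoder layers
VOCAB = 32000      # vocabulary size; an extended vocabulary would be larger
BYTES_FP32 = 4     # assumed storage precision


def lora_params(rank: int, in_features: int, out_features: int) -> int:
    """Parameters in one LoRA adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


# Case 1: rank-8 adapters on q_proj and v_proj only, in every layer.
adapter_params = N_LAYERS * 2 * lora_params(8, HIDDEN, HIDDEN)
adapter_mb = adapter_params * BYTES_FP32 / 1e6

# Case 2: extra cost of modules_to_save "embed_tokens,lm_head".
# Both are full (VOCAB x HIDDEN) matrices saved alongside the adapters.
full_copy_params = 2 * VOCAB * HIDDEN
full_copy_mb = full_copy_params * BYTES_FP32 / 1e6

print(f"q_proj+v_proj adapters: {adapter_params:,} params, {adapter_mb:.1f} MB")
print(f"embed_tokens+lm_head:   {full_copy_params:,} params, {full_copy_mb:.1f} MB")
# q_proj+v_proj adapters: 4,194,304 params, 16.8 MB
# embed_tokens+lm_head:   262,144,000 params, 1048.6 MB
```

Under these assumptions, the adapter-only checkpoint comes out at roughly the 17 MB reported above, while saving the embedding and output head adds about 1 GB. That suggests `modules_to_save` accounts for most of the difference. Whether those modules need to be trained depends on the setup, for example whether the vocabulary was changed.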