Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model: a low-resource Chinese llama+lora approach, with a structure modeled on alpaca
https://github.com/Facico/Chinese-Vicuna
Apache License 2.0
4.15k stars 428 forks

Multi-turn dialogue causes OOM #206

Open hongshuo-wang opened 1 year ago

hongshuo-wang commented 1 year ago

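I am running interaction.sh with Chinese-Vicuna-lora-13b-belle-and-guanaco and llama-13b-hf. After several rounds of dialogue, GPU memory is exhausted and the process runs out of memory. Is something wrong with my configuration? GPU: 4090 24G. OS: ubuntu22.04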

Facico commented 1 year ago

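In multi-turn dialogue, the history is concatenated onto the context, so the context grows longer with every turn and GPU memory usage gradually increases. You need to set max_len appropriately.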
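
To illustrate the idea of bounding the context, here is a minimal sketch of dropping the oldest turns so that the concatenated history stays within a token budget. It is not the repository's code: `trim_history`, `count_tokens`, and the `Turn` type are hypothetical names, and the whitespace-based token counter is a stand-in assumption; a real setup would count tokens with the model's own tokenizer.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (user_message, assistant_reply); hypothetical type


def count_tokens(text: str) -> int:
    """Stand-in token counter: one token per whitespace-separated word.

    Assumption for illustration only; in practice, count with the model's
    own tokenizer instead.
    """
    return len(text.split())


def trim_history(
    history: List[Turn],
    new_input: str,
    max_len: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[Turn]:
    """Keep the most recent turns that, together with the new input,
    fit within max_len tokens. The oldest turns are dropped first."""
    budget = max_len - counter(new_input)
    kept: List[Turn] = []
    # Walk backwards from the newest turn so recent context is preferred.
    for user_msg, reply in reversed(history):
        cost = counter(user_msg) + counter(reply)
        if cost > budget:
            break
        kept.append((user_msg, reply))
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = [("a b c", "d e f"), ("g h", "i j"), ("k", "l")]
    # With a budget of 8 tokens and a 2-token input, only the two
    # newest turns fit.
    print(trim_history(history, "m n", max_len=8))
```

Because the prompt length is capped, memory use per generation step stops growing with the number of turns. The trade-off is that the model no longer sees the dropped turns.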