ymcui / Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki
Apache License 2.0

Error on Colab during the final 4-bit quantization: 4096*49954 is not divisible by 256 #706

Closed Ziffer-byakuya closed 1 year ago

Ziffer-byakuya commented 1 year ago

Pre-submission checklist

Issue type

Model quantization and deployment

Base model

Alpaca-Plus-7B

Operating system

None

Detailed description of the issue

The error occurs when running this line of code:

!cd llama.cpp && ./quantize ./zh-models/7B/ggml-model-f16.bin ./zh-models/7B/ggml-model-q4_K.bin q4_K
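
For reference, k-quants such as q4_K group weights into 256-element super-blocks, and this build checks both tensor dimensions against that block size: 4096 divides evenly, but 49954 = 195 × 256 + 34 does not. A possible workaround sketch (my assumption, not something stated in this thread): the classic q4_0 format uses 32-element blocks and only constrains the row length (4096), so it accepts this tensor at some cost in quality:

!cd llama.cpp && ./quantize ./zh-models/7B/ggml-model-f16.bin ./zh-models/7B/ggml-model-q4_0.bin q4_0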

In the Colab code, the only change I made is here:

!python ./Chinese-LLaMA-Alpaca/scripts/merge_llama_with_chinese_lora_low_mem.py \
    --base_model 'elinas/llama-7b-hf-transformers-4.29' \
    --lora_model 'ziqingyang/chinese-llama-plus-lora-7b,ziqingyang/chinese-alpaca-plus-lora-7b' \
    --output_type pth \
    --output_dir alpaca-combined
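
A quick sanity check on the merged output (hypothetical, assuming sentencepiece is installed and the merge step wrote tokenizer.model into alpaca-combined):

import sentencepiece as spm

# Hypothetical check: the merged tokenizer should carry the extended
# Chinese vocabulary, whose size is the 49954 the quantizer complains about.
sp = spm.SentencePieceProcessor(model_file='alpaca-combined/tokenizer.model')
print(sp.vocab_size())        # expected: 49954
print(sp.vocab_size() % 256)  # 34 -> not divisible by 256, so q4_K refuses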

Dependencies (required for code-related issues)

# Paste your dependency information here

Run logs or screenshots

main: build = 775 (d7d2e6a)
main: quantizing './zh-models/7B/ggml-model-f16.bin' to './zh-models/7B/ggml-model-q4_K.bin' as Q4_K
llama.cpp: loading model from ./zh-models/7B/ggml-model-f16.bin
llama.cpp: saving model to ./zh-models/7B/ggml-model-q4_K.bin

[   1/ 291]                tok_embeddings.weight -     4096 x 49954, type =    f16,

========================================================================================
Tensor sizes 4096 x 49954 are not divisible by 256
This is required to be able to use k-quants for now!
========================================================================================

llama_model_quantize: failed to quantize: Unsupported tensor size encountered

main: failed to quantize model from './zh-models/7B/ggml-model-f16.bin'
ymcui commented 1 year ago

https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/llama.cpp量化部署#step-1-克隆和编译llamacpp
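
The linked wiki step boils down to cloning and compiling a current llama.cpp. As a sketch (my assumption: builds newer than 775 no longer abort on this check and instead fall back to a compatible quantization type for the offending tensors):

!cd llama.cpp && git pull && make clean && make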

Ziffer-byakuya commented 1 year ago

Thank you very much! It's solved now.