ggerganov / llama.cpp

LLM inference in C/C++
MIT License

[User] convert.py throws KeyError: 'rms_norm_eps' on persimmon-8b-chat #3344

Closed VincentJGeisler closed 10 months ago

VincentJGeisler commented 11 months ago

python3 llama.cpp/convert.py persimmon-8b-chat --outfile persimmon-8b-chat.gguf --outtype q8_0

# Expected Behavior

Produce a GGUF of persimmon-8b-chat quantized to q8_0.

# Current Behavior

convert.py crashes while reading the model parameters: Params.loadHFTransformerJson raises KeyError: 'rms_norm_eps' because the model's config.json has no such key. The full traceback is under Failure Information below.
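A quick way to confirm the cause is to inspect the model's config.json directly. A minimal sketch, assuming the persimmon-8b-chat checkout from the reproduction steps below:

```python
import json

# Inspect the Hugging Face config that convert.py reads. The path assumes
# the persimmon-8b-chat checkout used in the reproduction steps below.
with open("persimmon-8b-chat/config.json") as f:
    config = json.load(f)

# List every key, then the epsilon-style keys specifically, to confirm that
# 'rms_norm_eps' is absent (Persimmon-style configs generally carry a
# layer-norm epsilon rather than an RMS-norm one).
print(sorted(config.keys()))
print([k for k in config if k.endswith("_eps")])
```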

# Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except under certain specific conditions.

Linux (Ubuntu 22.04)

```
$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Address sizes:       36 bits physical, 48 bits virtual
Byte Order:          Little Endian
CPU(s):              38
On-line CPU(s) list: 0-37
Vendor ID:           GenuineIntel
Model name:          Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
CPU family:          6
Model:               62
Thread(s) per core:  1
Core(s) per socket:  19
Socket(s):           2
Stepping:            4
CPU max MHz:         2800.0000
CPU min MHz:         0.0000
BogoMIPS:            5600.00
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx rdtscp lm pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave osxsave avx f16c rdrand hypervisor lahf_lm fsgsbase tsc_adjust smep erms ibrs ibpb stibp ssbd
```

```
$ uname -a

$ python3 --version
Python 3.10.12

$ make --version
GNU Make 4.3

$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
```
# Failure Information (for bugs)

```
python3 llama.cpp/convert.py persimmon-8b-chat --outfile persimmon-8b-chat.gguf --outtype q8_0
Loading model file persimmon-8b-chat/pytorch_model-00001-of-00002.bin
Loading model file persimmon-8b-chat/pytorch_model-00001-of-00002.bin
Loading model file persimmon-8b-chat/pytorch_model-00002-of-00002.bin
Traceback (most recent call last):
  File "/mnt/c/Users/admin/src/llama.cpp/convert.py", line 1208, in <module>
    main()
  File "/mnt/c/Users/admin/src/llama.cpp/convert.py", line 1157, in main
    params = Params.load(model_plus)
  File "/mnt/c/Users/admin/src/llama.cpp/convert.py", line 288, in load
    params = Params.loadHFTransformerJson(model_plus.model, hf_config_path)
  File "/mnt/c/Users/admin/src/llama.cpp/convert.py", line 208, in loadHFTransformerJson
    f_norm_eps       = config["rms_norm_eps"]
KeyError: 'rms_norm_eps'
```
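The failing line (convert.py line 208 in the traceback) is an unconditional dictionary lookup. A minimal sketch of a more defensive version, assuming the config carries layer_norm_eps instead; note that this would only give a clearer error or a guessed value, since, as the comments below point out, convert.py does not actually support this architecture:

```python
def load_norm_eps(config: dict) -> float:
    # Hypothetical sketch, not an upstream fix: resolve the norm epsilon
    # that convert.py reads at line 208, failing with a clearer message.
    # 'layer_norm_eps' is an assumed fallback key; substituting it does not
    # make an unsupported architecture convert correctly.
    f_norm_eps = config.get("rms_norm_eps", config.get("layer_norm_eps"))
    if f_norm_eps is None:
        raise KeyError(
            "config.json contains neither 'rms_norm_eps' nor 'layer_norm_eps'"
        )
    return f_norm_eps
```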

# Steps to Reproduce

```
git clone http://huggingface.co/adept/persimmon-8b-chat
git clone https://github.com/ggerganov/llama.cpp.git
python3 -m pip install -r llama.cpp/requirements.txt
python3 llama.cpp/convert.py persimmon-8b-chat --outfile persimmon-8b-chat.gguf --outtype q8_0
```
# Failure Logs

Same output as the traceback under Failure Information above.
DavidGOrtega commented 10 months ago

I have the same issue with RedPajama-INCITE 3B

Galunid commented 10 months ago

I believe convert.py is for LLaMA-based models only, so neither Persimmon nor RedPajama is supported. To convert Persimmon, use convert-persimmon-to-gguf.py (you need the .tar version from Adept's GitHub, not the Hugging Face one). For RedPajama, the script is convert-gptneox-hf-to-gguf.py (although GPT-NeoX is not supported in llama.cpp yet).
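Since each architecture needs a different conversion script, the architectures field that most Hugging Face config.json files carry is a quick way to check what a checkout actually contains before picking a script. A small sketch (the directory name is just the checkout from this issue):

```python
import json

# Read the 'architectures' field from a Hugging Face checkout to decide
# which llama.cpp conversion script applies; most HF configs carry it.
with open("persimmon-8b-chat/config.json") as f:
    architectures = json.load(f).get("architectures", [])

# e.g. ['LlamaForCausalLM']   -> convert.py
#      ['GPTNeoXForCausalLM'] -> convert-gptneox-hf-to-gguf.py
print(architectures)
```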