turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License
2.74k stars 215 forks

ValueError: Unrecognized layer: lm_head.q_groups on a new install #313

Closed: Fuckingnameless closed this issue 7 months ago

Fuckingnameless commented 7 months ago

```
Traceback (most recent call last):
  File "/home/github/exllama/test_benchmark_inference.py", line 129, in <module>
    model = timer("Load model", lambda: ExLlama(config))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/github/exllama/test_benchmark_inference.py", line 56, in timer
    ret = func()
          ^^^^^^
  File "/home/github/exllama/test_benchmark_inference.py", line 129, in <lambda>
    model = timer("Load model", lambda: ExLlama(config))
                                        ^^^^^^^^^^^^^^^
  File "/home/github/exllama/model.py", line 765, in __init__
    head_size += math.prod(shape) * _layer_dtype_size(key)
                                    ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/github/exllama/model.py", line 716, in _layer_dtype_size
    raise ValueError("Unrecognized layer: " + key)
ValueError: Unrecognized layer: lm_head.q_groups
```

I installed it this way:

```
git clone https://github.com/turboderp/exllama
cd exllama
pip install -r requirements.txt
```

This is what produces the error:

```
python test_benchmark_inference.py -d /mnt/someexl2modelfolder -p -ppl
```
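For context: the traceback shows ExLlama v1's loader sizing each checkpoint tensor from its key suffix and raising on anything it doesn't recognize, and EXL2 checkpoints carry quantization-metadata keys like `q_groups` that this loader predates. Below is a rough sketch of the failing pattern; only the two lines quoted in the traceback come from the repo, and the suffix table is an illustrative assumption rather than the actual code in model.py:

```python
import math

def _layer_dtype_size(key):
    # Illustrative suffix table (an assumption, not the repo's actual list);
    # ExLlama v1 only knows GPTQ-era tensor keys.
    if key.endswith(".weight") or key.endswith(".bias"):
        return 2  # fp16 tensors
    if key.endswith((".qweight", ".qzeros", ".scales", ".g_idx")):
        return 4  # packed int32 / fp32 GPTQ tensors
    # Verbatim from the traceback: unknown keys (e.g. EXL2's "q_groups") land here.
    raise ValueError("Unrecognized layer: " + key)

# Mirrors model.py's __init__: accumulate the byte size of the lm_head tensors.
tensor_shapes = {"lm_head.q_groups": (64, 2)}  # EXL2-style key, made-up shape
head_size = 0
try:
    for key, shape in tensor_shapes.items():
        head_size += math.prod(shape) * _layer_dtype_size(key)
except ValueError as e:
    print(e)  # -> Unrecognized layer: lm_head.q_groups
```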

turboderp commented 7 months ago

EXL2 models aren't supported in ExLlama. And honestly, there isn't much happening with this project anymore. You'll want ExLlamaV2, where the active development is.
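For anyone who lands here with the same traceback: below is a minimal sketch of loading the same EXL2 model with ExLlamaV2 instead, following that project's published example API. The model path is taken from this issue, and the prompt and settings are placeholders, not something tested here:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Point at the same EXL2 model directory that ExLlama v1 choked on.
config = ExLlamaV2Config()
config.model_dir = "/mnt/someexl2modelfolder"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()  # default sampling parameters

print(generator.generate_simple("Hello, my name is", settings, num_tokens=50))
```

The ExLlamaV2 repo also ships its own test and benchmark scripts that fill the same role as test_benchmark_inference.py here, though the invocation flags differ.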

Fuckingnameless commented 7 months ago

> EXL2 models aren't supported in ExLlama. And honestly, there isn't much happening with this project anymore. You'll want ExLlamaV2, where the active development is.

damn me, too many tabs open xD