Error when trying to quantize Llama3 70B instruct: module 'exllamav2_ext' has no attribute 'sim_anneal'

RodriMora commented 2 months ago

When trying to quantize llama 3 instruct from HF to exl2 I get this error:

-- Input: /home/ubuntu/text-generation-webui/models/meta-llama_Meta-Llama-3-70B-Instruct -- Output: /home/ubuntu/temp/exl2/ -- Using default calibration dataset -- Target bits per weight: 8.0 (decoder), 6 (head) -- Max shard size: 8192 MB -- Full model will be compiled to: /home/ubuntu/text-generation-webui/models/meta-llama_Meta-Llama-3-70B-Instruct_exl2_8.0bpw -- Optimizing... -- Optimizing: 1/ 240 Traceback (most recent call last): File "/home/ubuntu/exllamav2/convert.py", line 248, in optimize(job, savejob, model) File "/home/ubuntu/exllamav2/conversion/optimize.py", line 119, in optimize s, si, p, c, m = ext_c.sim_anneal(slots, ^^^^^^^^^^^^^^^^ AttributeError: module 'exllamav2_ext' has no attribute 'sim_anneal'

Using this command to quantize: python convert.py -i /home/ubuntu/text-generation-webui/models/meta-llama_Meta-Llama-3-70B-Instruct -o /home/ubuntu/temp/exl2/ -cf /home/ubuntu/text-generation-webui/models/meta-llama_Meta-Llama-3-70B-Instruct_exl2_8.0bpw -b 8.0

Using the latest version as per git pull of main branch

turboderp commented 2 months ago

You have an older version of the exllamav2 package installed, along with exllamav2_ext. You can either uninstall the package (pip uninstall exllamav2) and use the JIT version if you have the CUDA toolkit installed, or you can install the most recent prebuilt wheel from here.

RodriMora commented 2 months ago

You have an older version of the exllamav2 package installed, along with exllamav2_ext. You can either uninstall the package (pip uninstall exllamav2) and use the JIT version if you have the CUDA toolkit installed, or you can install the most recent prebuilt wheel from here.

you're correct. Thanks!

turboderp / exllamav2

Error when trying to quantize Llama3 70B instruct: module 'exllamav2_ext' has no attribute 'sim_anneal' #424