turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

`compile_model` failed with "TypeError: unsupported operand type(s) for |=: 'dict' and 'dict'" #294

Closed · gabinguo closed 5 months ago

gabinguo commented 5 months ago

Problem when running the convert.py script for mixtral-instruct.

Script

python3 ./convert.py \
    -i ./shared_space/checkpoints/mixtral-instruct-v0.1/ \
    -o ./shared_space/temp/exl2/ \
    -cf ./shared_space/checkpoints/mixtral-instruct-v0.1-exl2-2.6bpw/ \
    -b 2.6

Output log

 -- Linear: lm_head -> 0.15:8b_128g/0.85:6b_128g s4, 6.34 bpw
 -- Module quantized, calibration perplexity (quant): 7.2424
 -- Compiling output file...
Traceback (most recent call last):
  File "./convert.py", line 257, in <module>
    compile_model(job, save_job, model)
  File ".../exllamav2/conversion/compile.py", line 58, in compile_model
    d = get_f_module(job, module); out_dict |= d; current_size += _dsize(d)
TypeError: unsupported operand type(s) for |=: 'dict' and 'dict'
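
The same TypeError reproduces with a bare dict merge on Python 3.8, so it doesn't look specific to the model (minimal standalone example, nothing from the repo):

    # Python 3.8: dicts do not support the |= (in-place union) operator yet
    out_dict = {"a": 1}
    d = {"b": 2}
    out_dict |= d  # TypeError: unsupported operand type(s) for |=: 'dict' and 'dict'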

Any suggestions?


BTW, are the options I used for converting mixtral correct? I simply used the example from the README.md.

Thanks for your help : )

turboderp commented 5 months ago

The cmdline looks correct for converting mixtral.

The error is because the dict |= operator was only added in Python 3.9, so compile_model just shouldn't be using it. I've pushed a commit that uses the update function instead, which should be 3.8-compatible. If you update, you should be able to resume the conversion job from where it failed by running the exact same command again.
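
For anyone who wants to see the difference in isolation, a minimal sketch of the 3.8-compatible pattern (not the actual commit):

    # dict | and |= were added in Python 3.9 (PEP 584); on 3.8 they raise TypeError
    out_dict = {"a": 1}
    d = {"b": 2}
    # out_dict |= d          # fails on Python 3.8
    out_dict.update(d)       # equivalent merge, works on 3.8 and later
    print(out_dict)          # {'a': 1, 'b': 2}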

gabinguo commented 5 months ago

Thanks for the quick fix, works like a charm! 👍 👍 👍

For people trying to quantize Mixtral: I was using an NVIDIA TITAN RTX (24 GB) and encountered no VRAM problems.

Your work is amazing as always : )