Quantization error (2)
bdambrosio closed this issue 2 months ago.
```sh
cd ../../../exllamav2
export CUDA_VISIBLE_DEVICES=2
python3 convert.py -i ../models/llama3-70B-Instruct -o llama3-70B-Instruct-exl2 \
    -cf llama3-70B-Instruct-exl2 -l 2048 -b 8.0 -hb 8 -ss 8192
```
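For reference, the same invocation with each flag annotated. The descriptions are my reading of exllamav2's `convert.py`; verify against `python3 convert.py --help` for your checkout:

```sh
export CUDA_VISIBLE_DEVICES=2   # restrict the job to GPU index 2

# -i   input directory holding the unquantized model
# -o   working directory for the job (holds job state, resumable)
# -cf  directory to compile the finished quantized model into
# -l   calibration sample length in tokens
# -b   target average bits per weight (8.0 bpw here)
# -hb  number of bits for the output head layer
# -ss  maximum output shard size in MB
python3 convert.py -i ../models/llama3-70B-Instruct \
    -o llama3-70B-Instruct-exl2 -cf llama3-70B-Instruct-exl2 \
    -l 2048 -b 8.0 -hb 8 -ss 8192
```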
This should be fixed in the dev branch. Once I'm done quantizing (and testing) all the 70B versions, I'll release v0.0.19 with the fixes.
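For anyone hitting this before the release, a minimal sketch for picking up the dev-branch fix, assuming a standard git + pip setup and the upstream turboderp/exllamav2 repo:

```sh
# fetch the dev branch of upstream exllamav2 and install it from source
git clone https://github.com/turboderp/exllamav2
cd exllamav2
git checkout dev
pip install .   # or `pip install -e .` for an editable install
```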
Thanks! I should have figured you had already spotted it!