unslothai / unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Quantization error - model.save_pretrained_gguf(new_model, tokenizer, quantization_method = "q4_k_m") #579

Open dynamite9999 opened 1 month ago

dynamite9999 commented 1 month ago

Hello all, I have been saving Llama 3 in GGUF format for weeks and it was working fine. Only today I started getting the error below. I tried everything, including the suggested git clone and make clean / make all with the flags.

Any suggestions or hints to get past this issue would be much appreciated.

```
Traceback (most recent call last):
  File "/home/d/hp/dev/syslog/syslog_scraper/t59_nie_func_data/t13.py", line 1276, in <module>
    main()
  File "/home/d/hp/dev/syslog/syslog_scraper/t59_nie_func_data/t13.py", line 1231, in main
    model.save_pretrained_gguf(TRAINED_GGUF_MODEL, tokenizer, quantization_method = "q4_k_m")
  File "/home/d/.local/lib/python3.11/site-packages/unsloth/save.py", line 1340, in unsloth_save_pretrained_gguf
    file_location = save_to_gguf(model_type, new_save_directory, quantization_method, first_conversion, makefile)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/d/.local/lib/python3.11/site-packages/unsloth/save.py", line 964, in save_to_gguf
    raise RuntimeError(
RuntimeError: Unsloth: Quantization failed for ./TRAINED_GGUF_MODEL-unsloth.F16.gguf
You might have to compile llama.cpp yourself, then run this again.
You do not need to close this Python program. Run the following commands in a new terminal:
You must run this in the same folder as you're saving your model.
git clone --recursive https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && LLAMA_CUDA=1 make all -j
Once that's done, redo the quantization.
```
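For context, the failing call is unsloth's standard GGUF export; here is a minimal sketch of what the calling code looks like (the checkpoint name and save directory are placeholders, not the actual script):

```python
from unsloth import FastLanguageModel

# Placeholder checkpoint and settings; the real script fine-tunes Llama 3 first.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "lora_model",   # hypothetical fine-tuned checkpoint directory
    max_seq_length = 2048,
    load_in_4bit = True,
)

# save_pretrained_gguf first writes an F16 GGUF, then shells out to llama.cpp's
# quantizer; the RuntimeError above is raised when that quantization step fails.
model.save_pretrained_gguf("TRAINED_GGUF_MODEL", tokenizer, quantization_method = "q4_k_m")
```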

danielhanchen commented 1 month ago

@dynamite9999 Can you try reinstalling and updating Unsloth?

```
pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git
```
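After reinstalling, a quick way to confirm the new build is the one Python actually picks up (standard library only):

```python
# Prints the installed unsloth version; it should reflect the fresh git build.
from importlib.metadata import version
print(version("unsloth"))
```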

dynamite9999 commented 1 month ago

Yes! Thank you, it worked. Appreciate it.

danielhanchen commented 1 month ago

Great!

devzzzero commented 1 month ago

I think I just got bitten by this as well. Is there a way to get unsloth to use a preinstalled version of llama.cpp? It's dying on the llama.cpp compile, after a clean uninstall/reinstall of unsloth via the directions above.

I got this error:

```
p  -march=native -mtune=native -Wno-array-bounds -Wno-format-truncation -Wextra-semi -I. -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_OPENMP -DGGML_USE_LLAMAFILE  -c common/grammar-parser.cpp -o grammar-parser.o
c++: error: unrecognized command line option ‘-Wextra-semi’; did you mean ‘-Wextra’?
c++: error: unrecognized command line option ‘-Wextra-semi’; did you mean ‘-Wextra’?
c++: error: unrecognized command line option ‘-Wextra-semi’; did you mean ‘-Wextra’?
make: *** [Makefile:766: llama.o] Error 1
make: *** Waiting for unfinished jobs....
make: *** [Makefile:769: common.o] Error 1
make: *** [Makefile:778: grammar-parser.o] Error 1
c++: error: unrecognized command line option ‘-Wextra-semi’; did you mean ‘-Wextra’?
make: *** [Makefile:772: sampling.o] Error 1
make: Leaving directory '/home/ai/LLM/PEFT/llama.cpp'
```
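For what it's worth, I believe -Wextra-semi only landed in GCC 8, so a failure like this usually means make is invoking an older c++ than expected, even when a newer gcc-13 package is installed alongside it. A small sketch to check what c++ actually resolves to:

```python
import shutil
import subprocess

# Show which compiler binary `make` will pick up and its version; on some
# distros `c++` still points at an older default GCC even when a newer
# gcc-13 package is installed next to it.
print(shutil.which("c++"))
subprocess.run(["c++", "--version"], check = True)
```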

danielhanchen commented 1 month ago

Oh weird - what's your GCC version? There is a way to install a specific version of llama.cpp.
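If pinning helps in the meantime, here is a rough sketch of building a specific llama.cpp release (the tag below is a placeholder, not a verified known-good version):

```python
import subprocess

# Clone llama.cpp, check out a pinned release tag, then build.
# "b3078" is a placeholder tag; substitute whichever release you want to pin.
for cmd in (
    ["git", "clone", "--recursive", "https://github.com/ggerganov/llama.cpp"],
    ["git", "-C", "llama.cpp", "checkout", "b3078"],
    ["make", "-C", "llama.cpp", "clean"],
    ["make", "-C", "llama.cpp", "-j", "all"],
):
    subprocess.run(cmd, check = True)
```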

devzzzero commented 1 month ago

> Oh weird - what's your GCC version? There is a way to install a specific version of llama.cpp.

gcc-13 (openSUSE Leap 15.5).

I'm not too fussed about this, actually.

I think having 27 different copies of llama.cpp across my 18 different conda envs is just wasteful!

The workaround is to manually convert the merged model to GGUF (using a locally installed llama.cpp), which works for me, for now :-)
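For anyone else who lands here, a rough sketch of that manual route, assuming unsloth's save_pretrained_merged plus a local llama.cpp checkout (script and binary names vary by llama.cpp version):

```python
import subprocess

# 1) Fold the LoRA weights in and save the merged model in HF format.
#    "merged_model" is a placeholder output directory.
model.save_pretrained_merged("merged_model", tokenizer, save_method = "merged_16bit")

# 2) Convert to an F16 GGUF with the local llama.cpp checkout, then quantize.
#    Names below match mid-2024 llama.cpp; newer checkouts ship
#    convert_hf_to_gguf.py and a llama-quantize binary instead.
subprocess.run(
    ["python", "llama.cpp/convert-hf-to-gguf.py", "merged_model",
     "--outfile", "model-f16.gguf", "--outtype", "f16"],
    check = True,
)
subprocess.run(
    ["llama.cpp/quantize", "model-f16.gguf", "model-q4_k_m.gguf", "q4_k_m"],
    check = True,
)
```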

Thank you! LOL!