unslothai / unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Quantization error in `model.save_pretrained_gguf("model", tokenizer,)` #661

Open handsomechang114514 opened 2 weeks ago

handsomechang114514 commented 2 weeks ago

Error output:

```
RuntimeError                              Traceback (most recent call last)
<cell line: 1>()
----> 1 model.save_pretrained_gguf("model", tokenizer,)

1 frames
/usr/local/lib/python3.10/dist-packages/unsloth/save.py in unsloth_save_pretrained_gguf(self, save_directory, tokenizer, quantization_method, first_conversion, push_to_hub, token, private, is_main_process, state_dict, save_function, max_shard_size, safe_serialization, variant, save_peft_format, tags, temporary_location, maximum_memory_usage)
   1497
   1498     # Save to GGUF
-> 1499     all_file_locations = save_to_gguf(model_type, model_dtype, is_sentencepiece_model,
   1500         new_save_directory, quantization_method, first_conversion, makefile,
   1501     )

/usr/local/lib/python3.10/dist-packages/unsloth/save.py in save_to_gguf(model_type, model_dtype, is_sentencepiece, model_directory, quantization_method, first_conversion, _run_installer)
   1106         )
   1107     else:
-> 1108         raise RuntimeError(
   1109             "Unsloth: Quantization failed! You might have to compile llama.cpp yourself, then run this again.\n"\
   1110             "You do not need to close this Python program. Run the following commands in a new terminal:\n"\

RuntimeError: Unsloth: Quantization failed! You might have to compile llama.cpp yourself, then run this again.
You do not need to close this Python program. Run the following commands in a new terminal:
You must run this in the same folder as you're saving your model.
git clone --recursive https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && make all -j
Once that's done, redo the quantization.
```

Introduction:

I did my work on Colab. I first tried running `!pip uninstall unsloth -y` followed by `!pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git`, but it still didn't work.

Could anybody show me how to fix this problem? Thanks very much for the help.

danielhanchen commented 2 weeks ago

So Colab isn't working?

Another approach is to manually convert to GGUF: https://github.com/unslothai/unsloth/wiki#manually-saving-to-gguf
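For reference, the manual route described on that wiki page roughly amounts to saving the merged 16-bit model from Python and then running llama.cpp's conversion and quantization tools yourself. A hedged sketch, assuming a recent llama.cpp checkout (the exact script and binary names have changed across llama.cpp versions, and the file names below are placeholders):

```shell
# Sketch only -- adjust names to your llama.cpp version.
# 1) In Python, first save the merged 16-bit weights, e.g.:
#    model.save_pretrained_merged("model", tokenizer, save_method="merged_16bit")

# 2) Convert the saved HF-format checkpoint to an F16 GGUF file:
python llama.cpp/convert-hf-to-gguf.py model --outfile model-f16.gguf --outtype f16

# 3) Quantize it (the binary is ./quantize in older builds,
#    ./llama-quantize in newer ones):
./llama.cpp/quantize model-f16.gguf model-Q4_K_M.gguf q4_k_m
```

This sidesteps Unsloth's automatic llama.cpp build, which is the step failing in the traceback above.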

handsomechang114514 commented 2 weeks ago

It worked, but the quantization step failed (after it produced an F16 .gguf model), with the same error I posted above. I'm not sure how to fix it.

Another question: the error says "RuntimeError: Unsloth: Quantization failed! You might have to compile llama.cpp yourself, then run this again." Should I try that first? If yes, how do I do it in Colab? Sorry, I'm not very good at coding since I'm still new.
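For the record, the commands quoted in the error message can be run from a Colab cell; this is an untested sketch of that (in Colab, prefix each line with `!`, or start the cell with the `%%bash` magic):

```shell
# Build llama.cpp manually, as the error message suggests.
# Run this from the same folder you are saving your model into.
git clone --recursive https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && make all -j
```

Once the build finishes, re-run `model.save_pretrained_gguf(...)` in the same session; per the error message, the Python program does not need to be restarted.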

By the way, here's the link to the project: https://colab.research.google.com/drive/1KLonthuOh-XBJ5-fV-lkpqNcJjhUW_TW?usp=sharing