unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Error saving GGUF to Google Drive #841

Open dromeuf opened 4 months ago

dromeuf commented 4 months ago

I'm running Unsloth fine-tuning Python code in a Colab notebook (free tier) and want to save the GGUF directly to my Google Drive, but the save function raises an error. I want to do this because the notebook's local disk capacity is limited. Saving works without problems for merged 16-bit and other formats, but not for GGUF.

  RuntimeError: Unsloth: Quantization failed for .//content/drive/MyDrive/AI/ModelsTensorsWeights/Model_Tokenizer_Unsloth_Phi3mini4kI_merged_q4_k_m_fGGUF/unsloth.BF16.gguf
  You might have to compile llama.cpp yourself, then run this again.
  You do not need to close this Python program. Run the following commands in a new terminal:
  You must run this in the same folder as you're saving your model.
  git clone --recursive https://github.com/ggerganov/llama.cpp
  cd llama.cpp && make clean && make all -j
  Once that's done, redo the quantization.
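
In Colab there is no separate terminal, so the suggested commands would have to go in a notebook cell with a ! prefix (the same commands as in the error message, run from the folder the model is being saved to):

  # Build llama.cpp inside the Colab VM so Unsloth's quantization step can find it.
  !git clone --recursive https://github.com/ggerganov/llama.cpp
  !cd llama.cpp && make clean && make all -j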

Saving merged 16-bit weights directly to Drive works:

  model.save_pretrained_merged("/content/drive/MyDrive/AI/ModelsTensorsWeights/Model_Tokenizer_Unsloth_Llama31_8Bbnb4b_merged_16b_fHF", tokenizer, save_method = "merged_16bit",)

but for GGUF I have to save to the notebook's local disk first and then move the file to Google Drive with !mv:

  model.save_pretrained_gguf("./Model_Tokenizer_Unsloth_Llama31_8Bbnb4b_merged_f16_fGGUF", tokenizer, quantization_method = "f16")
  !mv -v ./Model_Tokenizer_Unsloth_Llama31_8Bbnb4b_merged_f16_fGGUF /content/drive/MyDrive/AI/ModelsTensorsWeights/
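
This assumes Google Drive is already mounted in the notebook. If not, a cell like the following, using the standard google.colab helper, mounts it first:

  # Mount Drive so /content/drive/MyDrive points at Drive storage.
  from google.colab import drive
  drive.mount('/content/drive')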

Do you think this problem can be solved directly in Unsloth?

Thanks for your great work!

danielhanchen commented 4 months ago

Temporarily it's best to manually save to GGUF, sorry! See https://github.com/unslothai/unsloth/wiki#manually-saving-to-gguf
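
In short, the wiki's manual route is: save the merged 16-bit model, build llama.cpp, then convert and quantize yourself. A rough sketch of that flow (script and binary names differ across llama.cpp versions; convert_hf_to_gguf.py and llama-quantize below match recent builds, and the merged_model / output filenames are placeholders):

  # 1. Save the merged 16-bit model to local disk (placeholder path).
  model.save_pretrained_merged("merged_model", tokenizer, save_method = "merged_16bit")

  # 2. Build llama.cpp, as in the error message above.
  !git clone --recursive https://github.com/ggerganov/llama.cpp
  !cd llama.cpp && make clean && make all -j

  # 3. Convert the merged Hugging Face folder to an f16 GGUF.
  !python llama.cpp/convert_hf_to_gguf.py merged_model --outfile model-f16.gguf --outtype f16

  # 4. Quantize the f16 GGUF down to Q4_K_M.
  !./llama.cpp/llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M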