Open a442509097 opened 1 month ago
Oh, you need to use `model.push_to_hub` and not `model.save_pretrained`.
I downloaded it from Kaggle and manually uploaded it to Hugging Face. The problem I am currently facing is that RAM overflows on `model = AutoModel.from_pretrained(model_name)`. Perhaps I could also manually upload files to overwrite the files in Colab's `lora_model` folder, but I don't know at which point `model` is Llama 3 only, at which point it is the LoRA adapter only, and at which point it is Llama 3 + LoRA merged.
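For what it's worth, here is a minimal sketch of the three states using PEFT (the repo ids below are placeholders, and the adapter is assumed to have been saved in standard PEFT format):

```python
def load_states(base_id="unsloth/llama-3-8b-bnb-4bit",
                adapter_id="your-name/lora_model"):
    """Illustrate which object is which; repo ids are placeholders."""
    # Imports kept inside the function so the sketch stays self-contained.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # 1) model = Llama 3 only: just the base weights.
    base = AutoModelForCausalLM.from_pretrained(base_id)

    # 2) model = Llama 3 + LoRA attached: adapter weights are still
    #    kept separate from the base weights.
    model = PeftModel.from_pretrained(base, adapter_id)

    # 3) model = Llama 3 + LoRA merged into one set of weights; this is
    #    what you would save or convert to GGUF.
    merged = model.merge_and_unload()
    return merged
```

So `lora_model` on disk only ever holds the small adapter; the base model is fetched separately and only the merged object is the full combined model.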
Even if Colab had enough runtime, the disk would fill up. I think text-generation-webui with a .gguf base model plus the LoRA is the fastest solution at the moment. 😅
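That route can avoid the merge entirely: llama-cpp-python can load a GGUF base model and apply a LoRA adapter at load time. A sketch, with placeholder file paths:

```python
def load_gguf_with_lora(model_path="./model-unsloth.Q8_0.gguf",
                        lora_path="./lora-adapter.gguf"):
    """Load a quantized GGUF base model with a LoRA adapter on top.

    Paths are placeholders; the import lives inside the function so the
    sketch stays self-contained.
    """
    from llama_cpp import Llama

    # llama.cpp applies the adapter when loading, so no merged
    # full-precision checkpoint ever has to fit on disk.
    return Llama(model_path=model_path, lora_path=lora_path)
```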
```
main: quantize time = 221050.45 ms
main:    total time = 221050.45 ms
Unsloth: Conversion completed! Output location: ./model-unsloth.Q8_0.gguf

Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 6.72 out of 12.67 RAM for saving.
100%|██████████| 32/32 [00:48<00:00, 1.50s/it]
Unsloth: Saving tokenizer... Done.
Unsloth: Saving model... This might take 5 minutes for Llama-7b...
Unsloth: Saving a442509097/tempMode/pytorch_model-00001-of-00004.bin...
Unsloth: Saving a442509097/tempMode/pytorch_model-00002-of-00004.bin...

Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 6.36 out of 12.67 RAM for saving.
  0%|          | 0/32 [00:01<?, ?it/s]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/torch/serialization.py in save(obj, f, pickle_module, pickle_protocol, _use_new_zipfile_serialization, _disable_byteorder_record)
    627     with _open_zipfile_writer(f) as opened_zipfile:
--> 628         _save(obj, opened_zipfile, pickle_module, pickle_protocol, _disable_byteorder_record)
    629         return

15 frames

RuntimeError: [enforce fail at inline_container.cc:764] . PytorchStreamWriter failed writing file data/22: file write failed

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
RuntimeError: [enforce fail at inline_container.cc:595] . unexpected pos 704676160 vs 704676048

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
RuntimeError: [enforce fail at inline_container.cc:764] . PytorchStreamWriter failed writing file data/0: file write failed

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/torch/serialization.py in __exit__(self, *args)
    473
    474     def __exit__(self, *args) -> None:
--> 475         self.file_like.write_end_of_file()
    476         if self.file_stream is not None:
    477             self.file_stream.close()

RuntimeError: [enforce fail at inline_container.cc:595] . unexpected pos 576 vs 470
```
Wait, even Colab runs out of disk space? Yes, GGUF + LoRA can work if that helps.
My Colab has very limited runtime, so I used Kaggle to train the LoRA and uploaded it to Hugging Face; then Colab loads the LoRA from Hugging Face.
But when I use it in Colab, it prompts "Should have a `model_type` key in its config.json", so I added `"model_type": "llama"` to config.json. Then it prompts "Your session crashed after using all available RAM." What step did I do wrong?
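The config.json step itself looks right: a plain LoRA adapter repo has no `model_type`, so the base architecture has to be declared before Transformers will load it. A small helper for that patch (the `lora_model/config.json` path is a placeholder); the later RAM crash is a separate problem, most likely the full-precision base model simply not fitting in Colab's memory:

```python
import json

def ensure_model_type(config: dict, model_type: str = "llama") -> dict:
    """Add the `model_type` key Transformers expects, if it is missing."""
    config.setdefault("model_type", model_type)
    return config

def patch_config(path: str = "lora_model/config.json") -> None:
    """Patch an adapter's config.json in place (path is a placeholder)."""
    with open(path) as f:
        config = json.load(f)
    with open(path, "w") as f:
        json.dump(ensure_model_type(config), f, indent=2)
```

Note that `setdefault` leaves an existing `model_type` untouched, so re-running the patch is harmless.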