microsoft / BitNet

Official inference framework for 1-bit LLMs
MIT License

Model conversion from HF to GGUF crashes due to lack of memory #120

Open philtomson opened 22 hours ago

philtomson commented 22 hours ago

This was for the 8B-parameter model in the instructions - the conversion quickly ran through the 32 GB of RAM on my PC. Is there somewhere that pre-converted GGUF versions of these models are hosted, so this conversion wouldn't need to be done locally?

kth8 commented 13 hours ago

https://huggingface.co/brunopio/Llama3-8B-1.58-100B-tokens-GGUF
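That repo can be pulled down directly instead of running the local conversion. A minimal sketch using `huggingface_hub` (the exact `.gguf` filename below is a placeholder - check the repo's file list for the quantization variant you want):

```python
# Sketch: fetch a pre-converted 1.58-bit GGUF instead of converting from HF locally.
# Requires: pip install huggingface_hub
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="brunopio/Llama3-8B-1.58-100B-tokens-GGUF",
    filename="Llama3-8B-1.58-100B-tokens.gguf",  # placeholder -- use the actual filename from the repo
    local_dir="models/Llama3-8B-1.58-100B-tokens",
)
print(model_path)  # path to pass to the inference script as the model file
```

The downloaded path can then be given to BitNet's inference script as its model-path argument, skipping the memory-heavy HF-to-GGUF conversion step entirely.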