-
### Feature Idea
I have tested GGUF quantization on SD3 and Flux, and the results are great: smaller memory footprint and faster inference. I hope this can be supported.
### Existing Solutions
https://github.…
-
Can the nodes be changed so that it does? The ComfyUI button to unload models seems to work; can that logic be reproduced in these nodes?
-
### System Info
Environment:
OS: Ubuntu 24.04
Python version: 3.11.8
Transformers version: transformers==4.45.2
Torch version: torch==2.3.0
Model: Meta-Llama-3.1-70B-Q2_K-GGUF - https://hugg…
-
### Feature request
I want to add the ability to use GGUF BERT models in transformers.
Currently the library does not support this architecture. When I try to load such a model, I get an error: TypeError: Ar…
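For context, the architecture a GGUF file declares lives in its metadata header, which is what a loader dispatches on. A minimal sketch of pulling a string key such as `general.architecture` out of a synthetic GGUF v3 header, assuming (as converters conventionally do) that it is written as the first key/value pair, might look like this:

```python
import struct

GGUF_TYPE_STRING = 8  # string value type per the GGUF spec's metadata enum

def read_first_string_kv(buf: bytes):
    """Parse a GGUF v3 header and return its first metadata key/value,
    assuming the value is a string (true for 'general.architecture',
    which converters conventionally write first)."""
    # Header layout: magic "GGUF", u32 version, u64 tensor_count,
    # u64 metadata_kv_count, all little-endian.
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", buf, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    off = struct.calcsize("<4sIQQ")
    # Each key is a u64 length followed by UTF-8 bytes.
    (klen,) = struct.unpack_from("<Q", buf, off); off += 8
    key = buf[off:off + klen].decode("utf-8"); off += klen
    # Value: u32 type tag, then (for strings) u64 length + bytes.
    (vtype,) = struct.unpack_from("<I", buf, off); off += 4
    if vtype != GGUF_TYPE_STRING:
        raise ValueError("first metadata value is not a string")
    (vlen,) = struct.unpack_from("<Q", buf, off); off += 8
    value = buf[off:off + vlen].decode("utf-8")
    return key, value

# Build a synthetic header carrying general.architecture = "bert".
key, val = b"general.architecture", b"bert"
hdr = struct.pack("<4sIQQ", b"GGUF", 3, 0, 1)
hdr += struct.pack("<Q", len(key)) + key
hdr += struct.pack("<I", GGUF_TYPE_STRING)
hdr += struct.pack("<Q", len(val)) + val
print(read_first_string_kv(hdr))  # ('general.architecture', 'bert')
```

A real BERT GGUF would carry `"bert"` here, which is the value a transformers-side mapping would need to recognize; the synthetic buffer above is purely for illustration.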
-
I hit an issue when saving `unsloth/mistral-7b-instruct-v0.3-bnb-4bit` after training, both in Kaggle and in [gguf-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo).
I have tried converting…
-
Any chance 2-bit models can be used with llama.cpp? It would be great to get Llama 3.1 (8B and 70B) converted to GGUF to try them out locally.
Thanks for the great research work!
-
I'm trying to run a local LLM with this model. Where can I find the GGUF file for this model on Hugging Face?
-
[GGUF](https://huggingface.co/docs/hub/en/gguf) is becoming a preferred means of distributing FLUX fine-tunes.
Transformers recently added general support for GGUF and is slowly adding support …
-
Currently gguf-tools (as used in, e.g., MLX, a popular choice on Apple platforms) cannot open `*.gguf` files on iOS if those files are in the app bundle. The root cause is that it always seems to open…
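If the root cause is indeed that the file is opened with write access (iOS app bundles are mounted read-only), the fix is to open and map the file strictly read-only. A sketch of that pattern, in Python for illustration of the underlying syscalls rather than as a patch to gguf-tools itself:

```python
import mmap
import os
import struct
import tempfile

def open_gguf_readonly(path: str) -> mmap.mmap:
    # Open strictly read-only so files in a read-only location
    # (such as an iOS app bundle) can still be opened and mapped.
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        return mmap.mmap(fd, size, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)  # the mapping holds its own reference to the file

# Demonstrate on a synthetic file carrying just a GGUF v3 header
# (magic "GGUF", u32 version, u64 tensor count, u64 kv count).
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as f:
    f.write(struct.pack("<4sIQQ", b"GGUF", 3, 0, 0))
    path = f.name
mm = open_gguf_readonly(path)
print(mm[:4])  # b'GGUF'
```

In C the equivalent change would be `open(path, O_RDONLY)` plus `mmap(..., PROT_READ, MAP_PRIVATE, ...)` instead of any read-write mode.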
-
It's all in the title:
https://huggingface.co/city96/stable-diffusion-3.5-large-gguf