Open Mihaiii opened 1 month ago
Feature request
Right now transformers.js works with ONNX models. It would be useful to also support GGUF files (see llama.cpp).
Motivation
Wider model support, plus ONNX doesn't quantize below 8-bit, but GGUF does.
Your contribution
I could help with manual testing. Regarding the dev work, I'm unsure.