Open Mihaiii opened 1 month ago
Feature request
Right now transformers.js works with ONNX models. It would be useful to also support GGUF files (see llama.cpp).
Motivation
Wider model support, plus ONNX doesn't quantize below 8-bit, but GGUF does.
Your contribution
I could help with manual testing. Regarding the dev work, I'm unsure.