99991 / pygguf

GGUF parser in Python
MIT License

Attributed in huggingface/transformers #1

Open LysandreJik opened 2 months ago

LysandreJik commented 2 months ago

Hello!

FYI, we've been using your code to support GGUF files within the Python ecosystem by making them loadable in transformers.

We're doing so here; we've credited you in the documentation and I've added you as a co-author: https://github.com/LysandreJik/transformers/pull/2/files

We'll open a PR on the main fork in the coming days, so I wanted to give you the opportunity to take a look beforehand.

Thanks a lot for your work :hugs:

cc @younesbelkada

99991 commented 2 months ago

Very cool! I am glad that you found my code useful!

But I am also a bit worried about potential bugs. I've only tested with TinyLlama so far, so it might break completely for other models. For example, I am not sure about the transposed shapes.

In addition, I am not sure if this is the best way forward for the transformers library. Not having to add extra dependencies is certainly nice, but a NumPy implementation is significantly slower than writing the bit-wrangling code in C, because of all the intermediate copies from one NumPy array to another.
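To illustrate the style of NumPy bit wrangling being discussed, here is a minimal sketch of dequantizing Q8_0, the simplest GGUF quantization format (each block is a float16 scale followed by 32 int8 weights, per the ggml block layout). `dequantize_q8_0` is an illustrative helper, not pygguf's actual API, and each `.copy()`/`.astype()` below is one of the intermediate copies a C implementation would avoid:

```python
import numpy as np

def dequantize_q8_0(data, num_blocks):
    """Dequantize raw Q8_0 tensor data into float32 weights."""
    # Each Q8_0 block stores 34 bytes: a float16 scale "d" followed by
    # 32 int8 quantized weights "q"; the dequantized weight is d * q.
    blocks = np.frombuffer(data, dtype=np.uint8).reshape(num_blocks, 34)
    # Copy the first two bytes of each block so they are contiguous,
    # then reinterpret them as float16 scales of shape (num_blocks, 1).
    scales = blocks[:, :2].copy().view(np.float16).astype(np.float32)
    # Reinterpret the remaining 32 bytes as int8 weights (num_blocks, 32).
    quants = blocks[:, 2:].view(np.int8).astype(np.float32)
    # Broadcasting multiplies each block's weights by its scale.
    return scales * quants
```

Every step allocates a fresh array, which is exactly the overhead the comment above refers to.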

99991 commented 2 months ago

Anyway, it might be nice to have a NumPy implementation to fall back on. For completeness, I have implemented the missing quantization formats Q2_K, Q3_K and Q5_K. I have not implemented the other formats, since they are expected to perform worse than the existing ones.

https://github.com/99991/pygguf/commit/a417edbfc029a1bc270f984a694f9128c5afa8b9
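For readers unfamiliar with the container format: before any tensor data or quantization blocks, a GGUF file begins with a small fixed header that a parser reads first. A minimal sketch, assuming the version-3 layout from the GGUF specification (4-byte magic, uint32 version, then uint64 tensor count and uint64 metadata key/value count, all little-endian); `read_gguf_header` is an illustrative name, not pygguf's actual API:

```python
import struct

def read_gguf_header(f):
    """Read the fixed GGUF header from a binary file object."""
    # GGUF files start with the 4-byte magic "GGUF".
    magic = f.read(4)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file: {magic!r}")
    # Then a uint32 version, a uint64 tensor count and a uint64
    # metadata key/value count follow (little-endian in version 3;
    # version 1 used 32-bit counts instead).
    version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    return version, tensor_count, kv_count
```

The metadata key/value pairs and tensor descriptors follow immediately after this header.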