Closed: philpax closed this 10 months ago.
I think having a migration tool for converting previous formats to GGUF, and then removing support for the other formats, might be the most maintainable solution. It might be too early to call this definitively, but I think it's prudent to assume that the ecosystem will soon converge on GGUF as the preferred format.
I've been messing around with cleaning up the Python scripts in llama.cpp (like the converters and the Python side of GGUF), so if you need to pick someone's brain about GGUF stuff, I might be able to help. I'm not an expert by any means.
Aye, I noticed you contributed the conversion script upstream; I'll definitely reach out if I have any questions about the specifics there.
Implements support for loading and saving GGUF.
TODO:

- [ ] Saving a `Gguf` struct to a file (the on-disk header layout is sketched below).
- [ ] `quantize`. (For extra points, make it multithreaded; see the threading sketch below.)
- [ ] `Metadata` map?
- [ ] Which `llm` metadata values are used for `llama`?
- [ ] Remove `.expect`s.
- [ ] Remove the `architecture` option and load entirely based on the architecture specified in the GGUF (see the dispatch sketch below).
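For reference, here's a minimal std-only sketch of the GGUF header layout that loading and saving both have to agree on, per the spec: little-endian magic `GGUF`, a `u32` version, then `u64` tensor and metadata-KV counts (v1 used `u32` counts). The `GgufHeader` struct and function names are illustrative, not this PR's actual types:

```rust
use std::fs::File;
use std::io::{self, Read};

// Illustrative names only; not the PR's actual `Gguf` implementation.
#[derive(Debug)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

fn read_u32(r: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

fn read_u64(r: &mut impl Read) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

fn read_header(r: &mut impl Read) -> io::Result<GgufHeader> {
    // GGUF files start with the little-endian magic bytes "GGUF".
    let mut magic = [0u8; 4];
    r.read_exact(&mut magic)?;
    if &magic != b"GGUF" {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a GGUF file"));
    }
    let version = read_u32(r)?;
    // v1 used u32 counts; v2 widened them to u64. Only v2+ is handled here.
    if version < 2 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "unsupported GGUF version"));
    }
    Ok(GgufHeader {
        version,
        tensor_count: read_u64(r)?,
        metadata_kv_count: read_u64(r)?,
    })
}

fn main() -> io::Result<()> {
    let mut file = File::open("model.gguf")?;
    println!("{:?}", read_header(&mut file)?);
    Ok(())
}
```

After the header come the metadata key-value pairs and then the tensor infos, so the two counts are enough to drive the rest of the loader.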
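On the multithreading point: tensors quantize independently of each other, so one straightforward shape is to split the tensor list across scoped threads. This is only a sketch of the threading structure; `quantize_tensor` below is a placeholder, not a real Q4/Q8 kernel:

```rust
// Placeholder kernel: a real implementation would emit Q4_0/Q8_0-style blocks.
fn quantize_tensor(data: &[f32]) -> Vec<u8> {
    data.iter()
        .map(|x| (x.clamp(-1.0, 1.0) * 127.0) as i8 as u8)
        .collect()
}

// Quantize tensors in parallel by handing each worker a disjoint chunk of
// (input, output) pairs. Scoped threads let us borrow from the caller.
fn quantize_all(tensors: &[Vec<f32>], n_threads: usize) -> Vec<Vec<u8>> {
    let mut results: Vec<Vec<u8>> = vec![Vec::new(); tensors.len()];
    let chunk = ((tensors.len() + n_threads - 1) / n_threads.max(1)).max(1);
    std::thread::scope(|scope| {
        for (inputs, outputs) in tensors.chunks(chunk).zip(results.chunks_mut(chunk)) {
            scope.spawn(move || {
                for (input, output) in inputs.iter().zip(outputs.iter_mut()) {
                    *output = quantize_tensor(input);
                }
            });
        }
    });
    results
}

fn main() {
    let tensors = vec![vec![0.5_f32; 1024]; 8];
    let quantized = quantize_all(&tensors, 4);
    println!("quantized {} tensors", quantized.len());
}
```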
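And for the last item, a rough sketch of dispatching on the spec-mandated `general.architecture` metadata key instead of a user-supplied option. `Metadata` is simplified to a string map here (real GGUF metadata values are typed), and the architecture set is just an example:

```rust
use std::collections::HashMap;

// Simplified stand-in: real GGUF metadata values are a typed enum, not strings.
type Metadata = HashMap<String, String>;

#[derive(Debug)]
enum Architecture {
    Llama,
    Gpt2,
    Falcon,
}

// GGUF mandates a `general.architecture` string key; use it instead of
// asking the caller which architecture the file contains.
fn architecture_from_metadata(metadata: &Metadata) -> Result<Architecture, String> {
    match metadata.get("general.architecture").map(String::as_str) {
        Some("llama") => Ok(Architecture::Llama),
        Some("gpt2") => Ok(Architecture::Gpt2),
        Some("falcon") => Ok(Architecture::Falcon),
        Some(other) => Err(format!("unsupported architecture: {other}")),
        None => Err("missing general.architecture key".into()),
    }
}

fn main() {
    let mut metadata = Metadata::new();
    metadata.insert("general.architecture".into(), "llama".into());
    println!("{:?}", architecture_from_metadata(&metadata));
}
```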
Open questions:

Closes #365.