monatis closed this 1 year ago
I've marked this as ready for review, although it still needs testing with different models and fixes to memory allocation for the text-only and vision-only model variants.
I'll run tests and polish the code over the coming days. Any testing and feedback would be much appreciated, btw.
I'm preparing the PR for merging later today. Added a shell script for bulk model conversion. Another script will do quantization in bulk. Then I'll upload the models, update the readme, and merge.
The Python bindings will require an update to their automatic model selection, since the smallest model in a repo is now a text-only model, but I'll leave that to another PR.
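To illustrate the selection problem, here is a minimal sketch of how a selector could skip variants that lack a required modality instead of just taking the smallest file. It is not the bindings' actual code: `pick_smallest`, `variant_of`, and `CAPABILITIES` are hypothetical names, the file sizes are made up, and it assumes the `{full|text|vision}-model-{ftype}.gguf` naming proposed in this thread.

```python
from typing import Iterable, Mapping

# Hypothetical capability table: which modalities each model variant supports.
CAPABILITIES = {
    "full": {"text", "vision"},
    "text": {"text"},
    "vision": {"vision"},
}


def variant_of(filename: str) -> str:
    """Extract the variant prefix, assuming `<variant>-model-<ftype>.gguf`."""
    variant, sep, _ = filename.partition("-model-")
    if not sep or variant not in CAPABILITIES:
        raise ValueError(f"unrecognized model file name: {filename!r}")
    return variant


def pick_smallest(
    sizes: Mapping[str, int],
    required: Iterable[str] = ("text", "vision"),
) -> str:
    """Return the smallest file whose variant covers every required modality."""
    needed = set(required)
    candidates = [
        name for name in sizes if needed <= CAPABILITIES[variant_of(name)]
    ]
    if not candidates:
        raise LookupError(f"no model supports {sorted(needed)}")
    return min(candidates, key=sizes.__getitem__)


# Made-up sizes in MB, only to show the behaviour.
sizes = {
    "full-model-q4_0.gguf": 300,
    "text-model-q4_0.gguf": 100,
    "vision-model-q4_0.gguf": 200,
}
chosen_default = pick_smallest(sizes)
chosen_text_only = pick_smallest(sizes, required=["text"])
```

With this approach the default choice stays a full model even when a smaller text-only file exists in the repo, and callers that only need text embeddings can opt into the smaller variant explicitly.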
Also updated the Python bindings to work with the new GGUF format and find pre-converted models uploaded to HF.
I think this is good to merge now.
file names are somewhat lengthy now, with "ggml-model", "ggml-text-model" and "ggml-vision-model". feels like the "ggml" here is redundant (.gguf).
yes, it might be {full|text|vision}-model-{ftype}.gguf
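For concreteness, a small sketch of that naming scheme. The helper `model_filename` is hypothetical, and the `ftype` strings used below (`f16`, `q4_0`) are only assumed examples of what the placeholder might hold.

```python
# Variants taken from the proposed pattern {full|text|vision}-model-{ftype}.gguf
VARIANTS = ("full", "text", "vision")


def model_filename(variant: str, ftype: str) -> str:
    """Build a file name following the proposed pattern."""
    if variant not in VARIANTS:
        raise ValueError(f"variant must be one of {VARIANTS}, got {variant!r}")
    if not ftype:
        raise ValueError("ftype must be a non-empty string")
    return f"{variant}-model-{ftype}.gguf"


# Assumed ftype values, for illustration only.
names = [model_filename(v, "q4_0") for v in VARIANTS]
```

Here `names` holds one file name per variant, e.g. `text-model-q4_0.gguf`, with no "ggml" prefix since the `.gguf` extension already identifies the format.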
closes #60, #49, #32
This is still WIP and not ready for anything yet. I hope to finish it by Sunday.