Worth noting that in your documentation, you mention GGML files with llama.cpp.
GGML is no longer supported in llama.cpp. However, you can load the exact same model as GGUF and it will work just fine.
I feel this needs to be clarified in your documentation so it stays up to date. If you follow your docs as written and download the GGML version of the LLM, it won't work.
Just modify that section of the README to say "load GGUF file", and note that the extension of GGUF files is .gguf, not .bin.
A simple fix, but helpful to new users.
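For reference, a minimal sketch of what the corrected docs would describe (the model path and file name here are placeholders, not anything from your docs):

```shell
# Hypothetical GGUF model path; substitute whichever quantisation you downloaded.
MODEL=models/llama-2-7b.Q4_K_M.gguf

# llama.cpp's CLI loads GGUF files directly (the binary is called `main` in
# older builds, `llama-cli` in newer ones); old GGML .bin files are rejected.
./main -m "$MODEL" -p "Hello"
```

For anyone stuck with an existing GGML file, llama.cpp also ships a one-shot converter (`convert-llama-ggml-to-gguf.py` in the repo), so re-downloading isn't strictly required.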