melodyliu1986 closed this pull request 1 month ago
@rhatdan @MichaelClifford
I made some mistakes in the previous PR #741, so I closed it. Is there any way to delete #741?
I made changes according to your comments; please review them again.
LGTM
Great! Thanks for making those changes, @melodyliu1986. Will re-review now.
I wanted to use the mistralai/Mistral-7B-Instruct-v0.2 model and found there are no GGUF files for it on Hugging Face, so I decided to use the ./convert_models tooling to convert the model. In doing so I found a few issues:
So I added an optional HF_TOKEN= parameter in the code. If a user wants to download a public model, no token is needed; if they want to download a private model, they need to supply their Hugging Face token (see the sketch after the impacted-file list below).
Impacted files: README.md, download_huggingface.py, run.sh
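To make the intent clearer, here is a minimal sketch of how the optional token could be threaded through download_huggingface.py. The argument names and the local_dir layout are assumptions for illustration, not the exact diff; only the pattern (token optional, defaulting to the HF_TOKEN environment variable, and omitted entirely for public models) reflects the change:

```python
import argparse
import os

from huggingface_hub import snapshot_download

parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model", required=True,
                    help="Hugging Face model id, e.g. mistralai/Mistral-7B-Instruct-v0.2")
parser.add_argument("-t", "--token", default=os.getenv("HF_TOKEN"),
                    help="Optional Hugging Face token; only needed for private or gated models")
args = parser.parse_args()

# Public models: token stays None and no credentials are required.
# Private models: the user's HF_TOKEN (or --token) is forwarded to the Hub.
snapshot_download(
    repo_id=args.model,
    local_dir=f"converted_models/{args.model}",
    token=args.token or None,
)
```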
If you go to https://github.com/ggerganov/llama.cpp.git, you can see that convert.py has been deprecated and moved to examples/convert_legacy_llama.py. I am not sure whether I should just keep the line "python llama.cpp/convert-hf-to-gguf.py /opt/app-root/src/converter/converted_models/$hf_model_url"; for now I replaced convert.py with the correct path, and did the same for llama.cpp/quantize (a rough sketch of the resulting run.sh steps is below).
Impacted file: run.sh
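For reference, a rough sketch of what the updated run.sh steps could look like, assuming the conversion output paths, the --outfile/--outtype options, and the llama-quantize binary location shown here (the exact path to the quantize binary depends on how llama.cpp was built):

```bash
#!/bin/bash
hf_model_url=${HF_MODEL_URL}
model_dir=/opt/app-root/src/converter/converted_models/$hf_model_url

# convert.py was deprecated upstream; convert-hf-to-gguf.py is used instead.
python llama.cpp/convert-hf-to-gguf.py "$model_dir" \
    --outfile "$model_dir/model-f16.gguf" --outtype f16

# Newer llama.cpp builds name the binary llama-quantize rather than quantize.
./llama.cpp/llama-quantize \
    "$model_dir/model-f16.gguf" \
    "$model_dir/model-Q4_K_M.gguf" Q4_K_M
```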
So I added "converter" in the "podman run" command.
Here is my testing after the modification:
Here is the web UI testing with a public model (no token needed):