The llama.cpp server backend currently uses the /chat/completion endpoint, which prevents selecting an appropriate prompt-formatting template. This can lead to lower-than-expected scores when the standard ChatGPT template is applied to a model trained with a different format. This pull request makes both the llama.cpp server API and a specific prompt template selectable from the config file.
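As a rough sketch of what such a config selection might look like (the key names and values below are illustrative assumptions, not the actual schema introduced by this PR):

```yaml
# Hypothetical config sketch -- key names are illustrative only.
model:
  api: llama.cpp            # select the llama.cpp server API instead of the default endpoint
  prompt_template: alpaca   # apply a model-appropriate template instead of the ChatGPT one
```

With a selection like this, prompts would be formatted with the chosen template before being sent to the llama.cpp server, rather than relying on the server-side chat template.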