I'm running an OpenAI-compatible llama-cpp-python server with llama-2-13b-chat.Q8_0.gguf. The model has been set as the default via `llm models default` and registered with llm through `extra-openai-models.yaml`; it is exposed as a REST API at localhost:8080/v1. This is the standard setup for llm plus llama-cpp-python (server: https://llama-cpp-python.readthedocs.io/en/latest/#web-server, llm as client: https://llm.datasette.io/en/stable/other-models.html#openai-compatible-models).
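For reference, a minimal sketch of the `extra-openai-models.yaml` entry I'm using. The key names follow the llm docs linked above; the `model_id`, `model_name`, and port are just my local setup, nothing clipea-specific:

```yaml
# extra-openai-models.yaml in llm's configuration directory
- model_id: llama-2-13b-chat            # id used with `llm models default`
  model_name: llama-2-13b-chat.Q8_0     # model name the llama-cpp-python server serves
  api_base: "http://localhost:8080/v1"  # OpenAI-compatible endpoint of the local server
```

With that in place, `llm models default llama-2-13b-chat` points everything that goes through llm, clipea included, at the local server.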
Clipea works pretty well with it out of the box, but the `Execute [y/N]` step somehow gets skipped: once the LLM output is ready, I'm dropped straight back to the shell prompt.
Fixing this may require underlying llm improvements in how it handles the combination of a local model, a custom prompt template for it, and the way the model is registered through `extra-openai-models.yaml`. The current YAML format does not support defining prompt templates; there may be another way to do it, but I haven't found it yet.
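For context, the prompt wrapper that llama-2-chat models are trained on looks roughly like this:

```
<s>[INST] <<SYS>>
{system prompt}
<</SYS>>

{user message} [/INST]
```

My (unverified) assumption is that if this wrapping isn't applied somewhere along the chain, or a different chat format is used, the model's output may not match the strict command-only shape clipea expects, which would explain the skipped confirmation step.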
Pending that, having transparent access to the prompts that clipea itself sends to the LLM would make it easier to inject the correct template into the flow. I might take a look at the source code at some point.
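As a possible stopgap (an assumption on my part; I haven't verified that clipea's requests go through llm's normal logging), the prompts might already be inspectable from llm's own log database:

```sh
# Assumes clipea routes its requests through llm and logging is enabled
llm logs -n 1    # show the most recent prompt/response pair
llm logs path    # print the location of the logs.db SQLite database
```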
This is not so critical that it would be urgent, IMO. Perhaps transparent visibility into the internal prompts could be part of your planned Python rewrite? A config file would be fine for this, no need for CLI options to set them.