I'm running an OpenAI-compatible llama-cpp-python server with llama-2-13b-chat.Q8_0.gguf. The model has been set as the default via `llm models default` and registered with llm through `extra-openai-models.yaml`; it is exposed as a REST API at localhost:8080/v1. This is the standard setup for llm plus llama-cpp-python (server: https://llama-cpp-python.readthedocs.io/en/latest/#web-server, llm as client: https://llm.datasette.io/en/stable/other-models.html#openai-compatible-models).
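For reference, a minimal sketch of the `extra-openai-models.yaml` entry I'm using. The key names follow the llm docs linked above; the `model_id`, `model_name`, and port are just my local setup, nothing clipea-specific:

```yaml
# extra-openai-models.yaml in llm's configuration directory
- model_id: llama-2-13b-chat            # id used with `llm models default`
  model_name: llama-2-13b-chat.Q8_0     # model name the llama-cpp-python server serves
  api_base: "http://localhost:8080/v1"  # OpenAI-compatible endpoint of the local server
```

With that in place, `llm models default llama-2-13b-chat` points everything that goes through llm, clipea included, at the local server.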
Clipea works pretty well with it out of the box, but the `Execute [y/N]` step somehow gets skipped: once the LLM output is ready, I'm dropped straight back to the shell prompt.
Fixing this may require underlying llm improvements in how it handles the combination of a local model, a custom prompt template for it, and the way the model is registered through `extra-openai-models.yaml`. The current YAML format does not support defining prompt templates; there may be another way to do it, but I haven't found it yet.
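For context, the prompt wrapper that llama-2-chat models are trained on looks roughly like this:

```
<s>[INST] <<SYS>>
{system prompt}
<</SYS>>

{user message} [/INST]
```

My (unverified) assumption is that if this wrapping isn't applied somewhere along the chain, or a different chat format is used, the model's output may not match the strict command-only shape clipea expects, which would explain the skipped confirmation step.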
Pending that, having transparent access to the prompts that clipea itself sends to the LLM would make it easier to inject the correct template into the flow. I might take a look at the source code at some point.
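As a possible stopgap (an assumption on my part; I haven't verified that clipea's requests go through llm's normal logging), the prompts might already be inspectable from llm's own log database:

```sh
# Assumes clipea routes its requests through llm and logging is enabled
llm logs -n 1    # show the most recent prompt/response pair
llm logs path    # print the location of the logs.db SQLite database
```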
This is not so critical that it would be urgent, IMO. Perhaps transparent visibility into the internal prompts could be part of your planned Python rewrite? A config file would be fine for this, no need for CLI options to set them.