jupyterlab / jupyter-ai

A generative AI extension for JupyterLab
https://jupyter-ai.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Prompt template not honored for LlamaCpp model provider #491

Open sundaraa-deshaw opened 10 months ago

sundaraa-deshaw commented 10 months ago

## Description

I am using the Jupyter AI extension with a custom model provider, following the steps in https://jupyter-ai.readthedocs.io/en/latest/users/index.html#custom-model-providers

However, the custom prompt template is not being used.

## Reproduce

  1. Declare a custom model provider, e.g.:

```python
from jupyter_ai_magics import BaseProvider
from langchain.llms import LlamaCpp  # langchain_community.llms.LlamaCpp on newer LangChain versions
from langchain.prompts import PromptTemplate


class LLMProvider(BaseProvider, LlamaCpp):
    id = "llama_provider"
    name = "llama provider"
    model_id_key = "llama_provider"
    models = [
        "codellama-7B-Instruct",
    ]

    def __init__(self, **kwargs):
        super().__init__(
            model_id="desco_llm_provider",
            model_path="/path/to/model",
            temperature=0.2,
            max_tokens=128,
            top_p=0.0,
            verbose=True,  # Verbose is required to pass to the callback manager
            n_ctx=1024,
            top_k=1,
            n_gpu_layers=100,
            streaming=False,
        )

    def get_prompt_template(self, format) -> PromptTemplate:
        print(f"format: {format}")
        if format == "code":
            return PromptTemplate.from_template(
                "{prompt}\n\nProduce output as source code only, "
                "with no text or explanation before or after it. "
                "Produce the output in Markdown format"
            )
        return super().get_prompt_template(format)
```
  2. Install the provider package as per the instructions (see the registration sketch after this list)
  3. Restart JupyterLab and connect to the above model
  4. Give a prompt input, e.g. "write code to transpose a numpy matrix"
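
For step 2, a minimal packaging sketch, assuming the provider class above lives in a hypothetical module `llama_provider.py` shipped as a hypothetical package `llama-provider-pkg`; Jupyter AI discovers custom providers through the `jupyter_ai.model_providers` entry point group described in the docs:

```python
# setup.py - packaging sketch; the package and module names are placeholders
from setuptools import setup

setup(
    name="llama-provider-pkg",
    version="0.1.0",
    py_modules=["llama_provider"],
    entry_points={
        # jupyter-ai looks up custom providers in this entry point group
        "jupyter_ai.model_providers": [
            "llama-provider = llama_provider:LLMProvider",
        ],
    },
)
```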

Generated prompt:

```
> Entering new ConversationChain chain...
Prompt after formatting:
You are Jupyternaut, a conversational assistant living in JupyterLab to help users.
You are not a language model, but rather an application built on a foundation model from llama provider called llm_provider.
You are talkative and you provide lots of specific details from the foundation model's context.
You may use Markdown to format your response.
Code blocks must be formatted in Markdown.
Math should be rendered with inline TeX markup, surrounded by $.
If you do not know the answer to a question, answer truthfully by responding that you do not know.
The following is a friendly conversation between you and a human.

Current conversation:
Human: generate code to plot line chart from a dict with keys "date" and "price"
AI:  I can help you with this! Here's the code to plot a line chart from a dictionary with keys "date" and "price":
python
import pandas as pd

# create a dataframe from the dictionary
df = pd.DataFrame(data)

# convert date column to datetime format
df['date'] = pd.to_datetime(df['date']),

# plot line chart using seaborn library
sns.lineplot(x='date', y='price', data=df))

Human: write code to transpose a numpy matrix AI:
```


## Expected behavior

Expected the prompt to be suffixed with:

"Produce output as source code only, with no text or explanation before or after it. Produce the output in Markdown format"



## Context


- Operating System and version: Fedora Linux 8
- Browser and version: Chrome build 119.0.6045.160
- JupyterLab version: 4.0.9
welcome[bot] commented 10 months ago

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members to contribute more effectively. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

sundaraa-deshaw commented 10 months ago

This behavior is observed when using the Jupyternaut chat. The prompt template is honored when using the %%ai magic in a cell. However, I find it inconsistent that the prompt template is not used from the chat.
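
For reference, an illustrative magic invocation where the template is applied, assuming the `provider-id:model-id` syntax from the docs and the IDs declared in the snippet above; the `-f`/`--format` option selects which prompt template format is requested:

```
%%ai llama_provider:codellama-7B-Instruct -f code
write code to transpose a numpy matrix
```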

krassowski commented 10 months ago

Yes, as of now the prompt templates apply to the magic only. There are two issues tracking customisation of prompts for chat (although it is not obvious from the titles):

I think ultimately per-model prompts for inline completions (https://github.com/jupyterlab/jupyter-ai/pull/465) would come in handy too; these will need separate prompts for code and text/markdown generation.
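
For illustration only (this is not the actual jupyter-ai source, and `run_magic_style` below is a hypothetical helper): the magic path formats the user's input through the provider's template before invoking the model, whereas the chat currently builds its own fixed Jupyternaut prompt, which is why only the magic honors `get_prompt_template`:

```python
from langchain.prompts import PromptTemplate


def run_magic_style(provider, user_input: str, fmt: str = "code") -> str:
    # The magic consults the provider's template for the requested format...
    template: PromptTemplate = provider.get_prompt_template(fmt)
    prompt = template.format(prompt=user_input)  # custom suffix is appended here
    # ...and only then sends the fully formatted prompt to the model.
    return provider.predict(prompt)
```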

triinity2221 commented 7 months ago

Hi @sundaraa-deshaw, I am also trying to set up a connection between jupyter-ai and a local LLM. In my case, LLaMA 2 sits on a local GPU server and jupyter-ai is set up on a different development server. Could you let me know if you have tried building a similar setup and have any leads?

sundaraa-deshaw commented 7 months ago

> Hi @sundaraa-deshaw, I am also trying to set up a connection between jupyter-ai and a local LLM. In my case, LLaMA 2 sits on a local GPU server and jupyter-ai is set up on a different development server. Could you let me know if you have tried building a similar setup and have any leads?

Hi, I did something similar previously: run the llama engine on a GPU and expose inference/chat completion as a server (you get this for free with llama.cpp's server), then implement a local LLM provider that hits the completion endpoint. This worked for me as a PoC.
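
A minimal sketch of that shape of provider, assuming a llama.cpp server reachable at a placeholder URL and its `/completion` endpoint; the class names, module layout, and server address here are hypothetical, not part of jupyter-ai:

```python
from typing import Any, List, Optional

import requests
from jupyter_ai_magics import BaseProvider
from langchain.llms.base import LLM

LLAMA_SERVER_URL = "http://gpu-server:8080/completion"  # placeholder address


class RemoteLlamaLLM(LLM):
    """LangChain LLM that forwards prompts to a remote llama.cpp server."""

    @property
    def _llm_type(self) -> str:
        return "remote_llama"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Send the prompt to the llama.cpp server and return the generated text.
        resp = requests.post(
            LLAMA_SERVER_URL,
            json={"prompt": prompt, "n_predict": 256, "stop": stop or []},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["content"]


class RemoteLlamaProvider(BaseProvider, RemoteLlamaLLM):
    id = "remote_llama"
    name = "Remote Llama"
    model_id_key = "model"
    models = ["codellama-7B-Instruct"]
```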