jupyterlab / jupyter-ai

A generative AI extension for JupyterLab
https://jupyter-ai.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Problem with System Prompt not being applied when using ollama provider #954

Open ujongnoh opened 2 months ago

ujongnoh commented 2 months ago

Hello!! After deploying an Ollama Pod on Kubernetes, a problem occurs when jupyter_ai sends a prompt to the Pod. “You must answer in the Korean language” is written in the Ollama Pod’s system prompt.

When using OpenAI Provider, the answer and log are as shown below.

time=2024-08-13T08:14:58.290Z level=DEBUG source=routes.go:1334 msg="chat handler" prompt="<|start_header_id|>system<|end_header_id|>어떠한 경우에도 한국어로만 자세하게 대답해줘\n\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n안녕<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" images=0
time=2024-08-13T08:14:58.290Z level=DEBUG source=server.go:705 msg="setting token limit to 10x num_ctx" num_ctx=16384 num_predict=163840
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=2 tid="139685717643264" timestamp=1723536898
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=3 tid="139685717643264" timestamp=1723536898

When using Ollama Provider, the answer and log are as shown below.

time=2024-08-14T02:24:57.300Z level=DEBUG source=routes.go:177 msg="generate handler" system=""
time=2024-08-14T02:24:57.300Z level=DEBUG source=routes.go:208 msg="generate handler" prompt="<|start_header_id|>system<|end_header_id|>어떠한 경우에도 한국어로만 자세하게 대답해줘\n\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nYou are Jupyternaut, a conversational assistant living in JupyterLab to help users.\nYou are not a language model, but rather an application built on a foundation model from Ollama called gpt-4.\nYou are talkative and you provide lots of specific details from the foundation model's context.\nYou may use Markdown to format your response.\nCode blocks must be formatted in Markdown.\nMath should be rendered with inline TeX markup, surrounded by $.\nIf you do not know the answer to a question, answer truthfully by responding that you do not know.\nThe following is a friendly conversation between you and a human.\n\nCurrent conversation:\n[]\nHuman: 안녕\nAI:<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

I think the reason each provider's answer is different is the part written in CHAT_SYSTEM_PROMPT. I'm curious how to solve this problem.
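For reference, below is a minimal sketch (not jupyter-ai's code; host and model are placeholders) contrasting the two Ollama endpoints that appear in the logs above: /api/chat carries the system prompt as its own role-tagged message, while /api/generate only has a flat prompt plus an optional system field, which is why the generate handler can log system="".

```python
# Minimal sketch of Ollama's two HTTP endpoints; host and model are placeholders.
import requests

OLLAMA = "http://localhost:11434"

# /api/chat: the system prompt travels as a dedicated "system"-role message,
# which is what the "chat handler" log line above reflects.
chat = requests.post(f"{OLLAMA}/api/chat", json={
    "model": "llama3:8b",
    "stream": False,
    "messages": [
        {"role": "system", "content": "어떠한 경우에도 한국어로만 자세하게 대답해줘"},  # "answer only in Korean"
        {"role": "user", "content": "안녕"},  # "hello"
    ],
})
print(chat.json()["message"]["content"])

# /api/generate: there is only a flat "prompt" string plus an optional "system"
# field. If the client leaves "system" unset, the server logs system="" and any
# instructions must be embedded in the prompt text itself, which matches the
# "generate handler" log line above.
gen = requests.post(f"{OLLAMA}/api/generate", json={
    "model": "llama3:8b",
    "stream": False,
    "prompt": "안녕",
})
print(gen.json()["response"])
```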

Thank you!!

srdas commented 2 months ago

Thanks for raising this issue. Could you clarify:

  1. Is Ollama working for you outside Kubernetes, i.e., running locally on your laptop/desktop?
  2. Is the issue that you are not getting responses in the Korean language?

More details about what you are blocked on and what you have tried so far would help determine whether this is a jupyter-ai, Ollama, or Kubernetes issue.

ujongnoh commented 2 months ago

Thank you for your kind reply!!

  1. No. The Ollama Pod and Jupyter Notebook are deployed in the same Kubernetes cluster environment.
  2. Yes. No matter what language I ask a question in, I want the answer in Korean.

srdas commented 2 months ago

@ujongnoh

  1. Which LLM are you using with each provider?
  2. If you always want an answer in Korean, you will definitely need a multilingual LLM, else it will reply in English most of the time.
  3. Even then, the prompt may need to be more forceful, so try different prompts.
  4. And of course, OpenAI and Ollama are hosting different LLMs, so you will get different responses from them. No matter which you use, the LLM should support Korean.

ujongnoh commented 2 months ago

Hello!!! We use the llama3:8b LLM. In general, a different provider also means a different LLM, but although we use a different provider, we are using the same Ollama-hosted LLM: the OpenAI Provider is used, but its base URL is pointed at the Ollama endpoint.
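As a point of reference, that configuration roughly amounts to the sketch below, which targets Ollama's OpenAI-compatible /v1 endpoint (the host name is a placeholder for our service, and the API key can be any non-empty string):

```python
# Sketch of pointing an OpenAI-style client at Ollama's OpenAI-compatible /v1 API.
from openai import OpenAI

client = OpenAI(base_url="http://ollama-service:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="llama3:8b",
    messages=[{"role": "user", "content": "안녕"}],  # "hello"
)
print(resp.choices[0].message.content)
```

Because this path goes through the chat-completions API, the system message keeps its own role, which matches the behavior described above.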

When using the Ollama Provider with the llama3:8b LLM, the Korean translation does not work. The Korean translation works well when using the OpenAI Provider with the same llama3:8b LLM.

We believe this is due to CHAT_SYSTEM_PROMPT in the jupyter_ai BaseProvider. When chatting through the Ollama Provider, the contents of CHAT_SYSTEM_PROMPT seem to be mixed into the msg of the generate handler. The related logs are as follows.

time=2024-08-20T05:47:32.717Z level=DEBUG source=routes.go:208 msg="generate handler" 
prompt="<|start_header_id|>system<|end_header_id|>\n<|eot_id|><|start_header_id|>user<|end_header_id|> 
You are a Korean translator, and you need to translate the text below into Korean and provide an answer.
and You must answer in Korean. \\n the text is below \\n  You are Jupyternaut, 
a conversational assistant living in JupyterLab to help users.
\nYou are not a language model, but rather an application built on a foundation model 
from Ollama called mmx-ai.\nYou are talkative and you provide lots of specific details 
from the foundation model's context.\nYou may use Markdown to format your response.
\nCode blocks must be formatted in Markdown.\nMath should be rendered with inline TeX markup, 
surrounded by $.\nIf you do not know the answer to a question, answer truthfully by responding 
that you do not know.\nThe following is a friendly conversation between you and a human.
\n\nCurrent conversation:\n[]\nHuman: 안녕\nAI:
 <|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n"

However, with the OpenAI Provider, the CHAT_SYSTEM_PROMPT content is not mixed into the msg. So the system prompt we wrote is applied, and the Korean translation works well.
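One possible direction, offered as an assumption rather than something verified against jupyter-ai's provider code: use a chat-oriented Ollama client so that the system prompt stays in its own role instead of being flattened into the generate prompt. A minimal LangChain sketch (host and model are placeholders):

```python
# Sketch using LangChain's chat wrapper for Ollama, which calls /api/chat with
# role-tagged messages instead of /api/generate with a single prompt string.
from langchain_community.chat_models import ChatOllama
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOllama(base_url="http://ollama-service:11434", model="llama3:8b")
reply = llm.invoke([
    SystemMessage(content="어떠한 경우에도 한국어로만 자세하게 대답해줘"),  # "answer only in Korean"
    HumanMessage(content="안녕"),  # "hello"
])
print(reply.content)
```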