h2oai / h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0

Expert Tab Prompt Control Parameters #1257

Closed syllith closed 11 months ago

syllith commented 11 months ago

Hello, I'm sorry for this seemingly basic question, but for the life of me, I cannot understand the prompt control parameters and what they mean. Is there any documentation that gives a more detailed explanation of the following settings? I can see there's a brief description on the Gradio interface, but that's about it. For example, for Query Prompt, it just says "Added after documents", but what does that mean? How does this differ from the system prompt? I'm thoroughly confused and after an hour of searching, I still feel lost. Could you please help shed some light on this for me?

- System Prompt Type
- System Prompt
- System Pre-Context
- Pre-Conversation Input for instruct prompt types
- Query Pre-Prompt
- Query Prompt
- Summary Pre-Prompt
- Summary Prompt
- HYDE LLM Prompt

tungsten-antidote commented 11 months ago

Hello, I think it's this way:

Different models and different use cases have different prompt structures. In the "Chat" tab there is an "Action" selector with three entries for three simple use cases: "Query", "Summarize", and "Extract". Depending on the use case, the relevant prompt parts (the content of those fields you asked about) are sent along with your "visible" prompt, before and after it.

One "soft" use case within "Query" is ICL (in-context learning), where you give examples as context for your actual request. An extension of ICL (and a focal point of h2oGPT) is RAG (retrieval-augmented generation), where the context for the query is retrieved from relevant chunks of uploaded documents. Those document chunks are referred to as "documents".

A typical prompt for a RAG use case consists of three parts: a "Task" instruction, the retrieved documents, and your actual "Request".

The "Task" part is entered in "Query Pre-Prompt", because it is "Added before documents". The "Request" part is entered in "Query Prompt", because it is "Added after documents". The retrieved document chunks sit between the two.
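Roughly, the assembled prompt then looks like this (just an illustrative sketch with made-up strings; the real assembly in h2oGPT also wraps everything in the model-specific chat template and the system prompt):

# Illustrative sketch only, not h2oGPT internals; it just shows the ordering of the parts.
pre_prompt_query = "Pay attention to the following document excerpts."   # "Query Pre-Prompt": added before documents
retrieved_chunks = ["<chunk 1 from your uploaded files>", "<chunk 2>"]    # the "documents"
prompt_query = "Using only the excerpts above, answer this question:"    # "Query Prompt": added after documents
user_question = "What is the warranty period for model X?"

full_prompt = "\n".join([pre_prompt_query] + retrieved_chunks + [prompt_query, user_question])
print(full_prompt)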

"Summary Pre-Prompt" is meant for the "Summarize" use case option (with RAG). "Summary Prompt" is meant for the "Summarize" use case option (without RAG, without considering your own documents).

The system prompt is a prompt part that is added to all of your prompts (systematically).

HYDE (Hypothetical Document Embeddings) is a different prompting approach for getting better retrieval results, but I don't know how to activate it in h2oGPT: https://medium.com/prompt-engineering/hyde-revolutionising-search-with-hypothetical-document-embeddings-3474df795af8
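The rough idea behind HYDE, with stand-in helpers (conceptual only, not h2oGPT's implementation): instead of embedding your raw question for retrieval, the LLM first writes a hypothetical answer, and that text is what gets embedded and matched against your documents.

# Conceptual HYDE sketch with stand-in helpers; not h2oGPT code.
def llm(prompt: str) -> str:
    return "hypothetical passage about warranties"      # stand-in for a model call

def retrieve(query_text: str, k: int = 4) -> list[str]:
    return ["relevant chunk about warranties"]          # stand-in for vector search

question = "What is the warranty period for model X?"
hypothetical_answer = llm("Write a short passage answering: " + question)  # the HYDE step
chunks = retrieve(hypothetical_answer)   # retrieve with the hypothetical answer, not the raw question
final_answer = llm("\n".join(chunks) + "\n" + question)
print(final_answer)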

pseudotensor commented 11 months ago

Thanks!

I'll update this to include some of the expert options.

https://github.com/h2oai/h2ogpt/blob/main/docs/README_ui.md#expert-tab

The expert options change a lot more than other things do, so I can't promise to document every item or keep the docs fully up to date, but I understand that some of them could use help text and/or longer info strings in the UI itself.

syllith commented 11 months ago

Great stuff, thank you both. I have been doing AI research all day, as I'm new to the field. We basically want to query internal documents at my work and let our technicians ask questions. The reason I was looking into these prompts is that, on occasion, it starts associating things it shouldn't or answers questions in an odd way, and after seeing the number of prompt options, I assumed we weren't giving it enough instruction on how we wanted it to respond.

Of course I can just enter these values in the h2o UI, but without knowing what these options mean, it didn't make much sense to me and I had no confidence in what I was doing. A realization I (think) I had is that an LLM doesn't actually accept anything other than text. So, for example, if I'm using langchain to process the contents of an image, from my understanding it would run the image through some specialized program designed simply to describe the image. Then this description is effectively inserted into the prompt, on top of what I wrote in the standard query input. So no matter what kind of document you're uploading, it's ultimately converted into a useful text format that the LLM can deal with. Is my understanding correct?
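In code terms, my mental model is something like this (purely hypothetical helper names on my part, not h2oGPT's actual pipeline):

# My mental model only; describe_image is made up, not a real h2oGPT function.
def describe_image(path: str) -> str:
    # Some captioning/OCR model turns the image into plain text.
    return "A wiring diagram showing the pump connected to relay K3."

image_text = describe_image("manual_page_12.png")
user_question = "Which relay controls the pump?"
full_prompt = image_text + "\n" + user_question   # the description is just more text in the prompt
print(full_prompt)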

I still need some way to automatically populate these prompts on startup. I'm running h2o in a docker container and I launch it using this command:

sudo docker run -p 7860:7860 --rm --init \
  -v "${HOME}"/.cache:/workspace/.cache \
  -v "${HOME}"/save:/workspace/save \
  -v "${HOME}"/user_path:/workspace/user_path \
  -v "${HOME}"/db_dir_UserData:/workspace/db_dir_UserData \
  -v "${HOME}"/users:/workspace/users \
  -v "${HOME}"/db_nonusers:/workspace/db_nonusers \
  -v "${HOME}"/llamacpp_path:/workspace/llamacpp_path \
  gcr.io/vorvan/h2oai/h2ogpt-runtime:0.1.0 /workspace/generate.py \
    --use_safetensors=True \
    --prompt_type=zephyr \
    --save_dir='/workspace/save/' \
    --user_path /workspace/user_path \
    --system_prompt "$(cat prompt.txt)" \
    --base_model=stabilityai/stablelm-zephyr-3b \
    --max_quality=True \
    --langchain_mode="UserData"

As you can see, I'm setting the system prompt by reading a prompt.txt file. This way I can easily tweak it and just relaunch the docker container. However, the trouble I'm having is setting the other prompts so they're automatically populated as well; I couldn't find any argument that directly sets some of these other prompts. Based on what was said, it sounds like I'm interested in changing Query Pre-Prompt and Query Prompt. Can anyone help me understand how to set these prompts as arguments, so they're already populated when I start the container?

I have another related problem that I'd like to know more about. I have a python program that uses the grclient.py file to send queries. Here's a snippet of it:

import sys

# Adjust this import to wherever grclient.py lives in your setup.
from gradio_utils.grclient import GradioClient

if __name__ == "__main__":
    instruction = sys.argv[1]
    chat_id = sys.argv[2]

    # read_prompt_from_file, get_chat_history, and update_chat_history are
    # helpers defined elsewhere in this program.
    system_prompt = read_prompt_from_file("prompt.txt")
    chat_history = get_chat_history(chat_id)

    kwargs = {
        'system_prompt': system_prompt,
        'langchain_mode': "UserData",
        'document_subset': "Relevant",
        'chat_conversation': chat_history,
    }

    client = GradioClient("http://localhost:7860/")
    response = client.query(instruction, **kwargs)

    update_chat_history(chat_id, instruction, response)
    print(response)

As you can see here, I am defining the system prompt, which gets pulled from the same prompt.txt file that the docker command pulls from, so they use the same prompt. However, if I've already launched the docker container with the correct system prompt, do I still need to supply it as a kwarg? In other words, since I launched the docker container with all the settings I wanted, will grclient.py still obey the values set at launch, or is grclient.py a separate "instance"? If I recall correctly, I had to set document_subset and langchain_mode in order for this program to work the same as the h2o UI, so I would assume the UI settings don't carry over?

As with the docker container arguments, I cannot find any option for setting the Query Pre-Prompt and Query Prompt in the grclient.py file. Am I able to set them as kwargs as well?
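For example, I was hoping something like this would work, continuing from the snippet above (the two extra keys are just my guess at parameter names that the client would forward to the server; I haven't confirmed they exist):

# Continuing the earlier snippet; 'pre_prompt_query' and 'prompt_query' are my guess at the names.
kwargs = {
    'system_prompt': system_prompt,
    'langchain_mode': "UserData",
    'document_subset': "Relevant",
    'chat_conversation': chat_history,
    'pre_prompt_query': "Pay attention to the following excerpts from our internal documents.",
    'prompt_query': "Using only the excerpts above, answer the technician's question.",
}
response = client.query(instruction, **kwargs)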

As a test, I tried to run the docker container with the command below. I didn't see anything update in the Expert tab, but in the Models tab I did see the values in the "Current or Custom Model Prompt" section. Is this the same as setting them in the Expert tab, and will these prompts still apply to my query?

sudo docker run -p 7860:7860 --rm --init \
  -v "${HOME}"/.cache:/workspace/.cache \
  -v "${HOME}"/save:/workspace/save \
  -v "${HOME}"/user_path:/workspace/user_path \
  -v "${HOME}"/db_dir_UserData:/workspace/db_dir_UserData \
  -v "${HOME}"/users:/workspace/users \
  -v "${HOME}"/db_nonusers:/workspace/db_nonusers \
  -v "${HOME}"/llamacpp_path:/workspace/llamacpp_path \
  gcr.io/vorvan/h2oai/h2ogpt-runtime:0.1.0 /workspace/generate.py \
    --use_safetensors=True \
    --prompt_type=custom \
    --prompt_dict="{'promptA': 'value1', 'promptB': 'value2', 'PreInstruct': 'value3', 'PreInput': 'value4', 'PreResponse': 'value5', 'terminate_response': 'value6', 'chat_sep': 'value7', 'chat_turn_sep': 'value8', 'humanstr': 'value9', 'botstr': 'value10'}" \
    --save_dir='/workspace/save/' \
    --user_path /workspace/user_path \
    --system_prompt "$(cat prompt.txt)" \
    --base_model=stabilityai/stablelm-zephyr-3b \
    --max_quality=True \
    --langchain_mode="UserData"

So sorry for all the questions, but these are things I've tried to look up and either found nothing or simply didn't understand what I found. I really appreciate the assistance.

pseudotensor commented 11 months ago

An example of setting all the prompts is here:

https://github.com/h2oai/h2ogpt/blob/3db5ad906a961f5b5e76bacc388f9461ba042711/docs/FAQ.md#non-english-languages

pseudotensor commented 11 months ago

For what the prompt templates look like, see:

https://github.com/h2oai/h2ogpt/blob/f505c8f8d74efbc754dedbfbfcfd16ad5d9fbeef/src/gen.py#L926-L945