oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

Koala: text generation compatibility/support #1228

Closed · alexl83 closed this issue 1 year ago

alexl83 commented 1 year ago

Describe the bug

prompt for Koala https://bair.berkeley.edu/blog/2023/04/03/koala/ needs to be formatted very specifically https://github.com/young-geng/EasyLM/blob/main/docs/koala.md#koala-chatbot-prompts

As can be seen in the serving command above, the Koala chatbot requires a series of prompts to be prepended and appended to the user input in order to generate responses correctly. Hence, to use the Koala weights in other frameworks, you will need to process the prompts accordingly.

The beginning-of-prompt marker BEGINNING OF CONVERSATION: is always prepended to every conversation. For each user input, the user prompt USER: is prepended to the user input, a space is appended to the user input, and then the language model prompt GPT: is appended. This whole string is used as the prompt input to the language model for generating the response. For example, in the first round of conversation, when the user inputs Hello!, the whole prompt for generating the first response is:

BEGINNING OF CONVERSATION: USER: Hello! GPT:

After the language model generates the response, we append the response to the prompt and then append the EOS token </s> to the prompt. Suppose the language model generates the following response: Hi! How can I help you?, and for the next round, the user input is What is the largest animal on earth?. Then the whole prompt for generating the second response is:

BEGINNING OF CONVERSATION: USER: Hello! GPT:Hi! How can I help you?</s>USER: What is the largest animal on earth? GPT:

Note that because the prompt and generated parts are tokenized separately, there is no space between the model prompt GPT: and the generated response.
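The concatenation rules above can be sketched in Python (a minimal illustration of the documented format, not the actual EasyLM implementation; the function and variable names are my own):

```python
def build_koala_prompt(history, user_input):
    """Build the full Koala prompt from prior (user, response) turns.

    Per the EasyLM docs: "BEGINNING OF CONVERSATION: " is prepended once;
    each user turn becomes "USER: <input> GPT:"; each completed model
    response is appended directly after "GPT:" (no space) and terminated
    with the EOS token "</s>".
    """
    prompt = "BEGINNING OF CONVERSATION: "
    for user, response in history:
        prompt += f"USER: {user} GPT:{response}</s>"
    prompt += f"USER: {user_input} GPT:"
    return prompt

# First round:
print(build_koala_prompt([], "Hello!"))
# -> BEGINNING OF CONVERSATION: USER: Hello! GPT:

# Second round, after the model answered the first turn:
print(build_koala_prompt(
    [("Hello!", "Hi! How can I help you?")],
    "What is the largest animal on earth?",
))
# -> BEGINNING OF CONVERSATION: USER: Hello! GPT:Hi! How can I help you?</s>USER: What is the largest animal on earth? GPT:
```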

Is there an existing issue for this?

Reproduction

I created my own Koala.yaml in characters/instruction-following

Koala.yaml

name: "GPT:"
your_name: "USER:\ "
context: "BEGINNING OF CONVERSATION:\ "

Unfortunately, something is off: my guess is that the prompt concatenation explained in https://github.com/young-geng/EasyLM/blob/main/docs/koala.md#koala-chatbot-prompts needs to be implemented.

Koala is tuned for multi-turn dialogue, so let's reproduce such a scenario.

EasyLM inference

USER: Describe ten different open source AI libraries (and the languages they work with) that I can use to generate a summary of the main core contents of any web page, ignoring any ads or embedded materials.

GPT:1.NLTK (Natural Language Toolkit): Python
2.Gensim: Python
3.Latent Dirichlet Allocation (LDA): Python/Java
4.TextBlob: Python
5.Summarizer: Python
6.ANOVA: Python
7.Text-to-Image (T2I): Python/Java
8.Text-to-Speech (TTS): Python/Java
9.Text-to-Speech Synthesis (TSS): Python/Java
10.Text-to-Text Transfer (T2T): Python/Java

USER: Are any of these free?

GPT:Yes, some of these open-source AI libraries are free to use:
1.NLTK (Natural Language Toolkit): Python
2.TextBlob: Python
3.Summarizer: Pytho
4.ANOVA: Python
5.Text-to-Image (T2I): Python/Java
6.Text-to-Speech (TTS): Python/Java
7.Text-to-Speech Synthesis (TSS): Python/Java
8.Text-to-Text Transfer (T2T): Python/Java

Text-generation-webui inference

USER: Describe ten different open source AI libraries (and the languages they work with) that I can use to generate a summary of the main core contents of any web page, ignoring any ads or embedded materials.

GPT:
1. Summarizer - Python3.
2.TextBlob - Python
4.Gensim - Python
5.Latent Dirichlet Allocation (LDA) - Python
6.NLTK - Python
7.Allen Institute for Artificial Intelligence's GPT-2 model - Python
8.Google Cloud Natural Language API - Java/Python/Go
9.IBM Watson Studio - Python
9.OpenNMT - Python
10.Stanford CoreNLP - Python

USER: Are any of these free?

GPT:
#!/bin/bash
for file in "$(find -name "*.txt" | sort -u)"; do
filename=$(basename "$file")
echo "Processing $filename..."
# Perform text processing on the file here...
echo "Done with $filename."
done

This script uses find, basename, and echo commands to iterate over all regular files (.txt extension) that are unique, sorted by name. The output is a list of processed filenames, one per line.

Screenshot

No response

Logs

None. Presumably the prompt concatenation explained in https://github.com/young-geng/EasyLM/blob/main/docs/koala.md#koala-chatbot-prompts needs to be implemented (my guess).

System Info

ubuntu 22.04
textgen git
Ryzen 7700X
NVIDIA 4090
64 GB DDR5 RAM
oobabooga commented 1 year ago

I have added initial support here https://github.com/oobabooga/text-generation-webui/commit/cb95a2432c9506e494e018f2424f04b934bb678c

It seems to work at first glance

[screenshot: koala]

alexl83 commented 1 year ago

Hi @oobabooga, thanks for your support!

It seems to be working in single-turn mode. I suggest adding model_type: 'llama' to config.yaml to avoid having to specify --model_type llama on the command line. I also have a doubt about whitespace: according to the makers of Koala, BEGINNING OF CONVERSATION: and USER: should each include a trailing space, and I don't know whether textgen preserves those if present.
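For reference, the kind of config.yaml entry I have in mind would look roughly like this (a sketch only; the `.*koala.*` regex key and the exact field names are assumptions about how textgen matches model folder names to settings):

```yaml
# Hypothetical config.yaml entry; the ".*koala.*" pattern is an
# assumption about how the model directory is named.
.*koala.*:
  mode: 'instruct'
  model_type: 'llama'
  instruction_template: 'Koala'
```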

About multi-turn: do you think it is feasible to feed the model output back in as part of the next turn's input, to provide context and carry the dialogue over? It's documented in their inference code/docs, and it works decently on their live demo.

TURN 1: BEGINNING OF CONVERSATION: USER: Hello! GPT:Hi! How can I help you?</s>

TURN 2: BEGINNING OF CONVERSATION: USER: Hello! GPT:Hi! How can I help you?</s>USER: What is the largest animal on earth? GPT:

Thank you for your outstanding work!

alexl83 commented 1 year ago

EDIT: it's still not working; something's off. Sending empty strings makes the model produce random answers.

github-actions[bot] commented 1 year ago

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.