kuvaus / LlamaGPTJ-chat

Simple chat program for LLaMa, GPT-J, and MPT models.
MIT License

feat: Better template formatting options. #12

Closed · saul-jb closed this issue 1 year ago

saul-jb commented 1 year ago

It would be nice to have more formatting options like being able to wrap the output in certain tokens to assist with reading from stdout.

The following is an example format, altering the user's input and the model's output:

### Instruction:
The prompt below is a question to answer, write an appropriate response.
### Prompt:
How many words does the sentence '%1' have?
### Response:
<START>%2<FINISH>

This sort of behavior may be possible by asking the model to format it in a specific way, but that is error-prone and may not work on all models.

Furthermore, it would be nice to have more information about the template format: how it is parsed and which parts are important. (E.g., are the # symbols before the headers important? Can they be replaced with other symbols, or removed?)

kuvaus commented 1 year ago

This is a good feature request. Thanks.

I haven't found a way to explain the template formatting in a simple way; even the code that reads the template is a bit messy... But let's try:

I thought about the flexibility a little when making the template function. Like you said, most templates are very rigid, but since it's just "text", the user should be able to make any template they want.

The simplest way I could think of was just that the template does not care how the lines are formatted. Only the %1 (user input) line is important. Now, it turned out that parsing a text file that could be of any length and contain whatever characters is quite tricky :), so I separated it into prefix, header, input, and footer lines like this:

default_prefix
default_prefix
default_prefix
default_header
%1
default_footer
default_footer
default_footer
default_footer

And what it gives to the model as prompt is: (default_prefix + default_header + input + default_footer)

- default_prefix: basically the "instruction" lines at the start. You can put whatever description you like there, and you can have as many lines as you want.
- default_header: the "prompt:" call, or you could give the user a name here. Only 1 line, but of any length.
- %1: the user input line.
- default_footer: the response part, so you can say "response:" for instruction-type AIs, or for example you can name your AI here. You can have as many footer lines as you like.

For example, this template should also work:

This is a conversation between Alice and Bob.
Alice is an all-knowing AI that is always right.
She is especially good at counting words in sentences.
Bob: How many words does the following sentence have?
%1
Alice:

Now the AI responds as Alice would in a conversation between Alice and Bob (user).
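To make the assembly concrete: with that template and a made-up user input of "The quick brown fox jumps over the lazy dog." (just for illustration), the prompt that gets sent to the model would be roughly:

This is a conversation between Alice and Bob.
Alice is an all-knowing AI that is always right.
She is especially good at counting words in sentences.
Bob: How many words does the following sentence have?
The quick brown fox jumps over the lazy dog.
Alice:

and the model then continues the text after "Alice:".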

Some models (I believe those are called instruction-type models) like the words "Instruction:", "Prompt:", and "Response:" and/or the # symbols, because they were trained with them. The template does not impose any restrictions on the words or the use of # symbols, but with models you just have to experiment and see what works.

I hope this helped. :)

kuvaus commented 1 year ago

Wrapping the response is a good idea. :)

I need to think if there is a nice and simple way to do it without changing the response function. Right now it just dumps the response on stdout.

saul-jb commented 1 year ago

> I need to think if there is a nice and simple way to do it without changing the response function. Right now it just dumps the response on stdout.

Perhaps this is common enough to just add it to the CLI parameters.

> The simplest way I could think of was just that the template does not care how the lines are formatted. Only the %1 (user input) line is important.

I thought this might be the case, but with LLMs it is a bit hard to test and know for sure. A couple of follow-ups: if the template is given to the LLM as-is, there should be no issues with reducing the template to just %1, allowing the prompter to specify the template with every prompt? If dynamic templates can't be done this way, could there be a way to load templates in and out, perhaps in a similar fashion to /save & /load?

kuvaus commented 1 year ago

> Perhaps this is common enough to just add it to the CLI parameters.

Yup. Now in v0.2.5 with --b_token and --e_token.
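For example, something like this should wrap the response between your original <START> and <FINISH> tokens (the model file name and the -m flag here are just the usual example invocation, adjust to your setup):

./chat -m your-model.bin --b_token "<START>" --e_token "<FINISH>"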

> The simplest way I could think of was just that the template does not care how the lines are formatted. Only the %1 (user input) line is important.

> I thought this might be the case, but with LLMs it is a bit hard to test and know for sure. A couple of follow-ups: if the template is given to the LLM as-is, there should be no issues with reducing the template to just %1, allowing the prompter to specify the template with every prompt?

Yeah, should be possible. You might need to add some empty newlines in case the loading function happens to give errors:


%1

And then just have a way more elaborate prompt.
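So with only %1 in the template file, you could type the whole format as the prompt itself, for example something like this (just an illustration, the wording is up to you):

### Instruction:
The prompt below is a question to answer, write an appropriate response.
### Prompt:
How many words does the sentence 'Hello world' have?
### Response: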

> If dynamic templates can't be done this way, could there be a way to load templates in and out, perhaps in a similar fashion to /save & /load?

Good point. This is something I could maybe add in a future version.

saul-jb commented 1 year ago

> Yup. Now in v0.2.5 with --b_token and --e_token.

Thanks, that makes things much easier.

> Yeah, should be possible. You might need to add some empty newlines in case the loading function happens to give errors:

Thanks, I was missing the newlines.

> If dynamic templates can't be done this way, could there be a way to load templates in and out, perhaps in a similar fashion to /save & /load?

> Good point. This is something I could maybe add in a future version.

Since empty templates work, this is probably not that important: users can wrap the program and provide their own system for changing templates.
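For example, a thin wrapper could just write out whatever template a session needs and then start the chat with it. This is only a rough sketch: the model path is a placeholder, and I'm assuming a --load_template-style flag for pointing at the template file (whatever the actual flag is in the current version; check --help):

#!/bin/sh
# Write the template wanted for this session (illustrative content).
cat > session_template.txt <<'EOF'
This is a conversation between Alice and Bob.
Bob:
%1
Alice:
EOF
# Start the chat with that template (flag names assumed, not verified).
./chat -m your-model.bin --load_template session_template.txt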

Thanks heaps for all your fast work. This now has all the required functionality to do anything I can think of for now, so I'm closing this issue.