Open 3Simplex opened 2 months ago
The idea here is that the defaults we have for the system prompt and chat template are in large part detrimental. Very few models will work well with these defaults.
Instead of parsing GGUF files when installing a new model and deducing the proper templates from them, perhaps we should just require the user to fill out these templates by hand. If the templates aren't filled out, then the model might be 'installed', but it should not be 'enabled', and so on.
I wrote this for the situation.
https://github.com/nomic-ai/gpt4all/wiki/Configuring-Custom-Models
I believe this title is now concise and can be expected to remain unchanged.
@3Simplex Thanks for the guide. However, the crucial part about finding the prompt is not really detailed.
Example of a "chat template" that might be found:

> {% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}
The pseudo code
{% set loop_messages = messages %}
{% for message in loop_messages %}
{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}
{% if loop.index0 == 0 %}
{% set content = bos_token + content %}
{% endif %}
{{ content }}
{% endfor %}
{% if add_generation_prompt %}
{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{% endif %}
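To sanity-check the breakdown, here is a minimal Python sketch that mirrors the template's loop logic in plain code. The message list and `bos_token` value are made up for illustration, not taken from any real model:

```python
# Plain-Python rendition of the Jinja chat template above.
# bos_token and the messages are illustrative placeholders.
bos_token = "<|begin_of_text|>"
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help?"},
]
add_generation_prompt = True

prompt = ""
for i, message in enumerate(messages):
    # Mirrors: {% set content = '<|start_header_id|>' + ... %}
    content = (
        "<|start_header_id|>" + message["role"] + "<|end_header_id|>\n\n"
        + message["content"].strip() + "<|eot_id|>"
    )
    if i == 0:  # mirrors: {% if loop.index0 == 0 %} — BOS on the first message only
        content = bos_token + content
    prompt += content
if add_generation_prompt:
    # Open the assistant header so the model generates from here.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"

print(prompt)
```

Running this makes it easy to see exactly which literal tokens survive once the Jinja control flow is stripped away.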
The prompt template, as I understand it:
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
%1<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
%2
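As I understand it, GPT4All fills `%1` with the user's message and `%2` with the model's reply. A rough sketch of that substitution, using the template above (this is assumed behavior; the real logic lives inside the app):

```python
# Rough sketch of how the %1/%2 placeholders might be expanded
# (assumed behavior, not GPT4All's actual implementation).
template = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n"
    "%1<|eot_id|>\n"
    "<|start_header_id|>assistant<|end_header_id|>\n"
    "%2"
)
# %1 is the user's message; %2 is where the model's reply goes,
# so it is left empty when prompting for a new generation.
filled = template.replace("%1", "What is 2+2?").replace("%2", "")
print(filled)
```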
Yet it fails miserably with answer-prompt looping.
> @3Simplex Thanks for the guide. However, the crucial part about finding the prompt is not really detailed. ...paraphrasing... Yet it fails miserably with answer prompt looping.
I see you are looking at a Jinja template. Thanks for reminding me; I just added that as an "Advanced Topic", along with how to make sure a model will work if it was not built correctly.
I can attempt to answer both of these questions here before I add them to the Wiki. (It looks like you did well decoding the template.)
Breaking down a Jinja template is fairly straightforward if you follow a few rules.
You must keep the tokens as written in the Jinja and strip out all of the other syntax. Also watch for mistakes here: sometimes model authors fail to ship a functional Jinja template. The Jinja must have tokens for the roles.
Sometimes a role and a beginning tag are combined into one token, like `<|user|>`.
Let's start at the beginning of this Jinja.
> {% set loop_messages = messages %}
> {% for message in loop_messages %}
> {% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}
Most of this has to be removed because it's irrelevant to the LLM unless we get a Jinja parser from some nice contributor.
We keep `<|start_header_id|>`, as it marks the start of the header for the role.

We translate `+ message['role'] +` into the role name to be used in the template. You will have to figure out which role names this model uses, but the common ones are `system`, `user`, and `assistant`. Sometimes the roles will be shown in the Jinja; sometimes they won't.

We keep `<|end_header_id|>`.

We keep `\n\n`, which translates into one new line (press Enter) for each `\n` you see (two in this case).
Now we translate `message['content']` into the variables used by GPT4All: `%1` for user messages and `%2` for assistant replies. `<|eot_id|>` indicates the end of whatever the role was doing. Now we have our "content" from this Jinja block, with all the extra stuff removed:

> {% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}
From what I can tell, GPT4All sends the BOS automatically and waits for the LLM to send the EOS in return. So we can drop this next block, which only prepends `bos_token` to `content` (not to be confused with `message['content']`) on the first pass through the loop and then prints it:

> {% if loop.index0 == 0 %}
> {% set content = bos_token + content %}
> {% endif %}
> {{ content }}
> {% endfor %}
Finally, we get to the part that shows a role defined for the "assistant". The way it is written implies the role above is for either a system or a user role. (Probably both, because it would simply show "user" if it weren't dual-purpose.)
This is left open-ended for the model to generate from this point forward. As we can see from its absence, the LLM is expected to provide an `eos` token when it is done generating. Follow the same rules as we did above.
{% if add_generation_prompt %}
{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{% endif %}
This also provides us with an implied confirmation of how it should all look when it's done.
We will break this into two parts for GPT4All.
A System Prompt: (There is no variable; you just write whatever you want in it.)
<|start_header_id|>system<|end_header_id|>
YOUR CUSTOM SYSTEM PROMPT TEXT HERE<|eot_id|>
A Chat Template:
<|start_header_id|>user<|end_header_id|>
%1<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
%2
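Putting the two pieces together, here is a rough sketch of what the assembled prompt looks like, with made-up system text and messages; the exact newline handling may differ slightly from what GPT4All does internally:

```python
# Sketch of the full prompt: system prompt plus one filled-in chat turn.
# The system text and messages are illustrative placeholders.
system_prompt = (
    "<|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful assistant.<|eot_id|>"
)
chat_template = (
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "%1<|eot_id|>\n"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "%2"
)
# Fill one turn: %1 is the user message, %2 the assistant reply.
turn = chat_template.replace("%1", "Say hi.").replace("%2", "Hi there!")
full_prompt = system_prompt + turn
print(full_prompt)
```

The model never sees the template syntax itself, only the concatenated token stream printed here.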
Tune in next time for the conclusion. (Why didn't it work? It looks like it's all good!)
Hint: You probably did it right and the model is not built properly. (I'll explain how to learn that too, after I eat.)
I'll have to conclude this tomorrow. Details will include where to look for these things and what to look for when you examine the files.
Thanks, would it help if I attach the .json files found alongside the model for you to see? It looks to me like they have taken some personal liberties with the tokens. I don't know, I'm pretty new to using GPT4All.
So basically, if the model is not built correctly, it's game over? However, it seems that the Colab they link does work.
Here is me saying hi:
I have finished explaining the advanced topics.
Feature Request
- Remove defaults for model templates.
- Add GUI warnings that they have to configure this in order to use the model...
- Link to Wiki documentation explaining how to configure any sideloaded or "discovered" model.