nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License

[Feature] Remove defaults for model templates and system prompt #2763

Open 3Simplex opened 2 months ago

3Simplex commented 2 months ago

Feature Request

Remove defaults for model templates.

Add GUI warnings telling the user that they have to configure this in order to use the model...
Link to Wiki documentation explaining how to configure any sideloaded or "discovered" model.

manyoso commented 2 months ago

The idea here is that the defaults we have for the system prompt and chat template are in large part detrimental. Very few models will work well with these defaults.

Instead of parsing GGUF files in order to install a new model and deduce the proper templates from them, perhaps we should just require the user to fill out these templates by hand. If the templates aren't filled out, then the model might be 'installed' but it should not be 'enabled', and so on.
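
For reference, the templates in question usually ship with the model itself. A minimal sketch of reading one from a Hugging Face tokenizer_config.json (the path below is a hypothetical example; in a GGUF file the same text is stored under the tokenizer.chat_template metadata key):

import json
from pathlib import Path

# Hypothetical path to the model's Hugging Face files.
config_path = Path("models/Meta-Llama-3-8B-Instruct/tokenizer_config.json")

with config_path.open(encoding="utf-8") as f:
    tokenizer_config = json.load(f)

# The Jinja chat template, when present, lives under the "chat_template" key.
chat_template = tokenizer_config.get("chat_template")
if chat_template is None:
    print("No chat_template found; the templates have to be written by hand.")
else:
    print(chat_template)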

3Simplex commented 2 months ago

I wrote this for the situation.
https://github.com/nomic-ai/gpt4all/wiki/Configuring-Custom-Models

I believe this title is now concise and can be expected to remain unchanged.

builder-main commented 2 months ago

@3Simplex Thanks for the guide. However, the crucial part about finding the prompt is not really detailed.

Example of a "chat template" that might be found

"{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}

The pseudo code

{% set loop_messages = messages %}
{% for message in loop_messages %}
    {% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}
    {% if loop.index0 == 0 %}
        {% set content = bos_token + content %}
    {% endif %}
    {{ content }}
{% endfor %}
{% if add_generation_prompt %}
    {{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{% endif %}
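
Rendering that template makes the expansion concrete. A minimal sketch using the jinja2 package; the sample message, bos_token value, and add_generation_prompt flag are illustrative assumptions:

from jinja2 import Template

chat_template = (
    "{% set loop_messages = messages %}{% for message in loop_messages %}"
    "{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'"
    "+ message['content'] | trim + '<|eot_id|>' %}"
    "{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}"
    "{{ content }}{% endfor %}"
    "{% if add_generation_prompt %}"
    "{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}"
)

# One user turn, rendered the way a chat client would before generation.
prompt = Template(chat_template).render(
    messages=[{"role": "user", "content": "Hello"}],
    bos_token="<|begin_of_text|>",
    add_generation_prompt=True,
)
print(prompt)
# Prints the flat Llama-3-style prompt:
# <|begin_of_text|><|start_header_id|>user<|end_header_id|>
#
# Hello<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# ...ending with the blank line where the assistant's reply will go.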

The prompt template as I understand it

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

%1<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

%2

Yet it fails miserably with answer prompt looping.

3Simplex commented 2 months ago

@3Simplex Thanks for the guide. However the crucial part about finding the prompt is not really detailed. ...paraphrasing... Yet it fails miserably with answer prompt looping.

I see you are looking at a Jinja template. Thanks for reminding me; I just included that as an "Advanced Topic", along with how to make sure a model will work if it was not built correctly.

I can attempt to answer both of these questions here before I add them to the Wiki. (looks like you did well decoding the template)

Breaking down a Jinja template is fairly straightforward if you can follow a few rules.

You must keep the tokens as written in the Jinja and strip out all of the other syntax. Also watch for mistakes here; sometimes the model authors fail to provide a functional Jinja template. Working through the template above, here is what to keep and what to drop:

Most of this has to be removed because it's irrelevant to the LLM unless we get a Jinja parser from some nice contributor.
We keep this <|start_header_id|> as it states it is the starting header for the role.
We translate this + message['role'] + into the role to be used for the template.

You will have to figure out what role names this model uses, but the common ones are system, user, and assistant.
Sometimes the roles will be shown in the Jinja and sometimes they won't.

We keep this <|end_header_id|>
We keep this \n\n, which translates into one new line (press Enter) for each \n you see (two in this case).
Now we will translate message['content'] into the variable used by GPT4All.

Now we have our "content" from this jinja block. {% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %} and we removed all the extra stuff.

From what I can tell, GPT4All sends the BOS automatically and waits for the LLM to send the EOS in return.

We will break this into two parts for GPT4All.
A System Prompt: (There is no variable; you just write whatever you want in it.)

<|start_header_id|>system<|end_header_id|>

YOUR CUSTOM SYSTEM PROMPT TEXT HERE<|eot_id|>

A Chat Template:

<|start_header_id|>user<|end_header_id|>

%1<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

%2
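
Putting the two parts together, this is roughly the flat string the model ends up seeing. A sketch only; the replace() calls just mimic the %1/%2 substitution and are not GPT4All's actual implementation:

# The two pieces exactly as configured above.
system_prompt = (
    "<|start_header_id|>system<|end_header_id|>\n\n"
    "YOUR CUSTOM SYSTEM PROMPT TEXT HERE<|eot_id|>"
)
chat_template = (
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "%1<|eot_id|>\n"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "%2"
)

user_message = "Hello"
model_reply = ""  # left empty; the model generates this part

# %1 is the user's message, %2 is the model's reply.
turn = chat_template.replace("%1", user_message).replace("%2", model_reply)

# GPT4All prepends the BOS token itself and stops when the model emits EOS,
# which is why neither appears in the templates above.
print(system_prompt + turn)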

Tune in next time for the conclusion. (Why didn't it work? It looks like it's all good!)
Hint: You probably did it right and the model is not built properly. (I'll explain how to learn that too. After I eat.)

3Simplex commented 2 months ago

I'll have to conclude this tomorrow. Details for that will include: where to look for these things, and what to look for when you are looking at the files.

builder-main commented 2 months ago

Thanks, would it help if I attach the .json files found alongside the model for you to see? It looks to me like they have taken some personal liberties with the tokens. I don't know, I'm pretty new to using GPT4All.

So basically, if the model is not built correctly, it's game over? However, it seems that the Colab they link does work.

Here is me saying hi: [screenshot]

3Simplex commented 2 months ago

I have finished explaining the advanced topics.