woheller69 opened 2 weeks ago
Great patch! I would check all of them to see if any is missing a typo or a newline.
Thank you!
It seems some newlines are still missing in there. Do you want to add the Llama 3 fix in this PR? The correct version would be:
llama_3_prompt_markers = {
    Roles.system: PromptMarkers("""<|start_header_id|>system<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.user: PromptMarkers("""<|start_header_id|>user<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.assistant: PromptMarkers("""<|start_header_id|>assistant<|end_header_id|>\n\n""", """<|eot_id|>"""),
    Roles.tool: PromptMarkers("""<|start_header_id|>function_calling_results<|end_header_id|>\n""", """<|eot_id|>"""),
}
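With these markers, a minimal exchange would render roughly like this (a hand-written sketch for illustration; the <|begin_of_text|> token is normally prepended by the tokenizer and is omitted here):

<|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n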
I don't know if another \n newline is needed in the tool role; I'm not sure.
On Gemma, I'm not sure a newline is required; I have to test it.
And on Phi, you are correct :)
I saw the issue you opened for Gemma.
Thanks.
If you want to add the missing Llama 3 newlines in this PR, that would be awesome.
Thank you for the contribution!
I don't use tools, so I don't know if another \n is missing there. But Llama 3 is also fixed in this PR.
But I think a better long-term fix in general would be my proposal in #54, i.e. to keep the leading and trailing newlines (and spaces?) produced by the model and store them in history. I don't know the code (and I am a Python noob), but it feels like they are being stripped.
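A minimal sketch of what I mean, assuming a simple list-based history (history and record_assistant_turn are made-up names, not this library's actual API):

# Sketch: keep the assistant's output verbatim instead of stripping it.
# `history` and `record_assistant_turn` are hypothetical names.
history = []  # list of (role, content) tuples

def record_assistant_turn(raw_output: str) -> None:
    # Store exactly what the model produced, including leading/trailing
    # newlines and spaces, so the rebuilt prompt is byte-identical to
    # what the model has already evaluated.
    history.append(("assistant", raw_output))
    # By contrast, history.append(("assistant", raw_output.strip()))
    # changes the prompt on the next turn and invalidates the cache
    # from that point on.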
> It seems some newlines are still missing in there. Do you want to add the Llama 3 fix in this PR? The correct version would be:
>
> llama_3_prompt_markers = {
>     Roles.system: PromptMarkers("""<|start_header_id|>system<|end_header_id|>\n\n""", """<|eot_id|>"""),
>     Roles.user: PromptMarkers("""<|start_header_id|>user<|end_header_id|>\n\n""", """<|eot_id|>"""),
>     Roles.assistant: PromptMarkers("""<|start_header_id|>assistant<|end_header_id|>\n\n""", """<|eot_id|>"""),
>     Roles.tool: PromptMarkers("""<|start_header_id|>function_calling_results<|end_header_id|>\n""", """<|eot_id|>"""),
> }
We only need one more \n in Roles.assistant, not in Roles.user (there it should not matter). And we need to make sure that we append the latest model answer exactly as the model produced it, so that it finds the tokens already processed in memory.
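To illustrate why the exact bytes matter, here is a small sketch (plain Python, hypothetical names): prompt caching can only reuse the common token prefix, so a single dropped \n forces everything after it, including the whole previous answer, to be evaluated again.

def common_prefix_len(cached_tokens: list[int], new_tokens: list[int]) -> int:
    # Number of leading tokens shared by the cached and the new prompt;
    # only these can be served from the cache.
    n = 0
    for a, b in zip(cached_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

# If the stored answer lost its trailing newline, the two token lists
# diverge right where the answer was re-inserted into the prompt, and
# every token after that point must be processed again.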
Is there any reason not to merge this?
Gemma needs another \n in the prompt template to avoid slow processing of follow-up prompts, see #54.
The longer the model's previous answer, the more relevant this becomes...
Probably there are similar issues in all other prompt templates.
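For Gemma, the fix would presumably look similar. Here is a sketch based on the published Gemma chat template (<start_of_turn>user\n ... <end_of_turn>\n); the dict name and the exact markers used by this library are assumptions:

gemma_prompt_markers = {
    Roles.user: PromptMarkers("""<start_of_turn>user\n""", """<end_of_turn>\n"""),
    Roles.assistant: PromptMarkers("""<start_of_turn>model\n""", """<end_of_turn>\n"""),
}
# Gemma has no dedicated system role; system content is usually
# prepended to the first user turn.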