Closed srossi93 closed 3 months ago
Hi @srossi93 👋
Thanks for bringing this up. I think this a valid concern, and the even though the workaround is OK, implementing the chat template might be the real solution here.
I'll loop in @drbh to this conversation as well.
In the meantime, could you point which examples in the docs are incomplete? Even better if you have time to make a PR with corrections 🙌
Hi @ErikKaum,
Thanks for the reply. I would wait to check this before updating the examples (I might be able to do it). BTW, can you point me where in the server (or in the router) the chat template is used?
Sounds good 👍
prepare_chat_input
function you'll find where everything is handled https://github.com/huggingface/text-generation-inference/blob/6e127dcc9616a0e39efd95a3e08324138e8c4df7/router/src/server.rs#L1139@ErikKaum @drbh
Ok, I think I actually found the problem. The chunk message with the tool available was never actually appended to the last conversation turn. The change is minimal, I opened a PR to fix this (#2395).
With the fix, this is the new prompt:
2024-08-10T15:50:13.473242Z DEBUG chat_completions: text_generation_router::server: router/src/server.rs:296: Input: <s>[INST] What's the status of my transaction T1001?
---
{"$functions":{"notify_error":{"properties":{"_name":{"const":"notify_error","type":"string"},"error":{"description":"The error or issue to notify","type":"string"}},"required":["error","_name"],"type":"object"},"retrieve_payment_date":{"description":"Get payment date of a transaction","properties":{"_name":{"const":"retrieve_payment_date","type":"string"},"transaction_id":{"description":"The transaction id.","type":"string"}},"required":["transaction_id","_name"],"type":"object"},"retrieve_payment_status":{"description":"Get payment status of a transaction","properties":{"_name":{"const":"retrieve_payment_status","type":"string"},"transaction_id":{"description":"The transaction id.","type":"string"}},"required":["transaction_id","_name"],"type":"object"}},"properties":{"function":{"anyOf":[{"$ref":"#/$functions/retrieve_payment_status"},{"$ref":"#/$functions/retrieve_payment_date"},{"$ref":"#/$functions/notify_error"}]}}}[/INST]
Now, I still think this is a temporary solution, in the sense that it still doesn't follow the prompt template with tools. But at least, the functionality is implemented correctly now.
Please, see if someone can review and merge the PR.
Thanks a lot for finding the problem & opening a PR 🙌
I think we'll add some test + make sure CI is green, other than that I think this should be good to go 👍
Closing, as fix is merged.
System Info
Information
Tasks
Reproduction
I'm facing a problem regarding with tools and TGI. Unless I'm mistaken, the tools are never passed in the input to the model, degrading completely the accuracy and the quality of the responses. The tools are used to set the grammar but this is not enough, as they have to be put in the prompt as well with a particular format (see for example, the chat template of mistral-7b here)
Replicate the problem
Start a server with Mistral-7b-v3 (with debug log)
And try to run
The output will be similar to:
which has wrong function name. This is an example from Mistral documentation, so I tend to not blame the model for this (I also doubled checked without TGI and everything works as expected).
Looking at the server log, you can check the formatted input
It looks like,
tools
are not passed in the prompt, which is a problem.Workaround
It's possible to add a system prompt with the tools mimicking the chat template
which formats to
This is not following exactly Mistral's chat template but it's an improvement (and now the tool call is correct).
But this should not be the default behavior, I'm expecting that TGI respects the chat templates, including the tools format.
I also checked if changing the
tools_prompt
could be a solution, but also this argument has no effect of the final prompt (and on the final output).The alternatives are:
I've seen a couple of issues that might be related to this (e.g. #2310, #2240)
BTW, some of the examples in your documentation are also broken: the format is correct but the name/arguments of the tool calls are not.
Expected behavior
See above