mistralai / mistral-inference

Official inference library for Mistral models
https://mistral.ai/
Apache License 2.0

Tokenizer skips the special tokens while decoding #162

Open anandsarth opened 1 month ago

anandsarth commented 1 month ago

Now that mistral-7B-v3 has tool support, there are special tokens such as [TOOL_CALL] and [TOOL_RESULT]. But when I decode the output, the special tokens are not present, and decode has no argument like Hugging Face's skip_special_tokens=False. Therefore I am not able to tell whether the output is a tool call or a standard response. How can I decode the response from the output tokens so that the special tokens are kept?

out_tokens = [5, 1501, 7567, 1629, 2032, 1113]  # token 5 is the special token that marks a tool call

# After decoding I get:
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens)

# result: [{"name": "get_current_weather", "arguments": {"location": "San Francisco, CA", "format": "celsius"}}]
# The special token is gone, so I can't tell that this is a tool call.

carlesonielfa commented 1 day ago

Experiencing the same issue here, did you manage to solve it?
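
A possible workaround, offered as a minimal sketch and not as the library's own API: since the token ids are still available before decoding, the special ids can be checked directly and re-inserted as labels around the decoded text. The helper names `decode_keep_special` and `is_tool_call` are hypothetical, and the mapping of id 5 to the label `[TOOL_CALL]` is an assumption taken from the description above; check it against your tokenizer's vocabulary.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Assumption: id 5 marks a tool call, as stated in the issue above.
# The label string is illustrative only.
SPECIAL_TOKENS: Dict[int, str] = {5: "[TOOL_CALL]"}


def decode_keep_special(
    token_ids: Sequence[int],
    decode: Callable[[List[int]], str],
    special_tokens: Optional[Dict[int, str]] = None,
) -> str:
    """Decode token_ids, emitting a label for each special id instead of dropping it."""
    specials = SPECIAL_TOKENS if special_tokens is None else special_tokens
    parts: List[str] = []
    run: List[int] = []  # consecutive ordinary ids, decoded together
    for tid in token_ids:
        if tid in specials:
            if run:
                parts.append(decode(run))
                run = []
            parts.append(specials[tid])
        else:
            run.append(tid)
    if run:
        parts.append(decode(run))
    return "".join(parts)


def is_tool_call(token_ids: Sequence[int], tool_call_id: int = 5) -> bool:
    """Return True if the first generated token is the tool-call marker."""
    return len(token_ids) > 0 and token_ids[0] == tool_call_id
```

With the objects from the original post, this would be called as `decode_keep_special(out_tokens, tokenizer.instruct_tokenizer.tokenizer.decode)`. If only the tool-call check is needed, `is_tool_call(out_tokens)` avoids decoding altogether. Decoding each run separately may produce slightly different whitespace than decoding the whole sequence at once, so treat the combined string as a diagnostic view.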