MeetKai / functionary

Chat language model that can use tools and interpret the results
MIT License

Inconsistent prompting schema #26

Open rizerphe opened 1 year ago

rizerphe commented 1 year ago

The prompt example in the documentation is as follows:

system:
namespace weather {

// Get the current weather
type get_current_weather  = (_: {
// The city and state, e.g. San Francisco, CA
location: string,
// The temperature unit to use. Infer this from the users location.
format: "celsius" | "fahrenheit",
}) => any;

} // namespace weather
system:
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary
user:
</s>What is the weather in Istanbul?</s>
assistant

However, when decoding the output of `prepare_messages_for_inference`, I get:

 system:
// Supported function definitions that should be called when necessary.
namespace functions {

// Get the current weather
type get_current_weather = (_: {
// The city and state, e.g. San Francisco, CA
location: string,
// The temperature unit to use. Infer this from the users location.
format: "farehnheit" | "celsius",
}) => any;

} // namespace functions
 system:
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary
 user:
</s>What is the weather in Istanbul?
 assistant

Crucially, there are spaces before the role names in the generated prompt, the example from the README doesn't include "Supported function definitions that should be called when necessary.", and the generated user message does not end with `</s>`. Can you confirm which of these prompting schemes was used in training?

Code used to generate the example:

```python
from transformers import LlamaTokenizer

# prepare_messages_for_inference, ChatMessage and Function come from the functionary repo
tokenizer = LlamaTokenizer.from_pretrained("musabgultekin/functionary-7b-v1")
print(
    tokenizer.decode(
        prepare_messages_for_inference(
            tokenizer=tokenizer,
            messages=[
                ChatMessage(role="user", content="What is the weather in Istanbul?"),
            ],
            functions=[
                Function(
                    name="get_current_weather",
                    description="Get the current weather",
                    parameters={
                        "type": "object",
                        "properties": {
                            "location": {
                                "type": "string",
                                "description": "The city and state, e.g. San Francisco, CA",
                            },
                            "format": {
                                "type": "string",
                                "enum": ["farehnheit", "celsius"],
                                "description": "The temperature unit to use. Infer this from the users location.",
                            },
                        },
                        "required": ["location", "format"],
                    },
                )
            ],
        )[0]
    )
)
```
musabgultekin commented 1 year ago

Hi, thanks for pointing this out. It might be due to this commit: https://github.com/MeetKai/functionary/commit/3fa515ac29e5c9c409eb7faabb88a04f29524946. If so, it looks like that commit introduced a bug.

This is the exact original code that has been used for training: https://github.com/MeetKai/functionary/blob/0dbf25b6c2e011822cb4e0158ad8598a32f0e4c2/train/train.py#L120

Can you try decoding that? It should be the same as the docs; there shouldn't be a space.

Also, I've just added "Supported function definitions that should be called when necessary." to the README.md's explanation. Thanks for the heads up. That sentence is something I didn't use in training and it isn't strictly required, but it's somewhat useful if you use OpenAI's "functions" parameter, because the "functions" parameter doesn't have a generic description field, whereas OpenAPI does. So if you're using OpenAPI, you can take the description field and put it there (as I do in the generate_schema_from_openapi function), but when there is no description (as with the "functions" parameter), we use that sentence. I made up that description with about 15 seconds of thought, so you can omit it or change it as you wish (the model was not trained with that exact sentence).
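To make the fallback concrete, here is a minimal sketch of that logic; `namespace_header` and `DEFAULT_DESCRIPTION` are hypothetical names for illustration, not helpers from the functionary repo:

```python
from typing import Optional

DEFAULT_DESCRIPTION = (
    "Supported function definitions that should be called when necessary."
)

def namespace_header(openapi_description: Optional[str] = None) -> str:
    # Prefer the OpenAPI description when one is provided; otherwise fall
    # back to the generic default sentence mentioned in the README.
    comment = openapi_description or DEFAULT_DESCRIPTION
    return f"// {comment}\nnamespace functions {{"

# namespace_header("Weather lookup API")  -> "// Weather lookup API\nnamespace functions {"
# namespace_header()                      -> "// Supported function definitions ...\nnamespace functions {"
```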

musabgultekin commented 1 year ago

Oh wait, in terms of tokenization there is a behaviour difference between the fast and the slow tokenizer. Can you try with use_fast=False in the tokenizer params? Like here: https://github.com/MeetKai/functionary/blob/d20c7709b6dafdd074cf19ae658099464a5a0dfe/train/train.py#L226
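For reference, a minimal sketch of loading the slow tokenizer this way (the model id is the one from the snippet above):

```python
from transformers import AutoTokenizer

# use_fast=False selects the slow, SentencePiece-based Llama tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "musabgultekin/functionary-7b-v1", use_fast=False
)
```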

rizerphe commented 1 year ago

Thank you! With the slow tokenizer, here's the result it gives:

 system:
// Supported function definitions that should be called when necessary.
namespace functions {

// Get the current weather
type get_current_weather = (_: {
// The city and state, e.g. San Francisco, CA
location: string,
// The temperature unit to use. Infer this from the users location.
format: "farehnheit" | "celsius",
}) => any;

} // namespace functions
 system:
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary.
 user:
</s>What is the weather in Istanbul?
 assistant

It seems identical to the fast tokenizer's output, but it is still inconsistent with the documentation.

Also, a somewhat unrelated question: is there any particular motivation behind departing from the existing Llama 2 prompting schema? As far as I can tell, your schema makes it harder to separate assistant messages, because they don't end with the end-of-text token, and fine-tuning takes more effort, no?

musabgultekin commented 1 year ago

Okay, I think I know now. We have been tokenizing the messages separately rather than combining all the text and then tokenizing, so that's probably the source of the discrepancy. Tokenization should be done like here: https://github.com/MeetKai/functionary/blob/0dbf25b6c2e011822cb4e0158ad8598a32f0e4c2/train/train.py#L120 (tokenize each message separately).
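A minimal sketch of the difference being described (the message strings are illustrative, not the exact functionary prompt): with the slow SentencePiece tokenizer, encoding each message on its own makes the tokenizer prepend a "▁" to that piece, which decodes back as the extra space before the role names.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "musabgultekin/functionary-7b-v1", use_fast=False
)

messages = ["system:\n...", "user:\n</s>What is the weather in Istanbul?"]

# Tokenize each message separately and concatenate the ids (training-style).
per_message_ids = []
for text in messages:
    per_message_ids += tokenizer(text, add_special_tokens=False)["input_ids"]

# Tokenize the combined text in a single pass (docs-style).
combined_ids = tokenizer("".join(messages), add_special_tokens=False)["input_ids"]

# Decoding the two id sequences shows the extra leading spaces in the first case.
print(repr(tokenizer.decode(per_message_ids)))
print(repr(tokenizer.decode(combined_ids)))
```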

In terms of the prompting schema: this is a separate model trained from scratch and not on the "-chat" models, which means we can use any kind of prompting schema.

The reason I used this schema is to make it possible for the model to predict whether it should output a function call or regular text. That was not easy in the original format. I also needed generation to stop with stop tokens on function calls, so a custom format was needed. To be honest, I don't entirely like the current formatting system; I would prefer custom tokens to separate messages and custom tokens to mark the end of function calls. But this was the simplest thing I could do without introducing a new token. One can of course modify the training code and train from scratch.
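As an illustration of stopping on a plain-text marker without adding a new special token, here is a minimal sketch using transformers' StoppingCriteria; the stop_text value is hypothetical, not necessarily the marker functionary was trained with:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnText(StoppingCriteria):
    """Stop generation once a given plain-text marker appears in the new tokens."""

    def __init__(self, tokenizer, stop_text: str, prompt_len: int):
        self.tokenizer = tokenizer
        self.stop_text = stop_text
        self.prompt_len = prompt_len

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Decode only the newly generated part and check for the marker.
        generated = self.tokenizer.decode(input_ids[0][self.prompt_len:])
        return self.stop_text in generated

# Usage (sketch): pass
#   stopping_criteria=StoppingCriteriaList([StopOnText(tokenizer, "user:", prompt_len)])
# to model.generate().
```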

rizerphe commented 1 year ago

I'm trying to fine-tune the chat model to do this task (with LoRA), following their prompting schema with a modification to allow for function calling. I've been reasonably successful; the two don't seem that incompatible, and it works fairly well.
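For context, a minimal sketch of the kind of LoRA setup being described, using peft; the base model id, target modules and hyperparameters are assumptions, not what was actually used here:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Assumed base checkpoint: the Llama 2 7B chat model.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

lora_config = LoraConfig(
    r=16,                                 # adapter rank (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# The wrapped model can then be fine-tuned on chat-format data extended
# with function-calling turns, e.g. via the transformers Trainer.
```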

musabgultekin commented 7 months ago

Hi, we changed the prompting schema to be much more flexible with our v2 models, and our latest model beats GPT-3.5 on our internal evaluation metrics.

I know it's been a long time, sorry :(