MeetKai / functionary

Chat language model that can use tools and interpret the results
MIT License

Is there an article/paper describing the thoughts on this approach #57

Open jens-create opened 7 months ago

jens-create commented 7 months ago

Dear @musabgultekin and others :)

This repo is super awesome, and I am currently benchmarking the functionary model on some medical benchmarks. Is there a paper related to this work, or another place where I can find the thoughts and considerations behind this framework? I am thinking in particular of how functions are presented to the model, why there are two messages with role == system, etc.

Have you drawn inspiration from OpenAI, and if so, do you have any sources on how OpenAI has implemented functions? I haven't found any official sources, but your way of presenting functions as TypeScript definitions seems rather similar to OpenAI's.

Looking forward to your response!

Best regards, Jens

musabgultekin commented 7 months ago

Hi! Thank you!

The main inspiration for the TypeScript format came from thinking about the pre-training step: the model has seen lots of tokens related to TypeScript types during pre-training. Seeing https://github.com/microsoft/TypeChat/blob/d2f2de9ca37ef9adeb108d5fc60703b72fec0a22/site/src/blog/introducing-typechat.md#just-add-types validated my initial idea, and after testing it carefully, it turned out to work properly.

There are two system messages: one containing the function definitions, and one for the hard-coded system message (telling the model to use the functions when necessary). We are currently considering removing the hard-coded message and keeping only the function definitions, to reduce token usage.
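To illustrate the layout described above, here is a minimal sketch of a chat containing two role == "system" messages. The function-definition text is taken from the functionary example later in this thread; the wording of the hard-coded system message and the user message are hypothetical, not the actual functionary prompt.

```python
# Sketch: a chat with two system messages, as described above.
# One carries the TypeScript-style function definitions, the other
# carries the hard-coded instruction (wording here is illustrative).
function_definitions = """// Supported function definitions that should be called when necessary.
namespace functions {

// Answer the multiple choice question with the given options.
type QuestionAnswer = (_: {
// Explanation of the answer option chosen.
explanation: string,
// Therefore, among A through D, the answer is.
answer: any,
}) => any;

} // namespace functions"""

messages = [
    {"role": "system", "content": function_definitions},
    {"role": "system", "content": "A chat between a user and an assistant. "
                                  "The assistant calls functions when necessary."},
    {"role": "user", "content": "Answer the question below."},
]

system_count = sum(1 for m in messages if m["role"] == "system")
```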

As you may already know, OpenAI no longer publishes its research, so unless an OpenAI engineer speaks up, we cannot know what they do. And that's totally fine! This has happened throughout history: similar discoveries have come from different people at the same time.

We don't currently have a paper, but why not? :)

jens-create commented 7 months ago

Hi Musab,

Thanks for the thorough reply - cool to hear where your inspiration came from.

I have a follow-up question: are you planning to release the synthetic dataset you generated with Llama 2 70B? How have you evaluated Llama 2 70B's performance at generating the synthetic data, i.e. what is the model's zero-shot performance in terms of producing valid function parameters (step 2 of the data preparation section in the README)?

Regarding OpenAI and how they do it:

This OpenAI forum post claims to have found a way to reveal the system message of GPTs: https://community.openai.com/t/magic-words-can-reveal-all-of-prompts-of-the-gpts/496771

E.g. I have a function that OpenAI translates to:

namespace functions {

// Answer the multiple choice question with the given options.
type QuestionAnswer = (_: {
// Explanation of the answer option chosen.
explanation: string,
// Therefore, among A through D, the answer is
answer: ("A" | "B" | "C" | "D"),
}) => any;

} // namespace functions

And in functionary:

// Supported function definitions that should be called when necessary.
namespace functions {

// Answer the multiple choice question with the given options.
type QuestionAnswer = (_: {
// Explanation of the answer option chosen.
explanation: string,
// Therefore, among A through D, the answer is.
answer: any,
}) => any;

} // namespace functions

To me it was quite remarkable how similar the prompts are! I used generate_schema_from_functions in https://github.com/MeetKai/functionary/blob/17a86de9b06acaedd0afab212717205c0484a218/schema.py#L54 to create the functionary prompt above.
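For readers curious about the conversion itself, here is a simplified sketch of the JSON-schema-to-TypeScript idea. This is not the repo's actual generate_schema_from_functions (which handles many more cases); it only covers string and enum parameters, and unlike the functionary output quoted above (which rendered the enum as any), this sketch renders enums as union types, as in the OpenAI-style prompt.

```python
# Simplified sketch: convert OpenAI-style function specs (JSON schema)
# into the TypeScript "namespace functions" prompt format.
def schema_to_typescript(functions):
    lines = [
        "// Supported function definitions that should be called when necessary.",
        "namespace functions {",
        "",
    ]
    for fn in functions:
        lines.append(f"// {fn['description']}")
        lines.append(f"type {fn['name']} = (_: {{")
        for name, prop in fn["parameters"]["properties"].items():
            lines.append(f"// {prop['description']}")
            if "enum" in prop:
                # Render enums as a union of string literals.
                ts_type = "(" + " | ".join(f'"{v}"' for v in prop["enum"]) + ")"
            else:
                # JSON schema "string" happens to match the TS type name;
                # anything unrecognized falls back to "any".
                ts_type = prop.get("type", "any")
            lines.append(f"{name}: {ts_type},")
        lines.append("}) => any;")
        lines.append("")
    lines.append("} // namespace functions")
    return "\n".join(lines)

# The QuestionAnswer example from this thread, as a JSON-schema spec.
qa = {
    "name": "QuestionAnswer",
    "description": "Answer the multiple choice question with the given options.",
    "parameters": {
        "type": "object",
        "properties": {
            "explanation": {
                "type": "string",
                "description": "Explanation of the answer option chosen.",
            },
            "answer": {
                "type": "string",
                "enum": ["A", "B", "C", "D"],
                "description": "Therefore, among A through D, the answer is",
            },
        },
    },
}

prompt = schema_to_typescript([qa])
```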

Best regards, Jens

musabgultekin commented 7 months ago

Hi, thanks for your input.

We have not evaluated the function generation part, as it changes constantly and is largely a manual process. For example, I've found that it is also possible to do it with other pretrained models, e.g. Falcon and Mistral. We do not plan to share the dataset.

Regarding the function definition: please check out TypeScript .d.ts definition files: https://github.com/search?q=namespace+path%3A*.d.ts&type=code All we do is remove the declare and interface keywords from the namespace part of the TypeScript definition file format. And because functions are passed a JSON object, all of the parameters are wrapped in curly braces.

musabgultekin commented 7 months ago

I've added a more descriptive explanation to the README: https://github.com/MeetKai/functionary/commit/cc868177ea87131e4ac34c901a21217ea1578feb