Open willkurt opened 4 months ago
The "Python types" interface looks really clean and simple. However, it seems to me that using this approach to handle "complex" function schemas (e.g., functions with parameters which are objects) is not easy. I wish to hear more wisdom from you!
I should have a blog post out reasonably soon that covers the implementation in detail (after I refactor it a bit), which will hopefully clear things up. There are a couple of cases of functions that take objects/dictionaries as arguments, and these are handled by recursively building the regex (though I haven't tested how deep the nesting can reasonably go).
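As a rough illustration of the recursive idea described above, here is a minimal sketch. It is not the code from the fork: the `PRIMITIVES` table, the `value_regex` helper, and the JSON-like output format for nested objects are all assumptions made for this example.

```python
import re

# Hypothetical regex fragments for primitive types (illustrative only,
# not the patterns used in Outlines or the BFCL fork).
PRIMITIVES = {
    "integer": r"-?\d+",
    "float": r"-?\d+\.\d+",
    "string": r'"[^"]*"',
    "boolean": r"(?:True|False)",
}


def value_regex(schema: dict) -> str:
    """Recursively build a regex for a value described by a schema dict."""
    kind = schema["type"]
    if kind in PRIMITIVES:
        return PRIMITIVES[kind]
    if kind in ("dict", "object"):
        # Each property becomes `"key": <value regex>`; nested objects recurse.
        fields = [
            rf'"{re.escape(name)}": {value_regex(sub)}'
            for name, sub in schema["properties"].items()
        ]
        return r"\{" + ", ".join(fields) + r"\}"
    raise ValueError(f"unsupported type: {kind}")


nested_schema = {
    "type": "dict",
    "properties": {
        "name": {"type": "string"},
        "location": {
            "type": "dict",
            "properties": {"lat": {"type": "float"}, "lon": {"type": "float"}},
        },
    },
}
nested_pattern = value_regex(nested_schema)
```

Every level of nesting is inlined into the final pattern, so the regex grows with the depth and width of the schema. This sketch also fixes the property order and treats every property as required.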
Another potential challenge is multi-function calling where the model is offered multiple functions and can potentially choose more than one. I think this should be fairly reasonable to solve, but I could be mistaken.
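One possible way to approach this, sketched under the assumption that each function already has its own call regex: join the per-function patterns with alternation and allow one or more comma-separated calls inside the brackets. The `multi_call_regex` helper and the two hand-written patterns are hypothetical.

```python
import re
from typing import Sequence


def multi_call_regex(call_patterns: Sequence[str]) -> str:
    """Combine per-function call regexes into a pattern for one or more calls."""
    # The model may pick any one of the offered functions...
    any_call = "(?:" + "|".join(call_patterns) + ")"
    # ...and may emit further calls, each separated by ", ".
    return r"\[" + any_call + r"(?:, " + any_call + r")*\]"


# Hand-written stand-ins for regexes generated from two function schemas.
area_call = r"calculate_triangle_area\(base=-?\d+, height=-?\d+\)"
weather_call = r'get_weather\(city="[^"]*"\)'

multi_pattern = multi_call_regex([area_call, weather_call])
```

Because the alternation appears twice in the pattern, its size grows with the number of offered functions, which ties into the memory concern for complex regexes.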
One larger concern I have about generalizing this is related to https://github.com/outlines-dev/outlines/issues/658, where a sufficiently complex regex can eat up a lot of memory.
As we get closer to being ready to implement this I'll try to put together some example cases we can use to help make sure everything is working under realistic scenarios.
Function Calling for Any Open Model
Based on the work in the post Beating GPT-4 with Open Models, where we implement function calling for a variety of open models using Outlines, it should be possible to generalize this into a generic method for applying function calling to any open model. This could be a huge win for both Outlines and open models in general, since it creates a universal function calling interface that does not depend on FC-specific fine-tuning (as an aside, it looks like FC via structured generation works better than FC via fine-tuning).
Implementation details can be found in this gorilla/BFCL fork, but I'll be doing a more detailed write-up soon and cleaning up that code a bit. Currently I only support single function calling, but this should be fairly straightforward to extend to multiple functions.
Interface
I propose we provide (at least) two major interfaces for function calling applied to arbitrary models.
Basic Implementation
For reference, we are currently able to transform a function in the BFCL format like this:
Into a regular expression that constrains the model output to produce a function call similar to this:
[calculate_triangle_area(base=10, height=5)]
Using this function to build the regex for this particular function call (full details can be found here):
Ideally we would change this interface quite a bit, but this is just to give a sense of the basic logic behind implementing function calling as structured generation.
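To make the idea concrete, here is a minimal sketch (not the code from the fork) that turns a schema into a regex for the call shown above. The schema layout is an assumption about what the BFCL format looks like, and `TYPE_PATTERNS` and `function_call_regex` are hypothetical names used only for illustration.

```python
import re

# Assumed BFCL-style schema; the description text is illustrative.
triangle_schema = {
    "name": "calculate_triangle_area",
    "description": "Calculate the area of a triangle given its base and height.",
    "parameters": {
        "type": "dict",
        "properties": {
            "base": {"type": "integer"},
            "height": {"type": "integer"},
        },
        "required": ["base", "height"],
    },
}

# Hypothetical regex fragment for each supported argument type.
TYPE_PATTERNS = {
    "integer": r"-?\d+",
    "float": r"-?\d+(?:\.\d+)?",
    "string": r'"[^"]*"',
    "boolean": r"(?:True|False)",
}


def function_call_regex(schema: dict) -> str:
    """Build a regex matching `[name(arg=value, ...)]` for a single function."""
    properties = schema["parameters"]["properties"]
    # This sketch treats every parameter as required and keeps schema order.
    args = ", ".join(
        f"{re.escape(name)}={TYPE_PATTERNS[spec['type']]}"
        for name, spec in properties.items()
    )
    return rf"\[{re.escape(schema['name'])}\({args}\)\]"


triangle_pattern = function_call_regex(triangle_schema)
```

A pattern built this way could then be handed to Outlines' regex-constrained generation so that the model can only emit a well-formed call. Optional parameters, nested objects, and multiple functions are left out of this sketch.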
Lots more to add to this, but wanted to get some notes down while they're still in my head.