dave-gray101 opened 1 year ago
I have no opinion on how you do it, but I'm really appreciative of the fact that you'd like to add OpenAPI support. It's also something that I would very much like to see. Figuring out the API, particularly as it changes over time, isn't easy! Thank you!
As I've mentioned in Discord, I've been prototyping around the idea of generating request and response models automatically from the "real" OpenAI API specification that they make available at this repo. The primary complication that's immediately apparent with this idea is that our local models and backends support additional configuration parameters. Therefore, I propose the following: we change our API signatures for LocalAI to match OpenAI exactly, with one extension - our requests will have an additional optional parameter, `x-LocalAI-extensions`. This will be a structure specific to each endpoint that contains the applicable configuration points.
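To make that concrete, a chat completion request under this proposal might look roughly like the following. Everything outside the extension block is plain OpenAI; the fields inside it (`backend`, `threads`) are only placeholder examples of LocalAI-specific knobs, not a settled schema:

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ],
  "temperature": 0.7,
  "x-LocalAI-extensions": {
    "backend": "llama",
    "threads": 4
  }
}
```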
Some of my earlier experiments that I mentioned in Discord involved generating only the relevant data structures and extending them to create our own models. This was appealing as it had the fewest runtime dependencies, but I was unsatisfied by the JSON-handling experience - and `oapi-codegen` generates parsing code if you allow it to!
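For reference, the sort of `oapi-codegen` configuration I have in mind is roughly the following - the exact keys depend on the oapi-codegen version (this assumes a recent release with Fiber support), and the package and output paths are placeholders:

```yaml
# hypothetical oapi-codegen config: generate models, a Fiber server
# interface, and the embedded spec from the (patched) OpenAI spec
package: openai
output: pkg/openai/generated.go
generate:
  models: true
  fiber-server: true
  embedded-spec: true
```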
Therefore, I tossed out that prototype and created a fresh one taking a cleaner approach on my `openai-openapi` branch. The difference in this branch is that I've flipped the order of things: the custom LocalAI extension parameters are defined as a patch to the OpenAI specification before code generation, so that our additional parameters are included. To avoid this becoming a maintenance nightmare, this is not a binary or text patch; it uses `ytt` to be a bit more aware of the YAML structure. I've got a starting point in the form of https://github.com/dave-gray101/LocalAI/blob/openai-openapi/openai-openapi/localai_model_patches.yaml, but this needs a lot of prettying up before this feature is done!
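To give a flavour of what that patch looks like, here's a trimmed-down sketch of a `ytt` overlay that injects an extension property into one of OpenAI's request schemas. The property names and their types here are illustrative rather than copied from the actual patch file:

```yaml
#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.all
---
components:
  schemas:
    CreateCompletionRequest:
      properties:
        #@ the extension property doesn't exist in the upstream spec,
        #@ so mark it as ok-to-add rather than ok-to-merge
        #@overlay/match missing_ok=True
        x-LocalAI-extensions:
          type: object
          properties:
            backend:
              type: string
            threads:
              type: integer
```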
In addition to the parsing benefits, this also allows us to generate server stubs, to ensure we at least respond on all the endpoints, even if that response is a `501 Not Implemented`.
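As a rough sketch of what such a stub could look like, assuming we generate a Fiber server interface (the method name below just mirrors an operationId from the spec and is only an example):

```go
package localai

import "github.com/gofiber/fiber/v2"

// stubServer would partially implement the ServerInterface that
// oapi-codegen emits; any endpoint we haven't wired to a backend yet
// still answers, just with a 501.
type stubServer struct{}

// CreateModeration: an example of an endpoint LocalAI doesn't back yet.
func (s *stubServer) CreateModeration(c *fiber.Ctx) error {
	return c.Status(fiber.StatusNotImplemented).JSON(fiber.Map{
		"error": "endpoint not implemented by LocalAI yet",
	})
}
```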
I'm coming up for air here to start a discussion on this, as there are really three ways I see to proceed from here. One option is to keep the single `OpenAIRequest` within LocalAI itself; we could attempt to transform the endpoint-specific models to this structure at the api layer. Personally, I am not a big fan of this solution, but I want to list it for completeness - it's potentially less invasive than my preference below.

The third option, and my preference, is to retire the single `OpenAIRequest` struct - I would rather have LocalAI's code handle separate request models for the different endpoints, as they have such radically different parameters. We're well positioned to do this, as the `predictions.go` file is a pretty good abstraction layer already. The main issue to discuss with this option surrounds the config files: currently they have a pile of loose properties and an additional `parameters` object, and I'd like to discuss some options for potentially more suitable structures. Because models are sometimes used for multiple endpoints (embeddings, or chat vs. completion), I'm torn on whether the better structure is a specific config file for each endpoint/model combination (something like `config/chat/gpt-3.5-turbo.yaml` being a common configuration to simulate ChatGPT), or a single config file per model, containing a mapping of supported endpoints to default request options - rough sketches of both layouts below.
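Very roughly, and purely as a strawman (every field name below is a placeholder, not a proposal for the final schema), the two layouts might look like:

```yaml
# Layout 1: one file per endpoint + model combination,
# e.g. config/chat/gpt-3.5-turbo.yaml
name: gpt-3.5-turbo
template: chat-simple
x-LocalAI-extensions:
  backend: llama
  threads: 4

---
# Layout 2: one file per model, mapping each supported endpoint
# to its template and default request options
name: gpt-3.5-turbo
endpoints:
  chat:
    template: chat-simple
    x-LocalAI-extensions:
      threads: 4
  completions:
    template: completion-simple
    x-LocalAI-extensions:
      threads: 4
```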
In either case, I propose moving most options out of the "general" sections of the config and into the aforementioned `x-LocalAI-extensions` struct, which would then serve as the endpoint-specific default request options to use whenever the JSON on the request doesn't specify an override. The other main endpoint-specific detail that isn't part of the request at all is the template to use - but is there anything else in there worth specific consideration?

I plan to keep working on this in the direction of option 3 for now. Opening this up for discussion before I get too much farther! Sorry this post became a bit rambly - I just wanted to track it somewhere other than Discord.