softwaremill / sttp-openai

Implement JSON schema support with apispec #217

Closed mrdziuban closed 2 days ago

mrdziuban commented 1 week ago

Fixes #216

@adamw re: your suggestion:

"An extension to this idea would be to derive a Tapir schema from a case class, and then serialize it as JSON Schema."

I chose to use the Schema type from sttp-apispec to implement this, but I didn't add explicit support for passing a Tapir schema and having it be converted automatically -- I figured downstream users could do the conversion themselves if they wanted to.

OpenAI only supports a subset of JSON schema and has some limitations, which I described in a comment in ChatRequestBody.scala.

I don't think it's reasonable to enforce all of these constraints in this library; users will just have to be aware of whether their schema violates any of them. If it does, the API returns an error like this:

{
  "error": {
    "message": "Invalid schema for response_format 'schemaName': ... reason why schema was invalid ...",
    "type": "invalid_request_error",
    "param": "response_format",
    "code": null
  }
}
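
For readers who want a rough pre-flight check before sending a request, here is a minimal sketch. It is an illustration only and not part of this PR: it models a schema as a plain `Map` instead of the sttp-apispec `Schema` type, and the names `StrictSchemaCheck` and `violations` are hypothetical. It checks only two constraints that OpenAI documents for strict structured outputs, and only at the top level of the schema: `additionalProperties` must be `false`, and every property must be listed in `required`.

```scala
object StrictSchemaCheck {
  // Assumption: a schema is modeled as a plain Map, not as sttp.apispec.Schema.
  type Json = Map[String, Any]

  /** Returns human-readable problems found at the top level of the schema. */
  def violations(schema: Json): List[String] = {
    // Names of all declared properties (empty if "properties" is absent).
    val properties: Set[String] = schema.get("properties") match {
      case Some(m: Map[_, _]) => m.keys.map(_.toString).toSet
      case _                  => Set.empty[String]
    }
    // Names currently listed as required (empty if "required" is absent).
    val required: Set[String] = schema.get("required") match {
      case Some(l: List[_]) => l.map(_.toString).toSet
      case _                => Set.empty[String]
    }
    val missing = (properties -- required).toList.sorted

    List(
      if (schema.get("additionalProperties").contains(false)) None
      else Some("additionalProperties must be false"),
      if (missing.isEmpty) None
      else Some(s"properties missing from required: ${missing.mkString(", ")}")
    ).flatten
  }
}
```

An empty result does not guarantee that the API accepts the schema, since nested objects and the other documented limitations are not inspected.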

mrdziuban commented 1 week ago

Ah, I hadn't seen #205; this definitely overlaps with it. I think implementing the support with apispec is more robust than just a ujson.Value as in #205, and it leaves less room for error. My changes don't add strict to FunctionTool like #205 does, though.

adamw commented 4 days ago

This looks great! Could you add an example to the readme (also maybe mentioning how to auto-derive the schema using tapir)? That would help a lot when it comes to discoverability of the feature. Alternatively, there's the examples sub-project, to which you could simply link from the readme.

mrdziuban commented 4 days ago

Definitely, updated the readme to show both manual and derived construction of JSON schemas.

While testing schema derivation, I also discovered and fixed a bug: existing required and additionalProperties fields on the schema took precedence over the ones we add ourselves, which resulted in invalid schema errors from the API.
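
The precedence problem can be illustrated with a small sketch. This is not the PR's actual code: it assumes a schema modeled as a plain `Map` and that the library's own fields are combined with the existing ones via `++`, where the right-hand operand wins on key collisions. The names `SchemaMergeSketch`, `enforced`, `mergeBuggy`, and `mergeFixed` are hypothetical.

```scala
object SchemaMergeSketch {
  // Assumption: a schema is modeled as a plain Map, not as sttp.apispec.Schema.
  type Json = Map[String, Any]

  // Fields the library forces onto the schema: every property is required
  // and additional properties are disallowed.
  private def enforced(schema: Json): Json = {
    val propertyNames = schema.get("properties") match {
      case Some(m: Map[_, _]) => m.keys.map(_.toString).toList.sorted
      case _                  => Nil
    }
    Map("required" -> propertyNames, "additionalProperties" -> false)
  }

  // Buggy order: the right operand of ++ wins, so fields already present
  // on the schema override the enforced ones.
  def mergeBuggy(schema: Json): Json = enforced(schema) ++ schema

  // Fixed order: the enforced fields override anything already present.
  def mergeFixed(schema: Json): Json = schema ++ enforced(schema)
}
```

With a derived schema that already carries `"required" -> List("a")` for properties `a` and `b`, `mergeBuggy` keeps the partial list, while `mergeFixed` replaces it with both names and sets `additionalProperties` to `false`.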

adamw commented 2 days ago

Awesome - thanks! :)