aallam / openai-kotlin

OpenAI API client for Kotlin with multiplatform and coroutines capabilities.
MIT License
1.5k stars · 180 forks

Add support for response_format type json_schema #377

Open craigmiller160 opened 3 months ago

craigmiller160 commented 3 months ago

Feature Description

OpenAI has added support for structured outputs in its API: https://openai.com/index/introducing-structured-outputs-in-the-api/

This means the existing ChatResponseFormat is inadequate. While I could use the class to declare a type of json_schema in the response format, I have no way to actually provide the schema itself in this fashion.
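For reference, the structured-outputs feature expects the schema inline in the response_format object itself, roughly like this (the name and schema below are illustrative, not prescribed by the API):

```json
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "my_schema",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": { "answer": { "type": "string" } },
        "required": ["answer"]
      }
    }
  }
}
```

The current ChatResponseFormat only carries the `type` field, so there is nowhere to put the nested `json_schema` object.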

Problem it Solves

Adds support for this OpenAI API feature.

Proposed Solution

If you don't have time to fully implement this, a simpler and more flexible option would be as follows:

  1. Make a parent interface for ChatResponseFormat that is implemented by the current class. Let's call it GenericChatResponseFormat; you can decide on a better name.
  2. Make ChatCompletionRequest.responseFormat be of this new type (GenericChatResponseFormat).
  3. Anyone can now implement their own response format.
  4. Your code just serializes the format and provides it to OpenAI in the request.

Obviously a full implementation of the structured output would be desirable, but this may be an alternative approach to quickly achieve the same thing.
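A minimal sketch of what steps 1–4 could look like. All names here are hypothetical placeholders, not the library's actual API, and it builds JSON by hand only to stay dependency-free; a real implementation would use kotlinx.serialization polymorphism instead:

```kotlin
// Hypothetical sketch of the proposed refactor (all names are placeholders).
interface GenericChatResponseFormat {
    // Serialized JSON sent as the "response_format" field of the request.
    fun toJson(): String
}

// Step 1: the current behaviour, re-expressed against the new parent interface.
class SimpleResponseFormat(private val type: String) : GenericChatResponseFormat {
    override fun toJson(): String = """{"type":"$type"}"""
}

// Step 3: a user-supplied implementation carrying a full json_schema payload.
class JsonSchemaResponseFormat(
    private val name: String,
    private val schema: String, // raw JSON Schema supplied by the caller
    private val strict: Boolean = true,
) : GenericChatResponseFormat {
    override fun toJson(): String =
        """{"type":"json_schema","json_schema":{"name":"$name","strict":$strict,"schema":$schema}}"""
}

fun main() {
    // Step 4: the client just serializes whatever the implementation produces.
    println(SimpleResponseFormat("json_object").toJson())
    println(JsonSchemaResponseFormat("answer", """{"type":"object"}""").toJson())
}
```

ChatCompletionRequest.responseFormat would then accept any GenericChatResponseFormat (step 2), so users are unblocked even before the library ships first-class json_schema support.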

Additional Context

N/A

alaegin commented 2 months ago

It would be nice to see this feature!

NotAWiz4rd commented 2 months ago

I ran into the problem of not having it in a project of mine, so I forked the library and added that capability: https://github.com/NotAWiz4rd/openai-kotlin

You can use it in your project by checking out the repo, building the openai-client-jvm and openai-core-jvm jars (assuming you are on the JVM), and adding those to your project. It might break the ability to specify the other formats (it works fine if you don't specify any responseFormat), so someone should do some testing (I don't have the time right now), which is why I'm posting it here rather than as a PR. Anyone should feel free to use it as the basis for a PR for this feature, though.

topgun100 commented 2 months ago

@aallam thanks so much for your hard work. Please can you provide some guidance on how you want this feature implemented?

Please shout if you need any other support to maintain this valuable project.

takahirom commented 4 weeks ago

@aallam Thank you for maintaining this project. Making JSON responses based on schemas would be essential for most applications to integrate LLM data into their features. Could you please look into integrating this feature?

mixeden commented 3 weeks ago

+1 here

mixeden commented 3 weeks ago

For everyone concerned: for now you can pass "tools" and force the model to use a particular tool, then parse the arguments of the tool call.

cotulars commented 2 weeks ago

Temporary Solution for Structured Outputs in OpenAI Kotlin API

I made a temporary solution for this problem. Check the code below and try it in your project if you need it.

Code solution

Here's code that solves the problem, but it's ONLY TEMPORARY:

import com.aallam.openai.api.chat.*
import com.aallam.openai.api.model.ModelId
import com.aallam.openai.client.OpenAI
import io.ktor.client.request.*
import io.ktor.http.*
import io.ktor.http.content.*
import io.ktor.util.*
import kotlinx.serialization.EncodeDefault
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.*

@Serializable
data class _JsonSchema @OptIn(ExperimentalSerializationApi::class) constructor(
    val description: String = "",
    val name: String,
    val schema: JsonObject,
    @EncodeDefault val strict: Boolean = true
)

// Hack: smuggle the raw schema string through the `type` field;
// the request interceptor below unpacks it into a proper json_schema object.
fun JsonSchema(schema: String): ChatResponseFormat = ChatResponseFormat(type = "json_schema:$schema")

val client: OpenAI = OpenAI(
    token = "" // add your token here
) {
    install("fieldInterceptor") {
        requestPipeline.intercept(HttpRequestPipeline.Transform) { body ->
            if (body is OutgoingContent.ByteArrayContent) {
                // Process and modify the body with custom JSON schema
                val bodyText = String(body.bytes())
                val json = Json { prettyPrint = true }
                val (format, scheme, jsonObject) = json.parseToJsonElement(bodyText).jsonObject["response_format"]
                    ?.jsonObject?.let {
                        val data = it["type"]?.jsonPrimitive?.content!!
                        if (data.startsWith("json_schema")) {
                            val list = data.split(":", limit = 2)
                            Triple(list[0], list[1], it)
                        } else Triple(data, "", it)
                    } ?: Triple("", "", JsonObject(emptyMap()))

                if (format == "json_schema") {
                    val schema = json.encodeToJsonElement(
                        _JsonSchema(
                            name = "schema",
                            schema = json.parseToJsonElement(scheme).jsonObject,
                            strict = true
                        )
                    )
                    val modifiedJsonObject = JsonObject(
                        jsonObject + ("json_schema" to schema) + ("type" to JsonPrimitive("json_schema"))
                    )
                    val newJsonStr = JsonObject(
                        json.parseToJsonElement(bodyText).jsonObject + ("response_format" to Json.encodeToJsonElement(modifiedJsonObject))
                    ).toString()
                    proceedWith(TextContent(newJsonStr, ContentType.Application.Json))
                } else proceedWith(body) // not our hack: pass the original content through unchanged
            }
        }
    }
}

suspend fun main() {
    val completion = client.chatCompletion(
        ChatCompletionRequest(
            model = ModelId("gpt-4o-mini"),
            messages = listOf(
                ChatMessage(
                    role = ChatRole.System,
                    content = "You are a bot"
                ),
                ChatMessage(
                    role = ChatRole.User,
                    content = "What is the meaning of life?"
                )
            ),
            responseFormat = JsonSchema("""
                {
                    "type": "object",
                    "properties": {
                        "choices": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "properties": {
                                    "message": {
                                        "type": "object",
                                        "properties": {
                                            "content": {
                                                "type": "string"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            """.trimIndent())
        )
    )

    val response = completion.choices[0].message.content
    println(response)
}

chris-hatton commented 1 week ago

@mixeden Does this mean the functionality is effectively implemented via 'tools' - is it possible to elaborate and point to relevant API in this library? Thank you.

mixeden commented 1 week ago

@chris-hatton so you can pass one single "tool" in a list of tools and force a model to pick this tool (and reply according to its schema).

toolChoice = ToolChoice.Mode("required"), tools = listOf(Tool(function = yourFunction, type = ToolType.Function)),
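For context, on the wire this workaround sends something like the following (the function name and schema are illustrative); the model is then forced to call the function, and the `arguments` string of the returned tool call is the schema-conforming JSON you parse:

```json
{
  "tool_choice": "required",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "return_answer",
        "parameters": {
          "type": "object",
          "properties": { "answer": { "type": "string" } },
          "required": ["answer"]
        }
      }
    }
  ]
}
```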