aallam / openai-kotlin

OpenAI API client for Kotlin with multiplatform and coroutines capabilities.
MIT License

Question: GPT4-vision #309

Closed The-Michael-Chen closed 7 months ago

The-Michael-Chen commented 8 months ago

Hey! How would I use gpt4-vision with this library? I'd like to pass in an image and text asking about the image and get the response.

The-Michael-Chen commented 8 months ago

I believe I figured it out! Was this in the documentation anywhere? I had trouble finding it. I had to dig through the commit history to find this.

        val request = chatCompletionRequest {
            model = ModelId("gpt-4-vision-preview")
            messages {
                user {
                    content {
                        text("What’s in this image?")
                        image("https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg")
                    }
                }
            }
            maxTokens = 300
        }

The-Michael-Chen commented 8 months ago

Also, is there a way to pass in a local image instead of the url?

BonnenuIt commented 8 months ago

Same question. The authors provide some code in openai-client/src/jvmTest/kotlin/com/aallam/openai/client/TestChatVisionJVM.kt, but I don't know how to build a ChatMessage list that way.

BonnenuIt commented 8 months ago

        // Content parts for the user message: a text prompt plus the image as a base64 data URL.
        val reqList: List<ContentPart> = listOf(
            TextPart("Hello! Describe the image for me."),
            ImagePart("data:image/jpeg;base64,$picInBase64")
        )
        val chatCompletionRequest = ChatCompletionRequest(
            model = ModelId("gpt-4-vision-preview"),
            messages = listOf(
                ChatMessage(
                    role = ChatRole.System,
                    content = "You are a helpful assistant!"
                ),
                ChatMessage(
                    role = ChatRole.User,
                    content = reqList
                )
            )
        )

↑ works.
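
For anyone wondering where `picInBase64` comes from when the image is a local file: a minimal JVM-only sketch, assuming the image is a JPEG on disk. The helper names `toDataUrl` and `imageFileToDataUrl` are hypothetical and not part of openai-kotlin; they only use the Java standard library to build the `data:image/jpeg;base64,...` string used in the snippet above.

```kotlin
import java.io.File
import java.util.Base64

// Hypothetical helper (not part of openai-kotlin):
// wraps raw image bytes in a base64 data URL.
fun toDataUrl(bytes: ByteArray, mimeType: String = "image/jpeg"): String {
    val encoded = Base64.getEncoder().encodeToString(bytes)
    return "data:$mimeType;base64,$encoded"
}

// Hypothetical convenience wrapper: reads a local file (JVM only)
// and returns its contents as a data URL.
fun imageFileToDataUrl(path: String, mimeType: String = "image/jpeg"): String =
    toDataUrl(File(path).readBytes(), mimeType)
```

The result can then be passed where the snippet above uses the data URL string, e.g. `ImagePart(imageFileToDataUrl("photo.jpg"))`, where `photo.jpg` is a placeholder path. The MIME type should match the actual file format (for example `image/png` for PNG files).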