sashabaranov / go-openai

OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go
Apache License 2.0
8.7k stars 1.32k forks

Can I upload pictures and request gpt-4o now? #785

Open DanDanDD opened 1 week ago

DanDanDD commented 1 week ago

When I call the CreateChatCompletion method with the MultiContent field set to send a request to gpt-4o, I run into problems. Can anyone provide a working example?

mathisen99 commented 1 week ago

Is this implemented yet? I spent an hour trying to get this to work, but no luck. I assume vision is still broken? This is what I tried:

func OpenAIRequest(message, imageURL, target string) (string, error) {
    color.Cyan(">> OpenAIRequest called with message: %s, imageURL: %s, target: %s", message, imageURL, target)

    client, ctx, err := InitializeClient()
    if err != nil {
        color.Red(">> Error initializing OpenAI client: %v", err)
        return "", err
    }

    // Prepare the user message content
    userMessageContent := []interface{}{
        map[string]interface{}{
            "type": "text",
            "text": message,
        },
    }

    if imageURL != "" {
        color.Cyan(">> Adding image URL to message content: %s", imageURL)
        userMessageContent = append(userMessageContent, map[string]interface{}{
            "type": "image_url",
            "image_url": map[string]string{
                "url": imageURL,
            },
        })
    }

    userMessageContentJSON, err := json.Marshal(userMessageContent)
    if err != nil {
        color.Red(">> JSON marshal error: %v", err)
        return "", fmt.Errorf("JSON marshal error: %v", err)
    }

    userMessage := openai.ChatCompletionMessage{
        Role:    openai.ChatMessageRoleUser,
        Content: string(userMessageContentJSON),
    }

    req := openai.ChatCompletionRequest{
        Model:    "gpt-4o",
        Messages: []openai.ChatCompletionMessage{userMessage},
    }

    // Debug print the request
    color.Cyan(">> Sending request to OpenAI: %+v", req)
    resp, err := client.CreateChatCompletion(ctx, req)
    if err != nil {
        color.Red(">> ChatCompletion error: %v", err)
        return "", fmt.Errorf("ChatCompletion error: %v", err)
    }
    color.Cyan(">> Received response from OpenAI: %+v", resp)

    return ProcessResponse(ctx, client, &resp, req)
}

lucioreyli commented 6 days ago

@mathisen99, your current payload looks like this in JSON:

{
      "role": "user",
      // wrong content type: a JSON-encoded string instead of an array of parts
      "content": "[{\"text\":\"Who is this character?\",\"type\":\"text\"},{\"image_url\":{\"url\":\"https://i.pinimg.com/originals/db/e8/c1/dbe8c1a11d5b954fa952f86928c16898.jpg\"},\"type\":\"image_url\"}]"
}
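The difference between the two payload shapes can be reproduced with the standard library alone. The following is a minimal sketch, not go-openai code: the `part`, `imageURL`, and `message` types and the helper names are hypothetical stand-ins that only mirror the JSON field names shown above, and the image URL is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// part and imageURL are local illustration types that mirror the JSON field
// names in the payload above. They are not the go-openai types.
type imageURL struct {
	URL string `json:"url"`
}

type part struct {
	Type     string    `json:"type"`
	Text     string    `json:"text,omitempty"`
	ImageURL *imageURL `json:"image_url,omitempty"`
}

// message is a hypothetical chat message whose content may be a plain string
// or a list of parts.
type message struct {
	Role    string      `json:"role"`
	Content interface{} `json:"content"`
}

// encodeAsString reproduces the mistake: the parts are marshalled first and
// the resulting JSON text is then sent as an ordinary string.
func encodeAsString(parts []part) ([]byte, error) {
	inner, err := json.Marshal(parts)
	if err != nil {
		return nil, err
	}
	return json.Marshal(message{Role: "user", Content: string(inner)})
}

// encodeAsArray sends the parts as a real JSON array.
func encodeAsArray(parts []part) ([]byte, error) {
	return json.Marshal(message{Role: "user", Content: parts})
}

// contentKind reports whether "content" in an encoded message is a JSON
// string or a JSON array.
func contentKind(encoded []byte) (string, error) {
	var raw struct {
		Content json.RawMessage `json:"content"`
	}
	if err := json.Unmarshal(encoded, &raw); err != nil {
		return "", err
	}
	if len(raw.Content) == 0 {
		return "missing", nil
	}
	switch raw.Content[0] {
	case '"':
		return "string", nil
	case '[':
		return "array", nil
	}
	return "other", nil
}

func main() {
	parts := []part{
		{Type: "text", Text: "Who is this character?"},
		{Type: "image_url", ImageURL: &imageURL{URL: "https://example.com/image.jpg"}},
	}
	encoders := []func([]part) ([]byte, error){encodeAsString, encodeAsArray}
	for _, enc := range encoders {
		body, err := enc(parts)
		if err != nil {
			fmt.Println("encode error:", err)
			return
		}
		kind, err := contentKind(body)
		if err != nil {
			fmt.Println("decode error:", err)
			return
		}
		fmt.Printf("content is a JSON %s: %s\n", kind, body)
	}
}
```

In the first case the image part is only text inside a string, so the model would presumably treat it as part of the prompt rather than as an image to look at.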

To fix this, pass the parts through the MultiContent field instead of Content:

messages := []openai.ChatCompletionMessage{
    {
        Role: openai.ChatMessageRoleUser,
        MultiContent: []openai.ChatMessagePart{
            {
                Type: openai.ChatMessagePartTypeText,
                Text: "Who is this character?",
            },
            {
                Type: openai.ChatMessagePartTypeImageURL,
                ImageURL: &openai.ChatMessageImageURL{
                    URL:    "https://i.pinimg.com/564x/47/c5/ea/47c5eadf9ef2e17f755fabd059fd1a16.jpg", // or a Base64-encoded image (data URL)
                    Detail: openai.ImageURLDetailAuto,
                },
            },
        },
    },
}
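For the Base64 option mentioned in the comment on the URL field, a local picture can be turned into a data URL first. This is a minimal sketch using only the standard library; `imageDataURL` is a hypothetical helper, not part of go-openai, and it assumes the resulting string is what goes into the URL field. The sample bytes are just the PNG file signature; a real program would read the file with `os.ReadFile`.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
	"strings"
)

// imageDataURL encodes raw image bytes as a "data:<mime>;base64,<payload>"
// URL. The MIME type is sniffed from the leading bytes, and anything that
// does not look like an image is rejected.
func imageDataURL(img []byte) (string, error) {
	mime := http.DetectContentType(img)
	if !strings.HasPrefix(mime, "image/") {
		return "", fmt.Errorf("not an image: detected %q", mime)
	}
	return "data:" + mime + ";base64," + base64.StdEncoding.EncodeToString(img), nil
}

func main() {
	// The 8-byte PNG signature stands in for real file contents here.
	png := []byte{0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}

	url, err := imageDataURL(png)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(url)
}
```

Sniffing the type instead of trusting the file extension keeps the MIME prefix consistent with the actual bytes, at the cost of rejecting formats that `http.DetectContentType` does not recognise.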


mathisen99 commented 6 days ago

Thanks 🙏

JackYinpei commented 1 day ago

Thanks, it works.