google / generative-ai-go

Go SDK for Google Generative AI
Apache License 2.0

CreateCachedContent - rpc error: code = InvalidArgument desc = Unsupported MIME type: text/plain; charset=utf-8 #198

Open majuansari opened 1 month ago

majuansari commented 1 month ago

Description of the bug:

CreateCachedContent returns the error "rpc error: code = InvalidArgument desc = Unsupported MIME type: text/plain; charset=utf-8" for the following code:

package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "time"

    "github.com/gabriel-vasile/mimetype"
    "github.com/google/generative-ai-go/genai"
    "google.golang.org/api/option"
)

// uploadFile uploads the given file to the service, and returns a [genai.File]
// representing it. mimeType optionally specifies the MIME type of the data in
// the file; if set to "", the service will try to automatically determine the
// type from the data contents.
// To clean up the file, defer a client.DeleteFile(ctx, file.Name)
// call when a file is successfully returned. file.Name will be a uniquely
// generated string to identify the file on the service.
func uploadFile(ctx context.Context, client *genai.Client, path, mimeType string) (*genai.File, error) {
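    osf, err := os.Open(path)
    if err != nil {
        return nil, err
    }
    defer osf.Close()

    // Determine the MIME type locally. It is only printed here; it is not
    // passed to UploadFile below.
    mtype, err := mimetype.DetectFile(path)
    if err != nil {
        return nil, err
    }
    fmt.Println("MIME type:", mtype.String())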

    file, err := client.UploadFile(ctx, "", osf, nil)
    if err != nil {
        return nil, err
    }

    for file.State == genai.FileStateProcessing {
        log.Printf("processing %s", file.Name)
        time.Sleep(5 * time.Second)
        var err error
        file, err = client.GetFile(ctx, file.Name)
        if err != nil {
            return nil, err
        }
    }
    if file.State != genai.FileStateActive {
        return nil, fmt.Errorf("uploaded file has state %s, not active", file.State)
    }
    return file, nil
}

func main() {

    ctx := context.Background()
    client, err := genai.NewClient(ctx, option.WithAPIKey("API_KEY"))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    file, err := uploadFile(ctx, client, "./test.txt", "")
    if err != nil {
        log.Fatal(err)
    }
    defer client.DeleteFile(ctx, file.Name)

    fd := genai.FileData{URI: file.URI}

    argcc := &genai.CachedContent{
        Model:             "gemini-1.5-flash-001",
        SystemInstruction: genai.NewUserContent(genai.Text("You are an expert analyzing transcripts.")),
        Contents:          []*genai.Content{genai.NewUserContent(fd)},
    }
    cc, err := client.CreateCachedContent(ctx, argcc)
    if err != nil {
        log.Fatal(err)
    }
    defer client.DeleteCachedContent(ctx, cc.Name)

    modelWithCache := client.GenerativeModelFromCachedContent(cc)
    prompt := "Please summarize this transcript"
    resp, err := modelWithCache.GenerateContent(ctx, genai.Text(prompt))
    if err != nil {
        log.Fatal(err)
    }

    printResponse(resp)

}

func printResponse(resp *genai.GenerateContentResponse) {
    for _, cand := range resp.Candidates {
        if cand.Content != nil {
            for _, part := range cand.Content.Parts {
                fmt.Println(part)
            }
        }
    }
    fmt.Println("---")
}

Actual vs expected behavior:

This code is taken from the example, so it should not produce an error.

Any other information you'd like to share?

With the same code, uploading a PDF or an image instead of a text file produces the following error:

rpc error: code = InvalidArgument desc = Cached content is too small. total_token_count=9, min_total_token_count=32768

jba commented 1 month ago

Reproduced. Looking into it.

jba commented 1 month ago

Set the mime type explicitly.

opts := &genai.UploadFileOptions{MIMEType: "text/plain"}
client.UploadFile(ctx, "", osf, opts)
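
If the MIME type comes from a detector rather than a hard-coded string, it may carry parameters such as "; charset=utf-8", as in the error above. The following is a minimal sketch, using only the Go standard library, of reducing a detected type to its bare type/subtype before setting it explicitly. The helper names baseMIMEType and sniffBaseMIMEType are hypothetical, and it is an assumption (not confirmed in this thread) that the charset parameter is what the service rejects; the thread only confirms that passing "text/plain" explicitly works.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// baseMIMEType (hypothetical helper) strips any parameters, such as
// "; charset=utf-8", from a media type and returns only type/subtype.
func baseMIMEType(mediaType string) (string, error) {
	base, _, err := mime.ParseMediaType(mediaType)
	if err != nil {
		return "", fmt.Errorf("parsing media type %q: %w", mediaType, err)
	}
	return base, nil
}

// sniffBaseMIMEType (hypothetical helper) guesses a MIME type from the
// leading bytes of data with the standard library's content sniffing,
// then drops the parameters.
func sniffBaseMIMEType(data []byte) (string, error) {
	return baseMIMEType(http.DetectContentType(data))
}

func main() {
	mt, err := sniffBaseMIMEType([]byte("Speaker 1: hello\nSpeaker 2: hi\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println(mt)
	// The result would then be passed explicitly, as in the snippet above:
	//   opts := &genai.UploadFileOptions{MIMEType: mt}
	//   file, err := client.UploadFile(ctx, "", osf, opts)
}
```

For a known input such as a transcript, hard-coding "text/plain" as shown above is simpler; a helper like this is only useful when the file type varies.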

jba commented 1 month ago

We'll update our example to match.

Regarding the other issue, that is a constraint on cached content: it has to be fairly large (or it's not worth caching it).

majuansari commented 1 month ago

Thanks, it's working now.

Regarding the token count issue, I doubt it's due to the size: I gave a big PDF file as input.

jba commented 1 month ago

I'll look into that as well.

jba commented 1 month ago

Reopening for token count.

majuansari commented 1 month ago

@jba Thanks.

One more question: can I use this package with Vertex AI?

jba commented 1 month ago

No, you have to use https://pkg.go.dev/cloud.google.com/go/vertexai/genai.