64bit / async-openai

Rust library for OpenAI
https://docs.rs/async-openai
MIT License
1.11k stars 165 forks

Embedding deserialization error when using encoding_format `EncodingFormat::Base64` #189

Closed adri1wald closed 6 months ago

adri1wald commented 7 months ago

When `EncodingFormat::Base64` is specified, OpenAI returns each embedding as a base64-encoded string, which is more compact than a JSON array of floats. However, the deserialization logic in async-openai does not handle this representation, so deserializing the response fails.
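
For illustration, here is a minimal sketch of what decoding such a string into floats could look like. It is not async-openai's actual implementation: the function names `sextet`, `decode_base64`, and `decode_embedding` are hypothetical, and it assumes the payload is a sequence of little-endian 32-bit floats. The base64 decoding is hand-rolled only to keep the example dependency-free; real code would more likely use the `base64` crate.

```rust
/// Map one character of the standard base64 alphabet to its 6-bit value.
fn sextet(c: u8) -> Option<u32> {
    match c {
        b'A'..=b'Z' => Some((c - b'A') as u32),
        b'a'..=b'z' => Some((c - b'a') as u32 + 26),
        b'0'..=b'9' => Some((c - b'0') as u32 + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Decode standard base64 (with or without trailing `=` padding) into bytes.
fn decode_base64(input: &str) -> Result<Vec<u8>, String> {
    let trimmed = input.trim_end_matches('=');
    let mut out = Vec::with_capacity(trimmed.len() * 3 / 4);
    let mut buf: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in `buf`
    for &c in trimmed.as_bytes() {
        let v = sextet(c).ok_or_else(|| format!("invalid base64 character: {}", c as char))?;
        buf = (buf << 6) | v;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Ok(out)
}

/// Assumption: the decoded bytes are consecutive little-endian f32 values.
fn decode_embedding(b64: &str) -> Result<Vec<f32>, String> {
    let bytes = decode_base64(b64)?;
    if bytes.len() % 4 != 0 {
        return Err(format!("byte length {} is not a multiple of 4", bytes.len()));
    }
    Ok(bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}
```

Under these assumptions, `decode_embedding("AACAPwAAAEA=")` yields the two floats `1.0` and `2.0`. A deserializer that accepts both representations could try the float array first and fall back to a decoder along these lines when the field is a string.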

IMO, Base64 should also be the default, as it is in the Python client, since much less data is transferred.

Snippet:

use anyhow::{Context, Result};
use async_openai::{
    config::OpenAIConfig,
    types::{CreateEmbeddingRequestArgs, EncodingFormat},
    Client,
};

// `Embedding` is the caller's own type (not shown); it is presumably built from a `Vec<f32>`.
async fn embed(openai: &Client<OpenAIConfig>, text: String) -> Result<Embedding> {
    let request = CreateEmbeddingRequestArgs::default()
        .model("text-embedding-3-small")
        .input(text)
        .encoding_format(EncodingFormat::Base64)
        .build()
        .context("OpenAI embedder: failed to build text embedding request")?;
    // The response fails to deserialize here when the embedding comes back as a base64 string.
    let mut response = openai
        .embeddings()
        .create(request)
        .await
        .context("OpenAI embedder: failed to get text embedding")?;
    if response.data.len() != 1 {
        anyhow::bail!("Expected 1 embedding, got {}.", response.data.len());
    }
    Ok(response.data.remove(0).embedding.into())
}