chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Bug]: Get vertex embeddings when using GoogleVertexEmbeddingFunction #1995

Open pig7788 opened 2 months ago

pig7788 commented 2 months ago

What happened?

When I call GoogleVertexEmbeddingFunction to get embeddings, it fails to parse the response correctly. The code before the fix:

if "predictions" in response:
    embeddings.append(response["predictions"]["embeddings"]["values"])

When I print the response, it looks like this:

{
    'predictions': [{
        'embeddings': {
            'values': [ embedding values, ... ],
            'statistics': {
                'truncated': False,
                'token_count': 179
            }
        }
    }],
    'metadata': {
        'billableCharacterCount': 206
    }
}

The value under the key 'predictions' is a list, not a dict, so indexing it with the string "embeddings" fails. I guess the response format of the Google Vertex API has changed.
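The failure can be reproduced without calling the API. The sketch below is only an illustration: the `response` dict is a hand-built stand-in shaped like the printed output above (the embedding values are made up), and `old_parse` is a hypothetical helper name that mirrors the indexing from the pre-fix code, not the actual chromadb function.

```python
# Hand-built stand-in for the printed Vertex response; the values are made up.
response = {
    "predictions": [
        {
            "embeddings": {
                "values": [0.1, 0.2, 0.3],
                "statistics": {"truncated": False, "token_count": 179},
            }
        }
    ],
    "metadata": {"billableCharacterCount": 206},
}


def old_parse(response):
    """Hypothetical helper mirroring the pre-fix indexing from this issue."""
    embeddings = []
    if "predictions" in response:
        # 'predictions' is a list, so indexing it with a string raises TypeError.
        embeddings.append(response["predictions"]["embeddings"]["values"])
    return embeddings


try:
    old_parse(response)
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")
```

The `TypeError` here is the same class of error shown in the log output of this report.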

After I changed the code like this:

if "predictions" in response:
    for prediction in response['predictions']:
        embeddings.append(prediction["embeddings"]["values"])

it runs correctly.
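As a standalone sketch of the same idea, the loop can be wrapped in a small function and checked against a mock response. The name `extract_embeddings` and the sample data are assumptions made for illustration; this is not the code in chromadb's `embedding_functions.py`.

```python
from typing import Any, Dict, List


def extract_embeddings(response: Dict[str, Any]) -> List[List[float]]:
    """Collect one embedding vector per entry in response['predictions'].

    Hypothetical helper: returns an empty list when the key is missing.
    """
    embeddings: List[List[float]] = []
    for prediction in response.get("predictions", []):
        embeddings.append(prediction["embeddings"]["values"])
    return embeddings


# Mock response with two predictions; the values are made up.
sample = {
    "predictions": [
        {"embeddings": {"values": [0.1, 0.2]}},
        {"embeddings": {"values": [0.3, 0.4]}},
    ],
    "metadata": {"billableCharacterCount": 206},
}

result = extract_embeddings(sample)
print(result)
```

Iterating over the list also covers the case where the API returns several predictions in one response, which the original single-index access could not handle.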

Versions

Chromadb 0.4.24, Python 3.10.13, macOS 14.1

Relevant log output

File "/Users/liyuxiang/.pyenv/versions/butterbeer/lib/python3.10/site-packages/chromadb/utils/embedding_functions.py", line 668, in __call__
    embeddings.append(response["predictions"]["embeddings"]["values"])
TypeError: list indices must be integers or slices, not str

nicolasgere commented 2 months ago

Thank you for the report. It would be awesome if you could open a PR with your code change. If not, let me know and I will do it.

pig7788 commented 2 months ago

I have opened a PR with my code change. Please help review it, thank you!