CsabaConsulting / InspectorGadgetApp

Open Multi-Modal Personal Assistant
MIT License

RAG: Upgrade text embedding from text-embedding-004 to text-multilingual-embedding-002 #48

Open MrCsabaToth opened 2 weeks ago

MrCsabaToth commented 2 weeks ago

CHANDRA got me thinking about the new text-embedding-preview-0815 model to upgrade from text-embedding-004. However https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/generative_ai/text_embedding_new_api.ipynb shows that:

Python and Java code retrieval (along with the new CODE_RETRIEVAL_QUERY task type that text-embedding-preview-0815 introduces) is not a significant purpose of the app as of now, even though users may screenshot a computer screen with code and ask about it. The current multimodal embedding has severe limitations, so we'll probably go with transcribing images for vector indexing. The database models are already prepared for that.

So both the text-embedding-preview-0815 and text-embedding-004 models turn out to be English only. To support international use I decided to try text-multilingual-embedding-002, hoping that there will soon be new versions of that as well. Also note that with the introduction of the new dimensionality folding (#47) we'll control the vector size regardless of the embedding model's output vector length.

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @multirequest.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/duet-ai-roadshow-415022/locations/us-central1/publishers/google/models/text-multilingual-embedding-002:predict" > multiresult.json

multirequest.json:

{
  "instances": [
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "Chat request",
      "content": "I would like embeddings for this text!"
    },
    {
      "task_type": "RETRIEVAL_DOCUMENT",
      "title": "Beszélgetés kérelem",
      "content": "Ehhez a szöveghez szeretnék beágyazásokat!"
    }
  ]
}

multiresult.json

MrCsabaToth commented 2 weeks ago

A quick cursory look at multiresult.json shows that the two vectors (the English sentence and its Hungarian equivalent) are very close to each other: the signs of the elements match and the values are mostly very close. The dimensionality of the multilingual model is 768, just like the newer English models.
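To put a number on "very close", the comparison could be scripted. This is a minimal sketch, not what was actually run for this issue: it assumes multiresult.json has the usual Vertex AI predict response shape (`predictions[i].embeddings.values`), and the helper names `extract_vectors`, `cosine_similarity`, `sign_agreement` and `compare_first_two` are made up for illustration.

```python
import json
import math


def extract_vectors(response):
    """Pull the embedding vectors out of a predict response.

    Assumes the shape {"predictions": [{"embeddings": {"values": [...]}}]}.
    """
    return [p["embeddings"]["values"] for p in response["predictions"]]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sign_agreement(a, b):
    """Fraction of positions where both elements have the same sign."""
    same = sum(1 for x, y in zip(a, b) if (x >= 0) == (y >= 0))
    return same / len(a)


def compare_first_two(path):
    """Compare the first two embeddings stored in a result file."""
    with open(path, encoding="utf-8") as f:
        vectors = extract_vectors(json.load(f))
    return (
        cosine_similarity(vectors[0], vectors[1]),
        sign_agreement(vectors[0], vectors[1]),
    )
```

Calling `compare_first_two("multiresult.json")` would return the cosine similarity and the share of matching signs for the English and Hungarian vectors; a cosine near 1 would back up the eyeball impression.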

MrCsabaToth commented 1 week ago

There's a problem: models/text-multilingual-embedding-002 is not found for API version v1beta, or is not supported for embedContent. Call ListModels to see the list of available models and their supported methods.

Apparently https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api covers Vertex AI only, and for the Gemini API only text-embedding-004 and embedding-001 are mentioned: https://ai.google.dev/gemini-api/docs/models/gemini#text-embedding

MrCsabaToth commented 1 week ago

I tested, and unfortunately text-embedding-004 doesn't seem to produce close vectors for an English sentence vs. its Hungarian equivalent the way text-multilingual-embedding-002 did. I filed an issue: https://github.com/google-gemini/generative-ai-dart/issues/209

MrCsabaToth commented 1 week ago

Looks like we can use the Firebase Vertex AI Flutter package (https://pub.dev/packages/firebase_vertexai/example) for the multilingual embedding: https://firebase.google.com/docs/vertex-ai/gemini-models#input-output-comparison.

If we can transition to Firebase we might be able to eliminate the Gemini API Key in favor of Firebase? (I still want to avoid login if possible though, so maybe we'll use a cloud function - we already have a pair for STT and TTS).

However the Flutter Firebase Vertex AI package indicates it doesn't support function calling and structured output for Gemini 1.5 Flash? https://firebase.google.com/docs/vertex-ai/gemini-models#capabilities-features-comparison

MrCsabaToth commented 1 week ago

It'd be good to try preview models. I tried latest before and that was a step back. We should also test experimental models: there's a fresh 0827 experimental version for 1.5 Pro and 1.5 Flash: https://ai.google.dev/gemini-api/docs/models/experimental-models#available-models

The question is whether the Google AI or the Firebase Vertex AI package supports those.

List of Vertex AI embedding models regardless of SDK: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versions#embeddings_stable_model_versions

MrCsabaToth commented 1 week ago

Firebase Vertex AI Flutter setup: https://firebase.google.com/docs/vertex-ai/get-started?platform=flutter

Code sample: https://firebase.google.com/docs/vertex-ai/locations?platform=flutter#code-samples

MrCsabaToth commented 1 week ago

Note that https://pub.dev/packages/googleai_dart is by LangChain Dart, but google_generative_ai has almost achieved feature parity now, so googleai_dart will be retired.

MrCsabaToth commented 1 week ago

Unfortunately so far Firebase Vertex AI 404s for embedContent: https://github.com/firebase/flutterfire/issues/13269

MrCsabaToth commented 1 week ago

Go could have similar issues: https://discuss.ai.google.dev/t/text-multilingual-embedding-002-is-not-found-for-api-version-v1beta/4721/3

MrCsabaToth commented 1 week ago

The workaround will be a cloud function. We'll have to establish another one anyway for reranking as well (#39).
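
As a rough idea of what such a function would forward, here is a minimal sketch of building the same predict call as the curl command earlier in this thread. It is only an illustration under assumptions: `build_predict_request` is a hypothetical helper, the access token is expected to come from the function's service account (not shown), and nothing is sent over the network here.

```python
import json
import urllib.request


def build_predict_request(project_id, location, model, texts, access_token,
                          task_type="RETRIEVAL_DOCUMENT"):
    """Build (but do not send) a Vertex AI text embedding predict request.

    Hypothetical helper: mirrors the curl call from this issue, with one
    instance per input text.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model}:predict"
    )
    body = {
        "instances": [{"task_type": task_type, "content": text} for text in texts]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

The cloud function would pass the returned request to `urllib.request.urlopen` and relay the vectors back to the app, which keeps the credentials on the server side and avoids a login in the client.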