CsabaConsulting / InspectorGadgetApp

Open Multi-Modal Personal Assistant
MIT License
3 stars, 1 fork

Switch from google_generative_ai package to firebase_vertexai (and from BYO API key to BYO Firebase project) #53

Open MrCsabaToth opened 3 days ago

MrCsabaToth commented 3 days ago

We experimented with https://pub.dev/packages/firebase_vertexai/ for multilingual embedding (#48). That did not come to fruition (https://github.com/google-gemini/generative-ai-dart/issues/209 and https://github.com/firebase/flutterfire/issues/13269). However, it recently turned out that https://github.com/google-gemini/generative-ai-dart/ does not support file upload (https://github.com/google-gemini/generative-ai-dart/issues/211 and https://github.com/google-gemini/generative-ai-dart/issues/70). This is crucial because the audio and video modalities (and possibly PDF and others, everything except images) need file upload instead of inline data (#38 and https://discuss.ai.google.dev/t/gemini-1-5-refuses-to-process-audio-files/39713/5?u=tocsa).

Firebase has offered file upload, unrelated to AI, for a long time now, so we'll take the leap of faith and convert over. Anyone who wants to kickstart and replicate this project needs to set up two cloud functions anyway (for Chirp / STT and TTS), so they have to deal with more than just an AI Studio (formerly MakerSuite) API key. With multilingual embedding and reranking we'll have two more cloud functions, and setting all of this up is simply easier on Firebase than in the "big boy" Vertex AI (where someone needs to set up service accounts, roles, and the whole nine yards).

MrCsabaToth commented 2 days ago

This is the way! https://github.com/google-gemini/generative-ai-dart/issues/70#issuecomment-2364748909

MrCsabaToth commented 1 day ago

Note that this goes against the direction of off-device working, but Firestore now supports storing vector embeddings and vector search: https://cloud.google.com/firestore/docs/vector-search

Also note that we perform dimensionality reduction by folding (instead of truncation), which currently yields non-normalized vectors. This means that the dot product (potentially the most cost-effective distance measure) is no longer a valid distance: https://cloud.google.com/firestore/docs/vector-search#choose-distance-measure So maybe we should normalize after the folding?
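A minimal Python sketch of normalizing after folding. It assumes that "folding" means splitting the embedding into equal-sized chunks of the target dimension and summing them element-wise; the app's actual folding may differ, and the function names `fold`, `l2_normalize`, and `dot` are illustrative only.

```python
import math


def fold(vector, target_dim):
    """Reduce dimensionality by summing equal-sized chunks element-wise.

    Assumption: this is what "folding" means here; the app's variant may differ.
    """
    if target_dim <= 0 or len(vector) % target_dim != 0:
        raise ValueError("vector length must be a positive multiple of target_dim")
    folded = [0.0] * target_dim
    for index, value in enumerate(vector):
        # Element j of every chunk is accumulated into position j.
        folded[index % target_dim] += value
    return folded


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(value * value for value in vector))
    if norm == 0.0:
        return list(vector)
    return [value / norm for value in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Once both the stored and the query vectors have unit length, ranking by dot product gives the same order as ranking by cosine similarity, which is why normalizing after the fold would make the dot product usable again.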

MrCsabaToth commented 10 hours ago

TODO: enforce App Check on functions, convert them? https://firebase.google.com/docs/app-check/cloud-functions?hl=en

MrCsabaToth commented 10 hours ago

Dealing with two errors right now:

  1. "The caller does not have permission" server-side failure when trying to handle modalities persisted in Firebase Storage
  2. "Please ensure that function call turn comes immediately after a user turn or after a function response turn." when trying function calls

So far this is many steps back compared to https://pub.dev/packages/google_generative_ai: many features are lost!
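The ordering rule quoted in the second error can be checked on a chat history before sending it. The following is a minimal Python sketch under a simplifying assumption: each turn is reduced to one of four hypothetical labels, whereas the real history consists of content objects with parts. It only illustrates the rule as worded in the error message, not the SDK's own validation.

```python
# Hypothetical turn labels; real histories hold richer content objects.
USER = "user"
MODEL_TEXT = "model_text"
FUNCTION_CALL = "function_call"
FUNCTION_RESPONSE = "function_response"


def first_misplaced_function_call(turn_kinds):
    """Return the index of the first function call turn that does not come
    immediately after a user turn or a function response turn, else None."""
    allowed_predecessors = {USER, FUNCTION_RESPONSE}
    for index, kind in enumerate(turn_kinds):
        if kind != FUNCTION_CALL:
            continue
        if index == 0 or turn_kinds[index - 1] not in allowed_predecessors:
            return index
    return None
```

For example, a history of user, model text, function call would be flagged at index 2, because the function call follows a model text turn.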

MrCsabaToth commented 9 hours ago

Not seeing what we are missing regarding Firebase Storage permissions: https://firebase.google.com/docs/vertex-ai/solutions/cloud-storage?platform=flutter. Side note: https://stackoverflow.com/questions/77758177/how-can-i-send-files-to-googles-gemini-models-via-api-call

MrCsabaToth commented 9 hours ago

After temporarily granting all read access I got a different error: "Service agents are being provisioned (https://cloud.google.com/vertex-ai/docs/general/access-control#service-agents). Service agents are needed to read the Cloud Storage file provided. So please try again in a few minutes."

MrCsabaToth commented 9 hours ago

I also ran into "Unable to submit request because function parameters schema should be of type OBJECT. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling", but I refactored to eliminate the two local tools, Location and HRM, which were the only ones not having SchemaType.Object: https://github.com/google-gemini/generative-ai-dart/issues/194#issuecomment-2366882023
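An alternative to removing such tools would be to wrap their parameter schema in an OBJECT. The Python sketch below is not what was done here; it assumes the plain-dictionary JSON shape of a function declaration's parameters (`type`, `properties`, `required`) rather than the Dart `Schema` classes, and `wrap_in_object` is a hypothetical helper name.

```python
def wrap_in_object(property_name, schema):
    """Wrap a non-OBJECT parameter schema in an OBJECT with one required property.

    Assumption: schemas are plain dicts in the JSON shape of a function
    declaration; a schema that is already an OBJECT is returned unchanged.
    """
    if schema.get("type") == "OBJECT":
        return schema
    return {
        "type": "OBJECT",
        "properties": {property_name: schema},
        "required": [property_name],
    }
```

A tool that previously took a bare string would then take an object with a single string property, which satisfies the "should be of type OBJECT" requirement.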

MrCsabaToth commented 8 hours ago

Even after I manually grant Storage Object Viewer rights to the AI Platform / Vertex AI service agent, I still get "The caller does not have permission":

Go to the GCP Storage page related to the Firebase Storage: https://console.cloud.google.com/storage/browser/{PROJECT_NAME}.appspot.com

  1. Go to the Permissions tab
  2. Under the View Principals tab click Grant Access
  3. Under Add principals, in the New principals field, type service-{PROJECT_NUMBER}@gcp-sa-aiplatform.iam.gserviceaccount.com
  4. The principal will be singled out and found; click on the found principal
  5. In the role field, type "aiplatform.serviceAgent" and click on the found role
  6. Click Save
  7. Add Storage Object Viewer Role to that service account.

For now I have granted read access to the public as a workaround.
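The Storage Object Viewer grant from the console steps above could also be scripted. This is a minimal Python sketch that only assembles the `gcloud storage buckets add-iam-policy-binding` command line for the bucket and service agent named in the steps; `viewer_binding_command` is a hypothetical helper, the project values in any call are placeholders, and since the manual grant did not resolve the error here, the command is not expected to fix it either.

```python
def viewer_binding_command(project_name, project_number):
    """Build the gcloud argument list that grants Storage Object Viewer on the
    default Firebase Storage bucket to the Vertex AI service agent."""
    bucket = f"gs://{project_name}.appspot.com"
    member = (
        "serviceAccount:"
        f"service-{project_number}@gcp-sa-aiplatform.iam.gserviceaccount.com"
    )
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding", bucket,
        f"--member={member}",
        "--role=roles/storage.objectViewer",
    ]
```

The returned list can be passed to `subprocess.run` once the gcloud CLI is installed and authenticated for the project.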