firebase / extensions

Source code for official Firebase extensions
https://firebase.google.com/products/extensions
Apache License 2.0

🐛 [Vector Search with Firestore] Error creating firestore Vector index. backfillTrigger() fail #2102

Closed karloti closed 1 month ago

karloti commented 1 month ago

The backfillTrigger() function fails if the text field specified as the embedding source is null. In my case I can't set the text field to an empty string or any other placeholder text, because that would create an incorrect embedding in the vector database. This text field is undergoing further processing, which may take several hours.

I would prefer that you handle fields that are null so that the function doesn't fail.

Currently, for this reason, the installation of this extension never completes.

I could change my field to blank to avoid the problem, but I think the extension should handle this case.

Thanks in advance, and thanks for the email support.
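
The null handling requested above could look roughly like the following. This is only a minimal sketch and not the extension's actual code: the `Doc` type, the `partitionByInputField` helper, and the `text` field name are assumptions made for illustration. It separates documents with usable text from those that should be skipped, so a null or missing field never raises an error.

```typescript
// Hypothetical shape of a document read during a backfill (not the extension's real type).
type Doc = { id: string; data: Record<string, unknown> };

// Split documents into those with usable text in `field` and those to skip.
function partitionByInputField(
  docs: Doc[],
  field: string
): { ready: Doc[]; skipped: Doc[] } {
  const ready: Doc[] = [];
  const skipped: Doc[] = [];
  for (const doc of docs) {
    const value = doc.data[field];
    // null, missing, non-string, and whitespace-only values are skipped, not thrown on.
    if (typeof value === "string" && value.trim().length > 0) {
      ready.push(doc);
    } else {
      skipped.push(doc);
    }
  }
  return { ready, skipped };
}

// Made-up sample data: one populated field, one null, one missing.
const sample: Doc[] = [
  { id: "a", data: { text: "hello world" } },
  { id: "b", data: { text: null } },
  { id: "c", data: {} },
];
const result = partitionByInputField(sample, "text");
console.log(result.ready.map((d) => d.id), result.skipped.map((d) => d.id));
```

With the sample data, only document `a` ends up in `ready`, while `b` and `c` land in `skipped` and could be embedded later once their text is populated.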

pr-Mais commented 1 month ago

Hello @karloti, can you open this issue in this repository https://github.com/GoogleCloudPlatform/firebase-extensions, as this extension is tracked there.

Moreover, could you provide your full configuration so we can attempt to reproduce the issue? Thanks!

karloti commented 1 month ago

I've already put in a lot of effort and I'm tired. I'm sorry, but I ran tests, wrote to support, and opened a ticket, and now you keep asking me to do things that I have already described many times. I don't think that's appropriate. In this case it is just one specific function that has a problem when the field value does not exist or is null. Please let someone do the support work instead of wasting our time. I spent 4 days on this and the error is trivial. In my case, this extension does not work and does not complete the installation successfully.

pr-Mais commented 1 month ago

@karloti I'm sorry to hear that. For all extensions, it's always better to start by opening an issue in the extension's repo, as this is the fastest way for us to be informed and fix it.

Could you please just share a screenshot of your configuration (with sensitive information redacted) so I can make sure my setup matches yours? I'm already investigating it.

pr-Mais commented 1 month ago

Hello @karloti, I couldn't reproduce this. I had a field in my collection set to null, and the extension completed processing all documents successfully. Here are my logs, which show the document being skipped because it's not valid. If you don't mind, could you provide more information (e.g. logs from the function named ext-firestore-vector-search-backfillTask)? Are my steps to reproduce the same as yours? (Screenshot 2024-05-23 at 2 11 30 PM)

pr-Mais commented 1 month ago

So the issue is not the field being null but the index creation failing, which is a step before the backfill process starts. Are there any logs above the one you attached that state why the index creation failed?

pr-Mais commented 1 month ago

So the issue might be that your extension's service account does not have sufficient permissions to create the index. However, the extension should have these permissions by default, so could you check the service account here and see if it has all the roles shown in the screenshot? (Screenshot 2024-05-23 at 2 48 25 PM)

cabljac commented 1 month ago

Hi @karloti, could you also provide your extension configuration params (redacting anything sensitive) just to ensure we aren't missing anything?

pr-Mais commented 1 month ago

The image isn't showing, can you upload it directly in a reply in the issue itself https://github.com/firebase/extensions/issues/2102.

pr-Mais commented 1 month ago

Sorry, I still see the image name only Screenshot 2024-05-23 at 2 59 56 PM

karloti commented 1 month ago

image image

cabljac commented 1 month ago

Are any of the resources deployed in a VPC by any chance? Trying to think where this permissions error is coming from

karloti commented 1 month ago

My account is managed by Google Workspace

karloti commented 1 month ago

image

pr-Mais commented 1 month ago

Every function in this extension logs the configuration values on every execution, so if you check the logs of any function you should see something like this (expand jsonPayload to see the config). Could you check whether projectId is set to your project ID? (Screenshot 2024-05-23 at 3 23 44 PM)

karloti commented 1 month ago

image

karloti commented 1 month ago

image

The embedding part works, but the extension has a problem with the vector database.

pr-Mais commented 1 month ago

The search is failing because there's no index, which failed to create due to the permission error. We will be investigating this further.

karloti commented 1 month ago

image

karloti commented 1 month ago

Problem solved.

I uninstalled the current extension and installed it against another collection in which every input text field used for embedding is populated. The difference between the two collections is that the first one, which was failing, had data that might have caused a serialization problem (timestamps), as well as Blob fields larger than 10k. The input field (text) in the first collection was not always initialized and contained NULL values. In the second collection no text field is NULL; otherwise the text fields are exactly the same as in the first collection.

image image image

I could not test the function properly in the cloud console, so I will try calling it from the Firebase Client SDK with a service account. The important thing is that the installation completed normally. I still don't know how well this feature will work with prefilters. I have arrays of strings in my collection that I want to filter on, which may be difficult to implement. I don't want the function to return results that are not pre-filtered.

cabljac commented 1 month ago

Hey, glad to hear it's working correctly with a fresh install. I'm still confused as I don't think the null/uninitialised values are what was causing the permissions error.

Did you try reinstalling the extension on the original collection? I'm thinking perhaps the roles for the extension service account weren't correctly applied

karloti commented 1 month ago

I am about to test the function in my program and see how to use it in RAG. If there is time, I will try it on the original collection, but I am limited by time. So far I have been using my own account through the console. I'll check with the service account. I might have missed something.

image

karloti commented 1 month ago

When serializing records from my collection, I get a warning that I'm not serializing the two new fields (embedding and status). What exactly is Vector<768>? FloatArray(768), or DoubleArray(768)?

Kotlin

pr-Mais commented 1 month ago

@karloti sorry, I'm not really familiar with Kotlin. Vector is a type defined by the Firestore clients, so I'm assuming the Kotlin client has it; check whether you can import it as the type for your field instead of using built-in types. However, I don't believe you need to serialize these 2 fields, since you won't need them in the front-end. You might as well skip them.
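
As a rough illustration of skipping those two fields, here is a minimal sketch in TypeScript rather than Kotlin. The field names `embedding` and `status` come from the warning mentioned above; the `stripExtensionFields` helper and the sample values are hypothetical and only show the idea of dropping the fields before mapping a document to an app model.

```typescript
// Field names taken from the warning in the thread; everything else here is illustrative.
const EXTENSION_FIELDS = ["embedding", "status"] as const;

// Return a shallow copy of the document data without the extension-managed fields.
function stripExtensionFields(
  data: Record<string, unknown>
): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...data };
  for (const key of EXTENSION_FIELDS) {
    delete copy[key];
  }
  return copy;
}

// Made-up document data; the embedding and status values are placeholders only.
const raw = {
  title: "doc",
  text: "hello",
  embedding: [0.1, 0.2],
  status: "placeholder",
};
const clean = stripExtensionFields(raw);
console.log(Object.keys(clean));
```

The original object is left untouched, so the stored document still keeps its embedding for search while the client-side model only sees `title` and `text`.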

karloti commented 1 month ago

image image

pr-Mais commented 1 month ago

@karloti can you confirm whether this issue is resolved for you? If you encounter any other problems, feel free to open a new issue here: https://github.com/GoogleCloudPlatform/firebase-extensions