GoogleCloudPlatform / firebase-extensions


Firestore vector search hangs after backfilling first 50 entries #523

Open TTrapper opened 3 months ago

TTrapper commented 3 months ago

[REQUIRED] Step 2: Describe your configuration

[REQUIRED] Step 3: Describe the problem

The extension installs and starts to backfill, but it gets stuck after 50 items and hangs there indefinitely. I've been back and forth with Google Cloud support for a couple of weeks and they sent me here. Here is a screenshot from the Firestore UI:

image

I've also commented on this seemingly related thread: https://github.com/GoogleCloudPlatform/firebase-extensions/issues/360#issuecomment-2170458164

I don't see an index created under Vertex AI vector search (I assume that's where it would go).

The logs for the function ext-firestore-vector-search-backfillTask don't show any error:

2024-05-25 13:02:08.707 ADT ext-firestore-vector-search-backfillTask bya48ul1zogw Handling 50 documents
2024-05-25 13:02:09.780 ADT ext-firestore-vector-search-backfillTask bya48ul1zogw Handling 50 documents
2024-05-25 13:02:18.423 ADT ext-firestore-vector-search-backfillTask bya48ul1zogw Task ext-firestore-vector-search-task-1 completed with 50 success(es)
2024-05-25 13:02:18.540 ADT ext-firestore-vector-search-backfillTask bya48ul1zogw Current state: 50 processed, 0 skipped, 0 failed out of 3233 total tasks
2024-05-25 13:02:18.540 ADT ext-firestore-vector-search-backfillTask bya48ul1zogw Enqueuing the next task ext-firestore-vector-search-task-0
2024-05-25 13:02:18.776 ADT ext-firestore-vector-search-backfillTask bya48ul1zogw Function execution took 10142 ms, finished with status code: 204
2024-05-25 13:02:22.416 ADT ext-firestore-vector-search-backfillTask Initializing extension with configuration
2024-05-25 13:02:22.455 ADT ext-firestore-vector-search-backfillTask 4ngpj6181pr6 Function execution started
2024-05-25 13:02:22.536 ADT ext-firestore-vector-search-backfillTask 4ngpj6181pr6 Handling task ext-firestore-vector-search-task-0
2024-05-25 13:02:22.536 ADT ext-firestore-vector-search-backfillTask 4ngpj6181pr6 No data to handle, skipping...
2024-05-25 13:02:22.560 ADT ext-firestore-vector-search-backfillTask 4ngpj6181pr6 Function execution took 104 ms, finished with status code: 204
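
For anyone comparing their own logs, here is a minimal Python sketch that pulls the counts out of a "Current state" line like the one above. The function name `parse_backfill_state` and the `remaining` field are my own, not part of the extension, and treating the "total tasks" figure as the same unit as the processed count is an assumption.

```python
import re

# Matches the progress line as it appears in the backfillTask logs above.
STATE_RE = re.compile(
    r"Current state: (\d+) processed, (\d+) skipped, (\d+) failed "
    r"out of (\d+) total tasks"
)


def parse_backfill_state(line):
    """Return the progress counts from a 'Current state' log line, or None."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    processed, skipped, failed, total = map(int, match.groups())
    return {
        "processed": processed,
        "skipped": skipped,
        "failed": failed,
        "total": total,
        # Assumption: all four counts are in the same unit.
        "remaining": total - processed - skipped - failed,
    }


line = (
    "ext-firestore-vector-search-backfillTask bya48ul1zogw Current state: "
    "50 processed, 0 skipped, 0 failed out of 3233 total tasks"
)
state = parse_backfill_state(line)
print(state["remaining"])
```

Under that assumption, the line from this report leaves 3183 items unaccounted for when the next task logs "No data to handle, skipping...".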

Steps to reproduce:

Expected result

Vectors are computed for all entries in the database

Actual result

Vectors are computed for the first 50 entries and then it hangs.

cabljac commented 3 months ago

Thanks! will look into this asap

mfrashad commented 3 months ago

I had the same issue: the backfill gets stuck at 50.

updateTrigger and updateTask work fine though, so my current workaround is to update the documents that the backfill should have covered, which triggers the update path for them.
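
A rough sketch of that workaround in Python, kept independent of any client library so it can be tried against plain dictionaries first. Everything here is an assumption on my part: the helper name `retrigger_missing`, the default vector field name `embedding` (use whatever output field the extension is configured with), and the `update` callback, which stands in for whatever write actually makes the update trigger fire in your setup.

```python
from typing import Any, Callable, Iterable, List, Mapping, Tuple


def retrigger_missing(
    docs: Iterable[Tuple[str, Mapping[str, Any]]],
    update: Callable[[str], None],
    vector_field: str = "embedding",  # assumed field name; adjust to your config
) -> List[str]:
    """Call update(doc_id) for every document that has no vector field yet.

    docs yields (doc_id, data) pairs; returns the IDs that were updated.
    """
    touched = []
    for doc_id, data in docs:
        if data.get(vector_field) is None:
            update(doc_id)
            touched.append(doc_id)
    return touched


# In-memory stand-in for a collection: only "a" has been embedded.
fake_docs = [
    ("a", {"text": "first", "embedding": [0.1, 0.2]}),
    ("b", {"text": "second"}),
    ("c", {"text": "third"}),
]
updated_ids = []
result = retrigger_missing(fake_docs, updated_ids.append)
print(result)
```

With the `google-cloud-firestore` client, `docs` could be built from `collection.stream()` as `(snapshot.id, snapshot.to_dict())` pairs, and `update` could wrap a `DocumentReference.update(...)` call. Which field change the extension's trigger reacts to is not stated in this thread, so check that before running it on a large collection.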

TTrapper commented 1 month ago

Any updates on this? I'm still getting the same behavior.

TTrapper commented 1 month ago

I'm posting an update because the behaviour looks a little different now:

1) The _firestore-vector-search index still shows RUNNING with only 50 docs processed, and the rest queued for BACKFILL.
2) When I run the ext-firestore-vector-search-queryCallable function, I get results from more than just the first 50 documents.
3) Many more documents now have a vector field (not just the first 50), but there are still many that don't have it. There doesn't seem to be an ordering to this: documents without the vector field are randomly dispersed among documents that do have it.