firebase / extensions

Source code for official Firebase extensions
https://firebase.google.com/products/extensions
Apache License 2.0

🐛 [firestore-bigquery-export] backfill for 2.3M documents cost $400,000 #2021

Open williamkolean opened 3 months ago

williamkolean commented 3 months ago

[REQUIRED] Step 2: Describe your configuration

[REQUIRED] Step 3: Describe the problem

We originally ran the backfill at 200 docs per backfill; it finished quickly but did not include all of the documents. So we lowered it to 100 docs per backfill to match the maximum number of synced documents. This time the backfill was much slower than before, so we just let it run. Unfortunately, each iteration was a little slower than the previous one, until the tasks started timing out. Once that happened, resource use escalated rapidly: each failing task would retry 100 times while new tasks continued to be created. Because this escalated over the weekend, by the time we checked progress on Monday we had a $400,000 billing charge; a normal month is under $100.
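
The failure mode is worth spelling out: once tasks time out, uncapped retries multiply the work instead of draining it. As a sketch only (this is not the extension's code; the function name, limits, and cursor field are hypothetical), a Cloud Tasks worker written with firebase-functions v2 can bound both retries and concurrency, and carry an explicit cursor so a retry never re-reads earlier chunks:

```ts
import { onTaskDispatched } from "firebase-functions/v2/tasks";

// Hypothetical chunked-backfill worker. Capping retryConfig means a
// persistently failing chunk dies after a few attempts instead of
// compounding, and rateLimits bounds how many chunks run at once.
export const backfillChunk = onTaskDispatched(
  {
    retryConfig: { maxAttempts: 5, minBackoffSeconds: 60 },
    rateLimits: { maxConcurrentDispatches: 6 },
  },
  async (req) => {
    const { startAfterDocPath } = req.data as { startAfterDocPath?: string };
    // Process one bounded chunk starting at the cursor, then enqueue the
    // next chunk with its own cursor; a retry of this task re-reads only
    // this chunk, never the whole collection.
  }
);
```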

Steps to reproduce:

Try to import documents with settings similar to the above and time each iteration. Each iteration takes a little longer than the previous one.
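
We don't know the extension's internals from this thread, but one pattern that produces exactly this each-iteration-gets-slower symptom is offset-based pagination: Firestore's offset() skips documents but still reads (and bills) every one of them, so iteration N costs O(N) reads and the whole run is quadratic. For comparison, a sketch of the anti-pattern next to the cursor-based alternative (collection name hypothetical):

```ts
import { getFirestore, QueryDocumentSnapshot } from "firebase-admin/firestore";

const db = getFirestore();
const PAGE = 100;

// Anti-pattern: offset() still reads and bills every skipped document,
// so each successive page is slower and more expensive than the last.
async function pageWithOffset(iteration: number) {
  return db
    .collection("posts") // hypothetical collection
    .orderBy("__name__")
    .offset(iteration * PAGE)
    .limit(PAGE)
    .get();
}

// Cursor pagination: every page reads only its own PAGE documents, so
// iteration time stays flat no matter how deep into the backfill we are.
async function pageWithCursor(last: QueryDocumentSnapshot | null) {
  let q = db.collection("posts").orderBy("__name__").limit(PAGE);
  if (last) q = q.startAfter(last);
  const snap = await q.get();
  return { snap, cursor: snap.docs[snap.docs.length - 1] ?? null };
}
```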

Expected result

Redoing the import with fs-bq-import-collection (without wildcards, to be on the safe side), we were able to import 300 docs per second, and it finished in under 5 hours.

Actual result

By the time the extension was killed, there were 20k tasks in the queue, the Cloud Function logs were full of timeout errors, and the read count was 662,466,290,104.
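
For scale: 662,466,290,104 reads across 2.3M documents is roughly 288,000 reads per document. At Firestore's multi-region rate of roughly $0.06 per 100,000 document reads, the reads alone work out to about 6,624,663 × $0.06 ≈ $397,000, which accounts for essentially the entire bill.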

cabljac commented 3 months ago

Hi, we have raised this with the Firebase team and will get back to you ASAP.

filiocorp commented 1 month ago

@cabljac I know that you disabled the import feature, but the script has many issues; I keep getting errors and it does not import all the documents. I have run it multiple times and it only recorded 8,000 documents out of over 100,000:

{"severity":"WARNING","message":"Error when inserting data to table."}
An error has occurred on the following documents, please re-run or insert the following query documents manually...

If I run it with a batch size of one, I get another error:

{"severity":"WARNING","message":"Error when inserting data to table."}
An error has occurred on the following documents, please re-run or insert the following query documents manually... {}
{}
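
The empty {} in that report suggests the failed rows are being serialized without their contents, which makes manual re-insertion impossible. Assuming the script uses the @google-cloud/bigquery Node client (dataset and table names below are hypothetical, following the extension's _raw_changelog naming), per-row streaming-insert failures can be surfaced like this:

```ts
import { BigQuery } from "@google-cloud/bigquery";

const bq = new BigQuery();

async function insertAndReportRowErrors(rows: Record<string, unknown>[]) {
  try {
    await bq.dataset("firestore_export").table("posts_raw_changelog").insert(rows);
  } catch (err: any) {
    // The streaming-insert API reports per-row failures as a
    // PartialFailureError whose `errors` array pairs each rejected
    // row with the reasons BigQuery gave for rejecting it.
    if (err.name === "PartialFailureError") {
      for (const failure of err.errors ?? []) {
        console.warn("row failed:", JSON.stringify(failure.row), failure.errors);
      }
    } else {
      throw err;
    }
  }
}
```
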
filiocorp commented 1 month ago

Update: I turned off multithreading and now it works.