firebase / extensions

Source code for official Firebase extensions
https://firebase.google.com/products/extensions
Apache License 2.0

🐛 [Stream Firestore to BigQuery 0.1.48] Export no longer works #2048

Closed GD-MarcoBug closed 2 months ago

GD-MarcoBug commented 2 months ago

Hello everyone, the Firestore export to BigQuery no longer works for me since version 0.1.48. No more data is transferred. When I set up a new customer, the dataset Firestore_Export is still created, but no export tables are created. With the previous versions, everything worked without any problems.

The basic configuration settings are the same. Is this a known problem?

Best regards Marco

pr-Mais commented 2 months ago

> When I set up a new customer, the dataset Firestore_Export is still created, but no export tables are created. With the previous versions, everything worked without any problems.

Could you please elaborate more? Do you mean when you install a new instance of the extension, the export table isn't created hence no changes are recorded?

j05u3 commented 2 months ago

Same here: no export tables are created, and for me it says "Processing" for hours.

pr-Mais commented 2 months ago

@j05u3 thanks for sharing your issue, we're looking into it.

olegdater commented 2 months ago

Exactly the same here. The dataset is created, but the tables inside it are not, and then I get errors like this in the logs when the extension tries to insert data:

Unhandled error ApiError: Not found: Table project_id:firestore_export.messages_raw_changelog at new ApiError (
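For anyone triaging similar logs, here is a minimal sketch that pulls the missing table out of an error line of the shape quoted above. It assumes the `project:dataset.table` layout seen in that message; the `TableRef` type and `missing_table` helper are hypothetical names for illustration, not part of the extension.

```python
import re
from typing import NamedTuple, Optional


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str


# Matches the "Not found: Table <project>:<dataset>.<table>" shape quoted above.
# Assumption: the reference contains no whitespace and uses ':' and '.' as separators.
_NOT_FOUND = re.compile(
    r"Not found: Table (?P<project>[^:\s]+):(?P<dataset>[^.\s]+)\.(?P<table>\S+)"
)


def missing_table(log_line: str) -> Optional[TableRef]:
    """Return the table named in a 'Not found: Table' error, or None if absent."""
    match = _NOT_FOUND.search(log_line)
    if match is None:
        return None
    return TableRef(match["project"], match["dataset"], match["table"])


line = (
    "Unhandled error ApiError: Not found: Table "
    "project_id:firestore_export.messages_raw_changelog at new ApiError ("
)
ref = missing_table(line)
print(ref)
```

Grouping log lines by the returned dataset and table makes it easier to see whether only the changelog table is missing or whether the dataset is missing too.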

jauntybrain commented 2 months ago

Hi @j05u3 and @GD-MarcoBug, thank you for reporting this issue. Could you please share the configuration settings of your installed Stream Firestore to BigQuery extensions? Feel free to redact any sensitive information.

pr-Mais commented 2 months ago

Also wondering if you installed a new instance of the extension or updated from a previous version?

GD-MarcoBug commented 2 months ago

Hi @pr-Mais and @jauntybrain, first of all, many thanks for the quick support. The problem occurs both with a new instance and when updating to the latest version. The following is a basic configuration, for new instances, it gets stuck in processing (last screenshot):

[Four screenshots, dated 2024-04-16: the extension configuration; the last one shows the instance stuck in processing]
GD-MarcoBug commented 2 months ago

For your information: the problem also occurs with 0.1.44, and has done so since 2024-04-12.

joeframpton commented 2 months ago

I found that when I tried to stream from Firestore in project A to BigQuery in project B, I had to manually grant the function's service account permissions in project B. The extension was then able to add the dataset and table in BigQuery. However, the extension stayed stuck on processing until I manually added the required routines in BigQuery too.

olegdater commented 2 months ago

@joeframpton what are "the required routines in BigQuery", and how do I add them? Eager to make it work again 😀

cabljac commented 2 months ago

@GD-MarcoBug do you know if the extension works on versions before 0.1.44?

GD-MarcoBug commented 2 months ago

The customers I had updated to 0.1.44 are working. Customers for which I newly installed the extension with 0.1.44 do not work. A customer update to 0.1.45 also ran smoothly. An update to 0.1.48 cut the connection and is stuck in processing. A new customer with 0.1.48 is also stuck in processing. I have no customers with a version older than 0.1.44.

The service account is still created correctly. The dataset in BigQuery is also created. The creation of the tables then does not work.

You should be able to reproduce the error by creating a new Firestore project, loading data into it, and exporting it with the latest extension version. I can also imagine that the extension is stuck because Firestore is constantly receiving new data and the extension may not be able to process it fast enough. However, this worked before.

cabljac commented 2 months ago

Investigating this now, testing a fix. Thanks for your patience!

krishnamshah commented 2 months ago

Same issue here. Creation of the tables does not work.

joeframpton commented 2 months ago

@olegdater

> @joeframpton what are those "the required routines in BigQuery" and how to add them? eager to make it work again 😀

Hi, so the workaround I managed to implement (which auto-installed the routines for me; alternatively, you can install them by following this: https://cloud.google.com/bigquery/docs/samples/bigquery-create-routine) was the following:

  1. Ran the fs-bq-import-collection script across projects (although it failed too - Issue raised here). This created the dataset and table in BigQuery in the second project
  2. Imported a schema into BigQuery in the second project using fs-bq-schema-views
  3. Installed this extension to set up the stream across projects, then manually granted the extension's service account BigQuery Data Editor & Cloud Datastore User in the second project
  4. Ran the fs-bq-import-collection script to import the data into BigQuery in the original project
  5. Gave the service account from the second project permissions in the first project
  6. In BigQuery in the second project, accessed the data from the original project's BigQuery and merged it into the data being streamed from step 3

That's what worked for me! There might be superfluous steps in there but I hope this is helpful
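The permission part of step 3 can be sketched as a simple check, assuming the two roles named there correspond to the IAM role IDs `roles/bigquery.dataEditor` and `roles/datastore.user`. The `bindings` mapping and the service account name below are hypothetical stand-ins for whatever the destination project's IAM policy actually contains; this does not call any Google Cloud API.

```python
from typing import Dict, List, Set

# Role IDs for the two roles named in step 3
# (BigQuery Data Editor and Cloud Datastore User).
REQUIRED_ROLES: Set[str] = {"roles/bigquery.dataEditor", "roles/datastore.user"}


def missing_roles(bindings: Dict[str, Set[str]], member: str) -> List[str]:
    """Return the required roles that `member` does not hold yet.

    `bindings` maps a role ID to the set of members granted that role,
    mirroring the role/members shape of an IAM policy.
    """
    granted = {role for role, members in bindings.items() if member in members}
    return sorted(REQUIRED_ROLES - granted)


# Hypothetical service account and policy for the second (destination) project.
sa = "serviceAccount:ext-example@project-a.iam.gserviceaccount.com"
policy = {
    "roles/bigquery.dataEditor": {sa},
    "roles/viewer": {"user:someone@example.com"},
}
todo = missing_roles(policy, sa)
print(todo)
```

An empty result would mean both roles from step 3 are already granted in the destination project.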

Matan commented 2 months ago

👋 Eagerly waiting for the fix, thanks for working on it! In the meantime, is it possible to run the import script multiple times to pick up updated documents? I'm running it, but it skips over existing documents in my collections that have since been updated. I'm effectively looking for an "upsert" mechanism to hold me over until the extension is fixed.

I've also removed the extension and reinstalled the last known good extension version (0.1.42), but no updates are making it through to BigQuery.

Thanks, any help will be appreciated.
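To make the "upsert" idea concrete, here is a minimal in-memory sketch of what it would mean over append-only change rows: keep every row, and treat the newest row per document as its current state. The `Change` type and its field names are hypothetical, and this says nothing about how the import script itself behaves.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Change:
    document_name: str
    timestamp: str  # ISO 8601 UTC string, so string order matches time order
    data: dict


def latest_per_document(rows: Iterable[Change]) -> Dict[str, Change]:
    """Collapse append-only change rows to the newest row per document."""
    latest: Dict[str, Change] = {}
    for row in rows:
        current = latest.get(row.document_name)
        # A later (or equal) timestamp replaces the stored row: an upsert.
        if current is None or row.timestamp >= current.timestamp:
            latest[row.document_name] = row
    return latest


rows = [
    Change("users/a", "2024-04-16T09:00:00Z", {"name": "Ann"}),
    Change("users/b", "2024-04-16T09:05:00Z", {"name": "Bob"}),
    Change("users/a", "2024-04-17T10:00:00Z", {"name": "Ann B."}),
]
state = latest_per_document(rows)
print(state["users/a"].data)
```

With this model, re-importing an updated document only needs to append a row with a newer timestamp; nothing has to be updated in place.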

krishnamshah commented 2 months ago

Thank you for the prompt resolution. When will the fix be deployed to production?

cabljac commented 2 months ago

Today hopefully!

olegdater commented 2 months ago

firebase/firestore-bigquery-export@0.1.49

I still see the same problem

Runtime Status: Processing Configuring BigQuery Sync.

The table is not created, and there are errors in the console: Unhandled error ApiError: Not found: Table

cabljac commented 2 months ago

Strange, it had fixed it for me. I will investigate further.

cabljac commented 2 months ago

I believe it could be an issue with the onConfigure logic but I won't get a chance to look at it and release until Monday, most likely. If it's urgent you could try installing a fresh instance of the extension as a workaround.

cabljac commented 2 months ago

Actually, it may be intended behaviour not to create a table in onConfigure. You might need to install a fresh instance to create the table.

olegdater commented 2 months ago

I was creating a new installation of the extension at 0.1.49, not just refreshing an already existing one.

Or what do you mean by "install a fresh instance to create the table"?

cabljac commented 2 months ago

In that case I'll try and reproduce it again tomorrow and see what's going wrong. Must be something else up with it, as the RC worked yesterday. Thanks for your patience

cabljac commented 2 months ago

@olegdater I don't believe you can have multiple Firestore fields in a comma-separated list for the "Firestore Document field name for BigQuery SQL Time Partitioning field option (Optional)" setting.

Our regex validation will need to be changed to reflect this; right now that field has no validation. I will open a new issue to track this.

If you're still facing this issue with a valid config, feel free to open a new issue to track :)
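As a rough illustration of the kind of check discussed above, the sketch below accepts an empty value (the setting is optional) or a single field name, and rejects comma-separated lists. The pattern is an assumption made for this example and is not the extension's actual validation rule.

```python
import re

# Hypothetical pattern: one field name made of letters, digits, and underscores,
# not starting with a digit. Not the extension's actual validation.
SINGLE_FIELD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_partition_field(value: str) -> bool:
    """Accept an empty value (optional setting) or exactly one field name."""
    return value == "" or SINGLE_FIELD.fullmatch(value) is not None


for candidate in ["created_at", "", "created_at,updated_at", "created_at, updated_at"]:
    print(repr(candidate), is_valid_partition_field(candidate))
```

Using `fullmatch` rather than `match` is the important design choice here: it makes the comma fail the check, instead of letting the first field name match and the rest of the string slip through.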

olegdater commented 2 months ago

thanks @cabljac ! confirmed, it now works 🎉