firebase / extensions

Source code for official Firebase extensions
https://firebase.google.com/products/extensions
Apache License 2.0

πŸ› [firestore-bigquery-export] Error initializing raw change log table #2134

Open boywijnmaalen opened 1 month ago

boywijnmaalen commented 1 month ago

Describe your configuration

Describe the problem

In the last 30 days I found 14 occurrences of this error:

Unhandled error Error: Error initializing raw change log table: Field data has type JSON, which is not supported for clustering.


I found at least one missing update for a Firestore document in BQ (and can't rule out more).

Workaround: manually editing the Firestore doc in question triggered a sync to BQ (as expected).

Stack trace:

Unhandled error Error: Error initializing raw change log table: Field data has type JSON, which is not supported for clustering.
    at FirestoreBigQueryEventHistoryTracker.initialize (/workspace/node_modules/@firebaseextensions/firestore-bigquery-change-tracker/lib/bigquery/index.js:192:23)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /workspace/lib/index.js:134:5
    at async /workspace/node_modules/firebase-functions/lib/common/providers/tasks.js:74:17

Steps to reproduce:

unsure how to reproduce as I do not fully understand the problem

Expected result

to have every update for relevant Firestore docs synced to BQ

Actual result

missing update(s) for Firestore docs in BQ

cabljac commented 1 month ago

This is strange to me: as far as I know we don't use the JSON data type, and data is saved as a raw string.

Can you provide any more info on your clustering configuration?

boywijnmaalen commented 1 month ago

Hi @cabljac,

thanks for your quick reply;

schema: image

(part of) table info; image

cabljac commented 1 month ago

Was this schema solely generated by the extension?

boywijnmaalen commented 1 month ago

I don't really remember (the use of this extension has been in place for over 3 years)

tbh, does it matter? Clustering by JSON-typed fields is not possible in BQ at all. Our two JSON fields (data, old_data) are therefore not part of the defined clustering fields.

The error does not make sense to me. It seems that somehow, internally, the data/old_data fields are used in a way that is incompatible with the JSON type.

What also strikes me as odd is that the errors seem to appear randomly. If this really were related to JSON-typed fields, I'd expect the error to occur for every firestore>bigquery event, but that is not the case.
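To make the claim above checkable, here is a minimal sketch that takes a table's metadata and reports any clustering field whose schema type is JSON. It assumes the metadata follows the BigQuery table resource shape (`schema.fields[].name`, `schema.fields[].type`, `clustering.fields[]`). The helper name `findUnclusterableFields` and the example schema and clustering fields are hypothetical; the real values for this table are only in the screenshots above. It is not how the extension itself performs the check.

```typescript
// Minimal shape of the parts of a BigQuery table resource used here.
interface TableField {
  name: string;
  type: string;
}

interface TableMetadata {
  schema?: { fields?: TableField[] };
  clustering?: { fields?: string[] };
}

// Per the error in this issue, BigQuery rejects clustering on JSON columns.
// Only JSON is listed; other unsupported types are deliberately left out.
const UNCLUSTERABLE_TYPES = new Set(["JSON"]);

// Hypothetical helper: return the clustering columns whose schema type is JSON.
function findUnclusterableFields(metadata: TableMetadata): string[] {
  const typeByName = new Map<string, string>();
  for (const field of metadata.schema?.fields ?? []) {
    // Column names are compared case-insensitively.
    typeByName.set(field.name.toLowerCase(), field.type.toUpperCase());
  }
  return (metadata.clustering?.fields ?? []).filter((name) => {
    const type = typeByName.get(name.toLowerCase());
    return type !== undefined && UNCLUSTERABLE_TYPES.has(type);
  });
}

// Assumed example: JSON data/old_data columns, clustering on other columns.
const example: TableMetadata = {
  schema: {
    fields: [
      { name: "timestamp", type: "TIMESTAMP" },
      { name: "document_name", type: "STRING" },
      { name: "data", type: "JSON" },
      { name: "old_data", type: "JSON" },
    ],
  },
  clustering: { fields: ["document_name", "timestamp"] },
};

// Prints an empty array: no JSON column is among the clustering fields.
console.log(findUnclusterableFields(example));
```

The metadata object could be taken from the output of `bq show --format=prettyjson` or from `table.getMetadata()` in the Node.js BigQuery client. An empty result for the real table would support the point that the existing clustering definition is not the source of the error.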

boywijnmaalen commented 1 month ago

Could this error be related to the fact that we updated the extension to version 0.1.51?

I'm unsure when we updated the extension, but it must have been recently, as this version was released on June 19th.

cabljac commented 1 month ago

I'll have a look back at the release commits and see if anything could have caused this to start happening. Thanks for providing all this info by the way!

I'll see if we can get this issue prioritised

boywijnmaalen commented 1 week ago

@cabljac

wondering if you have an update on this?

boywijnmaalen commented 5 hours ago

@cabljac

a kind reminder 🙏🏾

do you have an update on a potential fix?