MorenoMdz opened 8 months ago
@MorenoMdz I tried reproducing but without luck. Can you check the logs of the fsexportbigquery function? Do you see any errors?
We do have a bunch of "Cannot partition an existing table firestore_export_jobs_raw_changelog" warnings, but that's very common with the FS BQ exporter; no errors, though.
As I mentioned, we have been heavy users of the exporter since 2021, but this was the first collection where the extension itself did the backfill; previously we always used the backfill script, so I would point towards an issue in the backfill itself. If you check the Firestore key visualizer, you will notice the reads per second climbed over the following hours in a pattern that looked recursive/exponential.
I think we have a clue why this might be happening. The backfilling function uses offset to enqueue the batches to import sequentially, which explains the burst in reads you had. Thanks for the details you provided; we will disable this feature until we come up with a solution.
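For context on why this gets so expensive: Firestore bills a query that uses offset() for the documents it skips as well as the ones it returns, so paging through a collection with an ever-growing offset re-reads every earlier document on each new batch. The sketch below is not the extension's actual code, just a minimal illustration with the Firebase Admin SDK; the collection path, batch size, and processBatch helper are hypothetical placeholders. It contrasts offset paging with cursor (startAfter) paging, which reads each document only once.

```ts
import { getFirestore, FieldPath, QueryDocumentSnapshot } from "firebase-admin/firestore";

const db = getFirestore();
const BATCH_SIZE = 200; // matches the "Docs per backfill" setting in this report

// Offset paging: Firestore bills for every skipped document, so importing N
// docs costs roughly N^2 / (2 * BATCH_SIZE) reads in total.
async function backfillWithOffset(collectionPath: string): Promise<void> {
  for (let offset = 0; ; offset += BATCH_SIZE) {
    const snap = await db
      .collection(collectionPath)
      .orderBy(FieldPath.documentId())
      .offset(offset) // all skipped docs are billed again on every batch
      .limit(BATCH_SIZE)
      .get();
    if (snap.empty) break;
    await processBatch(snap.docs);
  }
}

// Cursor paging: each document is read (and billed) exactly once, ~N reads total.
async function backfillWithCursor(collectionPath: string): Promise<void> {
  let cursor: QueryDocumentSnapshot | undefined;
  for (;;) {
    let query = db
      .collection(collectionPath)
      .orderBy(FieldPath.documentId())
      .limit(BATCH_SIZE);
    if (cursor) query = query.startAfter(cursor);
    const snap = await query.get();
    if (snap.empty) break;
    await processBatch(snap.docs);
    cursor = snap.docs[snap.docs.length - 1];
  }
}

// Hypothetical stand-in for whatever the import does with each batch
// (e.g. streaming the rows into BigQuery).
async function processBatch(docs: QueryDocumentSnapshot[]): Promise<void> {
  /* ... */
}
```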
I see! Thanks for the quick response.
I can also confirm I had the same issue on a fresh project: 33k total docs, 200 docs per backfill, and about 2.7M total reads once the backfill was done.
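Those numbers line up almost exactly with the quadratic cost of offset paging described above: each batch is billed for the documents it skips plus the ones it returns. A quick back-of-the-envelope check (plain TypeScript; the function name is just for illustration):

```ts
// Estimate billed reads for an offset()-paged backfill: every batch is billed
// for the documents it skips (the offset) plus the documents it returns.
function estimateOffsetReads(totalDocs: number, batchSize: number): number {
  let reads = 0;
  for (let offset = 0; offset < totalDocs; offset += batchSize) {
    const returned = Math.min(batchSize, totalDocs - offset);
    reads += offset + returned;
  }
  return reads;
}

console.log(estimateOffsetReads(33_000, 200)); // 2739000, close to the ~2.7M reads reported
```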
[REQUIRED] Step 2: Describe your configuration
[REQUIRED] Step 3: Describe the problem
We have been heavy users of this extension for over two years, and it has provided great value and near-flawless replication of our Firestore data to BigQuery. Then, on March 19th, we set up the exporter on our smaller collection, jobs, which at the time had under 800k documents. The exporter ended up running for almost two days, in a pattern that seemed recursive, at times reading over 9k documents per second even though the Docs per backfill setting was set to 200, and this caused a US$1,000 spike in billing for that one day.
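(For what it's worth, applying the same offset-paging estimate from the comments above to roughly 800k documents at 200 docs per backfill gives on the order of 800,000² / (2 × 200) ≈ 1.6 billion reads, which would be consistent with the multi-day runtime and a billing spike of this magnitude.)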
In the Firestore key visualizer it is easy to see the read spikes bubbling up and slowly fading until they stopped, more than a day later.
I do have a ticket open about this issue, but I do think this should be reported here.
Note: this was the first time we used the "backfill" option from the Firebase console; before that we always backfilled by hand with the good old script.
I cannot provide more information as it is sensitive, but the internal GCP ticket is 50277733.
Expected result
Under 50 cents of billing for this export, and the export should have taken a couple of minutes.
Actual result
Over one thousand dollars was charged, and the export took almost two days.