GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0
1.11k stars 931 forks

Mongo to BQ - advanced filters #1691

Closed stankiewicz closed 1 week ago

stankiewicz commented 2 weeks ago

The Mongo to BQ template has poor support for filters: filtering is done on the Dataflow side with the help of UDFs. Every document therefore has to be read from MongoDB first, which is expensive for the database, and then filtered in Dataflow, which is expensive as well.

BSON filters, by contrast, are pushed down to MongoDB, so only matching documents leave the database. This makes filtering easy, especially for daily ETLs.
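
To illustrate the daily ETL case, here is a minimal sketch of the kind of filter that could be pushed down to MongoDB. It builds a one-day range query and serializes it as relaxed MongoDB Extended JSON. The field name `updated_at` and the helper `daily_filter` are hypothetical, and how the template would accept such a filter is not specified here.

```python
import json
from datetime import date, datetime, time, timedelta, timezone


def daily_filter(day: date, field: str = "updated_at") -> dict:
    """Build a filter matching documents whose `field` falls within `day` (UTC).

    `field` is a hypothetical timestamp field; adjust it to the collection's schema.
    """
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = start + timedelta(days=1)

    def ext_date(dt: datetime) -> dict:
        # Relaxed Extended JSON representation of a BSON date.
        return {"$date": dt.strftime("%Y-%m-%dT%H:%M:%SZ")}

    # Half-open interval [start, end) so consecutive daily runs do not overlap.
    return {field: {"$gte": ext_date(start), "$lt": ext_date(end)}}


if __name__ == "__main__":
    # Serialize the filter so it can be passed around as a plain string.
    print(json.dumps(daily_filter(date(2024, 5, 1))))
```

With a filter like this evaluated by MongoDB, a daily run would only transfer that day's documents instead of scanning the whole collection and discarding most of it in a UDF.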