airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Source MongoDB: Support Multidocument transactions #11063

Open infina-calvin opened 2 years ago

infina-calvin commented 2 years ago

Tell us about the problem you're trying to solve

We currently use MongoDB as the production database for our app, but our reporting is in BigQuery. We ran into an issue with Fivetran last year, where Fivetran could not update any MongoDB data that was written in multi-document transactions. Because Fivetran (and any other tools we looked into) did not support multi-document transactions, we had to write our own script to move data from MongoDB to BigQuery. Is it possible for Airbyte to support multi-document transactions in its MongoDB connector?

marcosmarxm commented 2 years ago

Thanks for requesting this, @infina-calvin. Can you describe your use case and why multi-document transactions are important for you?

marcosmarxm commented 2 years ago

Today, the Airbyte MongoDB connector first saves data into a temporary collection, then tries to create the target collection and runs an insertMany operation, which does not use a MongoDB transaction. Adding multi-document transactions to Airbyte is probably possible, but you would get lower performance.

grishick commented 1 year ago

@infina-calvin @marcosmarxm Is this issue about writing to MongoDB destination or reading from MongoDB source?

infina-calvin commented 1 year ago

This is about reading from MongoDB source.

Replicating MongoDB multi-document transactions using Airbyte has not been an issue for our use case so far.

gjermundgaraba commented 1 year ago

Does this still not work? We had the same problem at Stitch, so we had to replicate all the data on every run, which is super expensive. It seems to be related to the way the oplog records transactions, or something along those lines.
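
One possible explanation, offered as a sketch rather than a confirmed diagnosis of Fivetran, Stitch, or Airbyte: MongoDB records a committed multi-document transaction in the oplog as a command entry (`op: "c"`) whose `o.applyOps` array holds the individual writes, so a reader that only filters top-level entries by collection namespace would miss them. The Python below works on plain dicts shaped like oplog entries. The helper names `flatten_oplog_entry` and `changes_for` and the sample data are hypothetical, and it ignores large transactions that span several oplog entries.

```python
from typing import Any, Dict, Iterable, Iterator, List


def flatten_oplog_entry(entry: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield the per-document operations contained in one oplog entry.

    Assumption: a transaction appears as a command entry (op == "c") whose
    "o" document carries an "applyOps" list of ordinary operations.
    """
    body = entry.get("o")
    if entry.get("op") == "c" and isinstance(body, dict) and "applyOps" in body:
        for inner in body["applyOps"]:
            # Recurse in case an inner operation is itself an applyOps command.
            yield from flatten_oplog_entry(inner)
    else:
        yield entry


def changes_for(entries: Iterable[Dict[str, Any]], namespace: str) -> List[Dict[str, Any]]:
    """Return all operations touching `namespace`, including transactional ones."""
    return [
        op
        for entry in entries
        for op in flatten_oplog_entry(entry)
        if op.get("ns") == namespace
    ]


# Hypothetical sample: one plain insert and one two-write transaction.
sample_oplog = [
    {"op": "i", "ns": "app.orders", "o": {"_id": 1}},
    {
        "op": "c",
        "ns": "admin.$cmd",
        "o": {
            "applyOps": [
                {"op": "i", "ns": "app.orders", "o": {"_id": 2}},
                {"op": "u", "ns": "app.users", "o": {"$set": {"n": 1}}, "o2": {"_id": 9}},
            ]
        },
    },
]

# A naive top-level filter only sees the non-transactional insert.
naive = [e for e in sample_oplog if e["ns"] == "app.orders"]
unpacked = changes_for(sample_oplog, "app.orders")
```

Here `naive` contains only the insert of `_id` 1, while `unpacked` also picks up the insert of `_id` 2 that was made inside the transaction.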