transferwise / pipelinewise-tap-mongodb

Singer.io Tap for MongoDB - PipelineWise compatible
GNU Affero General Public License v3.0

LOG_BASED import : no documents imported when previous state is empty #83

Open je-kr opened 2 years ago

je-kr commented 2 years ago

Hello,

When I run the following command:

tap-mongodb --config pipelinewise-tap-mongodb/config.json --catalog pipelinewise-tap-mongodb/catalog.json | target-postgres --config pipelinewise-target-postgres/config.json

I get the following output:

time=2022-10-19 14:11:35 name=tap_mongodb level=INFO message=No change streams after 1000, updating bookmark and exiting...
time=2022-10-19 14:11:35 name=tap_mongodb level=INFO message=Syncd 0 records for company-company

And the following state is generated:

{"bookmarks": {"company-company": {"last_replication_method": "LOG_BASED"}}}

No documents are imported.

The same happens when I run a FULL_TABLE import followed by a LOG_BASED import, despite the following message:

time=2022-10-19 14:16:31 name=tap_mongodb level=INFO message=Replication method changed from FULL_TABLE to LOG_BASED, will re-replicate entire collection company-company

I'm pretty much stuck at this point and out of ideas. I would appreciate any help or suggestions, thank you!

crowemi commented 2 years ago

@je-kr, I had the same issue and created a LOG_BASED flag that triggers a full load when no state is found. I opened PR #84 based on this.
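
A minimal sketch of the idea behind that workaround, not the code in the PR: treat a stream whose bookmark holds no resume point as never synced, and run a full load first. The function name and the `token` key are assumptions made for illustration; only the state shape is taken from the issue above.

```python
def needs_initial_full_load(state, stream_id, token_key="token"):
    """Return True when the saved state has no resume point for the stream.

    `token_key` is a hypothetical name for where a resume token would be stored.
    """
    bookmark = state.get("bookmarks", {}).get(stream_id, {})
    return not bookmark.get(token_key)


# State reported in the issue: a bookmark exists but holds no resume point.
state = {"bookmarks": {"company-company": {"last_replication_method": "LOG_BASED"}}}
first_run = needs_initial_full_load(state, "company-company")
```

With a check like this, a caller could run a full load once when `first_run` is true and then continue with log-based processing on later runs.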

tharwan commented 2 years ago

Do I understand correctly that this bug means I cannot set up any new log-based replication with MongoDB?

crowemi commented 2 years ago

No, you can set up LOG_BASED replication against MongoDB, and the tap will pull data from the oplog starting from the point at which you initiated processing.

The issue is that there is currently no way to extract the data that existed before the first LOG_BASED run and then continue with LOG_BASED processing on later runs.
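
To illustrate the gap, here is a toy model in plain Python. It does not involve MongoDB, and all names and data are made up. A log-based reader that starts at a given position only sees changes recorded after it, so documents written earlier never appear.

```python
def read_log(events, start_position):
    """Return the change events recorded strictly after start_position."""
    return [e for e in events if e["position"] > start_position]


# Hypothetical change log: three inserts at increasing log positions.
events = [
    {"position": 1, "op": "insert", "doc_id": "a"},
    {"position": 2, "op": "insert", "doc_id": "b"},
    {"position": 3, "op": "insert", "doc_id": "c"},
]

# Log-based processing is first started once the log has reached position 2.
seen = read_log(events, start_position=2)
missed = [e["doc_id"] for e in events if e not in seen]
```

In this model, documents `a` and `b` end up in `missed`: they could only be picked up by a separate full load.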

tharwan commented 2 years ago

Thanks for clearing that up!