MeltanoLabs / tap-mongodb

Apache License 2.0
6 stars 7 forks source link

LOG_BASED replication throws exception #18

Closed mw243270 closed 1 year ago

mw243270 commented 1 year ago

After initial run, then deleting a document from "users" collection, then rerunning an exception is thrown.

2023-08-08T13:25:11.564532Z [info     ] 2023-08-08 08:25:11,563 | INFO     | tap-mongodb          | Beginning log_based sync of 'comments'... cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:25:11.564898Z [info     ] 2023-08-08 08:25:11,564 | INFO     | tap-mongodb          | Tap has custom mapper. Using 1 provided map(s). cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:26:43.311483Z [info     ] 2023-08-08 08:26:43,310 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 1, "tags": {"stream": "comments", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:26:43.433359Z [info     ] 2023-08-08 08:26:43,432 | INFO     | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "sync_duration", "value": 91.86790776252747, "tags": {"stream": "comments", "context": {}, "status": "succeeded"}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:26:43.433813Z [info     ] 2023-08-08 08:26:43,433 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"stream": "comments", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:26:43.435067Z [info     ] 2023-08-08 08:26:43,433 | INFO     | tap-mongodb          | Beginning log_based sync of 'movies'... cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:26:43.435297Z [info     ] 2023-08-08 08:26:43,433 | INFO     | tap-mongodb          | Tap has custom mapper. Using 1 provided map(s). cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:27:50.133407Z [info     ] 2023-08-08 08:27:50,133 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 1, "tags": {"stream": "movies", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:27:50.179470Z [info     ] 2023-08-08 08:27:50,179 | INFO     | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "sync_duration", "value": 66.74488925933838, "tags": {"stream": "movies", "context": {}, "status": "succeeded"}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:27:50.179997Z [info     ] 2023-08-08 08:27:50,179 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"stream": "movies", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:27:50.180451Z [info     ] 2023-08-08 08:27:50,179 | INFO     | tap-mongodb          | Beginning log_based sync of 'sessions'... cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:27:50.180711Z [info     ] 2023-08-08 08:27:50,179 | INFO     | tap-mongodb          | Tap has custom mapper. Using 1 provided map(s). cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:28:55.350929Z [info     ] 2023-08-08 08:28:55,350 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 1, "tags": {"stream": "sessions", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:28:55.397140Z [info     ] 2023-08-08 08:28:55,396 | INFO     | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "sync_duration", "value": 65.21628999710083, "tags": {"stream": "sessions", "context": {}, "status": "succeeded"}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:28:55.397491Z [info     ] 2023-08-08 08:28:55,396 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"stream": "sessions", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:28:55.398473Z [info     ] 2023-08-08 08:28:55,397 | INFO     | tap-mongodb          | Beginning log_based sync of 'theaters'... cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:28:55.398692Z [info     ] 2023-08-08 08:28:55,397 | INFO     | tap-mongodb          | Tap has custom mapper. Using 1 provided map(s). cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:30:01.116888Z [info     ] 2023-08-08 08:30:01,116 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 1, "tags": {"stream": "theaters", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:30:01.165538Z [info     ] 2023-08-08 08:30:01,165 | INFO     | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "sync_duration", "value": 65.76761555671692, "tags": {"stream": "theaters", "context": {}, "status": "succeeded"}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:30:01.165863Z [info     ] 2023-08-08 08:30:01,165 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"stream": "theaters", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:30:01.166208Z [info     ] 2023-08-08 08:30:01,165 | INFO     | tap-mongodb          | Beginning log_based sync of 'users'... cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:30:01.166372Z [info     ] 2023-08-08 08:30:01,165 | INFO     | tap-mongodb          | Tap has custom mapper. Using 1 provided map(s). cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.107663Z [info     ] 2023-08-08 08:31:06,106 | INFO     | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "sync_duration", "value": 64.94067430496216, "tags": {"stream": "users", "context": {}, "status": "failed"}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.108346Z [info     ] 2023-08-08 08:31:06,107 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"stream": "users", "context": {}}} cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.108767Z [info     ] 2023-08-08 08:31:06,107 | ERROR    | tap-mongodb          | An unhandled error occurred while syncing 'users' cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.109099Z [info     ] Traceback (most recent call last): cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.109391Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1175, in sync cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.109691Z [info     ]     for _ in self._sync_records(context=context): cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.110014Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1070, in _sync_records cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.110257Z [info     ]     for record_result in self.get_records(current_context): cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.110584Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/tap_mongodb/streams.py", line 298, in get_records cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.111854Z [info     ]     document = record["fullDocument"] cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.112103Z [info     ] KeyError: 'fullDocument'       cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.112373Z [info     ] Traceback (most recent call last): cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.112622Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/bin/tap-mongodb", line 8, in <module> cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.112934Z [info     ]     sys.exit(TapMongoDB.cli()) cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.113232Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__ cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.113663Z [info     ]     return self.main(*args, **kwargs) cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.114073Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.114330Z [info     ]     rv = self.invoke(ctx)      cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.114638Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.114888Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.115139Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.115434Z [info     ]     return __callback(*args, **kwargs) cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.115687Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/singer_sdk/tap_base.py", line 498, in invoke cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.115937Z [info     ]     tap.sync_all()             cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.116188Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/singer_sdk/tap_base.py", line 456, in sync_all cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.116437Z [info     ]     stream.sync()              cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.116688Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1182, in sync cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.116953Z [info     ]     raise ex                   cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.117211Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1175, in sync cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.117463Z [info     ]     for _ in self._sync_records(context=context): cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.117710Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1070, in _sync_records cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.117960Z [info     ]     for record_result in self.get_records(current_context): cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.118161Z [info     ]   File "/home/markw/meltano/my-new-project/.meltano/extractors/tap-mongodb/venv/lib/python3.10/site-packages/tap_mongodb/streams.py", line 298, in get_records cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.118305Z [info     ]     document = record["fullDocument"] cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.118449Z [info     ] KeyError: 'fullDocument'       cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-08-08T13:31:06.541564Z [info     ] Incremental state has been updated at 2023-08-08 13:31:06.541326.
2023-08-08T13:31:06.550323Z [error    ] Extractor failed
2023-08-08T13:31:06.550693Z [error    ] Block run completed.           block_type=ExtractLoadBlocks err=RunnerError('Extractor failed') exit_codes={<PluginType.EXTRACTORS: 'extractors'>: 1} set_number=0 success=False
DouweM commented 1 year ago

We added a log statement to see what's actually in record:

{'_id': {'_data': '8264DD40C8000000102B022C0100296E5A1004BFB11A61829545889E436B0EE609948046645F6964006464DD40B62B7877E9524F8EC60004'}, 'operationType': 'delete', 'clusterTime': Timestamp(1692221640, 16), 'wallTime': datetime.datetime(2023, 8, 16, 21, 34, 0, 544000), 'ns': {'db': '6447dfeefb2d286779aa50dc_sample_mflix', 'coll': 'users'}, 'documentKey': {'_id': ObjectId('64dd40b62b7877e9524f8ec6')}}

Looks like for 'operationType': 'delete', there's no fullDocument.

This may work: https://github.com/menzenski/tap-mongodb/compare/main...DouweM:tap-mongodb:main-1

menzenski commented 1 year ago

Thanks @DouweM - I will test that today