singer-io / tap-mongodb

GNU Affero General Public License v3.0

bson.errors.InvalidBSON: 'utf-8' codec can't decode byte 0xdc in position 70: invalid continuation byte #92

Open helgetan opened 2 years ago

helgetan commented 2 years ago

When tapping into my MongoDB, I suddenly get the following error. Any ideas on how to debug this? I have no clue which field could contain non-UTF-8 data. The tap worked fine until now, and now it crashes.


```
2022-11-04 14:04:25,365Z    tap - Traceback (most recent call last):
2022-11-04 14:04:25,365Z    tap -   File "tap-env/bin/tap-mongodb", line 11, in <module>
2022-11-04 14:04:25,365Z    tap -     load_entry_point('tap-mongodb==1.1.0', 'console_scripts', 'tap-mongodb')()
2022-11-04 14:04:25,365Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_mongodb/__init__.py", line 389, in main
2022-11-04 14:04:25,365Z    tap -     raise exc
2022-11-04 14:04:25,365Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_mongodb/__init__.py", line 386, in main
2022-11-04 14:04:25,365Z    tap -     main_impl()
2022-11-04 14:04:25,365Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_mongodb/__init__.py", line 382, in main_impl
2022-11-04 14:04:25,365Z    tap -     do_sync(client, args.catalog.to_dict(), state)
2022-11-04 14:04:25,365Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_mongodb/__init__.py", line 343, in do_sync
2022-11-04 14:04:25,365Z    tap -     sync_stream(client, stream, state)
2022-11-04 14:04:25,365Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_mongodb/__init__.py", line 321, in sync_stream
2022-11-04 14:04:25,365Z    tap -     oplog.sync_collection(client, stream, state, stream_projection)
2022-11-04 14:04:25,365Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_mongodb/sync_strategies/oplog.py", line 142, in sync_collection
2022-11-04 14:04:25,365Z    tap -     for row in cursor:
2022-11-04 14:04:25,365Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/pymongo/cursor.py", line 1225, in next
2022-11-04 14:04:25,365Z    tap -     if len(self.__data) or self._refresh():
2022-11-04 14:04:25,366Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/pymongo/cursor.py", line 1162, in _refresh
2022-11-04 14:04:25,366Z    tap -     self.__send_message(g)
2022-11-04 14:04:25,366Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/pymongo/cursor.py", line 1006, in __send_message
2022-11-04 14:04:25,366Z    tap -     legacy_response=legacy_response, user_fields=user_fields)
2022-11-04 14:04:25,366Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/pymongo/cursor.py", line 1097, in _unpack_response
2022-11-04 14:04:25,366Z    tap -     legacy_response)
2022-11-04 14:04:25,366Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/pymongo/message.py", line 1470, in unpack_response
2022-11-04 14:04:25,366Z    tap -     self.payload_document, codec_options, user_fields)
2022-11-04 14:04:25,366Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/bson/__init__.py", line 993, in _decode_all_selective
2022-11-04 14:04:25,366Z    tap -     return decode_all(data, codec_options)
2022-11-04 14:04:25,366Z    tap - bson.errors.InvalidBSON: 'utf-8' codec can't decode byte 0xdc in position 70: invalid continuation byte
```
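
One possible way to narrow this down, as a rough sketch only: the message says that byte `0xdc` is followed by something that is not a valid UTF-8 continuation byte, which is what typically happens when Latin-1 text (where `0xdc` is `Ü`) was stored as if it were UTF-8. The helper below uses only the standard library to find the first invalid sequence in a raw byte string and show the surrounding text, which may be enough to recognise the field. The function name `locate_invalid_utf8` and the sample payload are hypothetical, not taken from the tap or from a real document. Getting the raw bytes of the failing document is a separate step not shown here.

```python
def locate_invalid_utf8(raw, context=16):
    """Return (offset, bad_byte, lossy_snippet) for the first invalid
    UTF-8 sequence in `raw`, or None if the bytes decode cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        start = max(0, err.start - context)
        end = min(len(raw), err.end + context)
        # Decode the neighbourhood leniently so the surrounding
        # field name or value stays recognisable.
        snippet = raw[start:end].decode("utf-8", errors="replace")
        return err.start, raw[err.start], snippet
    return None


# Hypothetical payload: a Latin-1 encoded "Ünal" (0xDC) embedded in
# otherwise valid UTF-8 text.
sample = b'{"name": "' + "Ünal".encode("latin-1") + b'"}'
result = locate_invalid_utf8(sample)
print(result)
```

Separately, PyMongo's `CodecOptions` accepts a `unicode_decode_error_handler` argument (for example `'replace'`), which lets the driver decode such documents with placeholder characters instead of raising. Whether the tap exposes a way to pass that option is an assumption that would need checking.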