yougov / mongo-connector

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)
Apache License 2.0

Stuck on OplogThread: oplog checkpoint updated to Timestamp #445

Open Lennaert opened 8 years ago

Lennaert commented 8 years ago

We're trying out the mongo-connector plugin but ran into an issue.

After the initial import of all data (300,000 documents), it gets stuck when tailing the oplog:

2016-05-11 13:26:16,717 [INFO] mongo_connector.connector:1060 - Beginning Mongo Connector
2016-05-11 13:26:16,830 [INFO] mongo_connector.oplog_manager:92 - OplogThread: Initializing oplog thread
2016-05-11 13:26:17,138 [INFO] mongo_connector.connector:296 - MongoConnector: Starting connection thread MongoClient(host=[u'xxxxxx'], document_class=dict, tz_aware=False, connect=True, replicaset=u'rs1')
2016-05-11 13:26:17,138 [DEBUG] mongo_connector.oplog_manager:194 - OplogThread: Run thread started
2016-05-11 13:26:17,138 [DEBUG] mongo_connector.oplog_manager:196 - OplogThread: Getting cursor
2016-05-11 13:26:17,139 [DEBUG] mongo_connector.oplog_manager:747 - OplogThread: reading last checkpoint as Timestamp(1462971007, 17) 
2016-05-11 13:26:17,139 [DEBUG] mongo_connector.oplog_manager:731 - OplogThread: oplog checkpoint updated to Timestamp(1462971007, 17)

After this line, nothing seems to happen. In the configuration we restricted the connector to a single namespace (database.collection). What could be the cause?

Lennaert commented 8 years ago

This might be related to the size of our oplog, which is rather big: 17 million documents.

I noticed that oplog_manager queries the oplog on ts and ns. When there are no conditions (no namespace and timestamp == None), it runs as intended. So it seems the filtered oplog query is too slow for us. It's not possible to add indexes on ts and ns.
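
To check whether the filter itself is the slow part, the query can be reproduced outside mongo-connector. Below is a minimal sketch of the kind of filter described above. The helper name `build_oplog_filter` and the exact operators (`$gte` on ts, `$in` on ns) are assumptions for illustration, not a copy of the oplog_manager code.

```python
def build_oplog_filter(last_ts=None, namespaces=None):
    """Build a filter document for reading the oplog (hypothetical helper).

    last_ts:    last checkpoint, e.g. a bson Timestamp; None means no lower bound.
    namespaces: iterable of "database.collection" strings; empty/None means all.
    """
    query = {}
    if last_ts is not None:
        # Only entries at or after the checkpoint.
        query["ts"] = {"$gte": last_ts}
    if namespaces:
        # Only entries for the configured namespaces.
        query["ns"] = {"$in": sorted(namespaces)}
    return query


if __name__ == "__main__":
    # No conditions: an empty filter, the case that runs as intended.
    print(build_oplog_filter())
    # Checkpoint plus one namespace: the case that appears to hang.
    # A plain tuple stands in for bson.timestamp.Timestamp(1462971007, 17).
    print(build_oplog_filter((1462971007, 17), ["database.collection"]))
```

With pymongo installed, passing a real `Timestamp` and running `client.local["oplog.rs"].find(build_oplog_filter(...)).explain()` should show whether the server has to scan the whole oplog to satisfy the filter.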

Any ideas?