yougov / mongo-connector

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)
Apache License 2.0

namespaces (nested) fields are not handled properly when of MongoDB type ReferenceField #682

Open tafaust opened 7 years ago

tafaust commented 7 years ago

Hey,

I have a fully dockerized setup in which mongo-connector is up and running and keeps MongoDB and Elasticsearch in sync (thanks for that!). The configuration looks as follows:

{
   "mainAddress": "mongodb://mongodb:27017",
   "oplogFile": "/var/log/mongo-connector/oplog.timestamp",
   "noDump": false,
   "batchSize": -1,
   "verbosity": 2,
   "continueOnError": false,
   "namespaces": {
      "db.language": true,
      "db.tag": {
         "includeFields": ["name"]
      },
      "db.blogpost": {
         "includeFields": ["name", "content", "languages", "tags.name"]
      },
      "db.user": {
         "includeFields": ["username", "description"]
      },
      "db.comment": {
         "includeFields": ["text"]
      }
   },
   "logging": {
      "type": "file",
      "filename": "/var/log/mongo-connector/mongo-connector.log",
      "format": "%(asctime)s [%(levelname)s] %(name)s:%(lineno)d - %(message)s",
      "rotationWhen": "D",
      "rotationInterval": 1,
      "rotationBackups": 10
   },
   "docManagers": [
      {
         "docManager": "elastic2_doc_manager",
         "targetURL": "elasticsearch:9200"
      }
   ]
}

In this scenario, the blogpost document references Tag documents through tags = ListField(ReferenceField('Tag')). A ReferenceField stores only a reference to the Tag document, so the entries in tags have no name attribute, and "tags.name" is therefore not synced to Elasticsearch.
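
To make the problem concrete, here is a minimal sketch in plain Python of how a dotted-path include filter behaves on a document whose tags are stored as references rather than embedded documents. It is an assumption-laden illustration, not mongo-connector's actual filtering code: the helper names pick_path and include_fields, the id strings, and the sample field values are all made up for this example.

```python
# Hypothetical illustration only: NOT mongo-connector's real implementation.

def pick_path(value, parts):
    """Return the sub-value selected by a dotted path, or None if nothing matches."""
    if not parts:
        return value
    if isinstance(value, list):
        # Apply the remaining path to every element and keep only the matches.
        picked = [pick_path(item, parts) for item in value]
        picked = [p for p in picked if p is not None]
        return picked or None
    if isinstance(value, dict) and parts[0] in value:
        sub = pick_path(value[parts[0]], parts[1:])
        return None if sub is None else {parts[0]: sub}
    # Scalars (for example a stored reference id) have no sub-fields.
    return None


def include_fields(doc, fields):
    """Keep only the listed dotted paths (assumes distinct top-level keys)."""
    result = {}
    for field in fields:
        picked = pick_path(doc, field.split("."))
        if picked is not None:
            result.update(picked)
    return result


fields = ["name", "content", "languages", "tags.name"]

# Roughly what ends up stored when tags is a list of references:
# only ids, no "name" (made-up values).
stored_blogpost = {
    "_id": "blogpost-1",
    "name": "Hello",
    "content": "First post",
    "languages": ["en"],
    "tags": ["tag-id-1", "tag-id-2"],
}

# The same post if the tags were embedded documents instead (made-up values).
embedded_blogpost = dict(
    stored_blogpost,
    tags=[{"_id": "tag-id-1", "name": "python"},
          {"_id": "tag-id-2", "name": "mongodb"}],
)

print(include_fields(stored_blogpost, fields))    # no "tags" key survives
print(include_fields(embedded_blogpost, fields))  # "tags" keeps only "name"
```

With references, "tags.name" has nothing to select, so the whole tags field drops out of the filtered document; with embedded tags, the same path would resolve.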

Do you need more information on this issue, or do you have any ideas? I am looking forward to your answer, thanks!

tafaust commented 7 years ago

FYI I also posted this question to SO: http://stackoverflow.com/q/42722838/2402281