Open zdenkoimrek opened 5 years ago
Hi @zdenkoimrek, for this type of complex scenario I suggest you go with a Monstache Go plugin. This lets you control all aspects of the indexing. I'm not sure there is a good way to do what you are looking for using the aggregation framework on change events.
See https://github.com/rwynn/monstache/issues/181#issuecomment-468719179 for an example of mapping each MongoDB document to multiple index requests into Elasticsearch.
Thanks for the quick reply. I am thinking of processing this in JavaScript. Is it possible for the JavaScript function to return an array, or only a single object?
@zdenkoimrek unfortunately, for JavaScript only a 1-to-1 mapping is supported. The Process function in a Golang plugin is the escape hatch for doing more complicated things. I would rather not try to put all scenarios into the core library.
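To illustrate the 1-to-1 limitation, here is a minimal sketch in plain JavaScript. It assumes a mapping function shaped like a Monstache script (one document in, one document out); the sample document, the `name` field, and the `mapTeacher` helper are hypothetical and only serve the example.

```javascript
// Hypothetical sample document following the teacher/students model in this thread.
const teacher = {
  _id: "t1",
  name: "Ms. Novak",
  students: [
    { _id: "s1", firstname: "Ana", lastname: "Kovac" },
    { _id: "s2", firstname: "Ivo", lastname: "Horvat" },
  ],
};

// One document in, one document out: the nested students can be
// summarized or reshaped, but not split into separate index requests.
function mapTeacher(doc) {
  return {
    _id: doc._id,
    name: doc.name,
    studentCount: doc.students.length,
    studentNames: doc.students.map((s) => `${s.firstname} ${s.lastname}`),
  };
}

const indexed = mapTeacher(teacher);
console.log(indexed);
```

Whatever the function does with the nested array, the result is still a single object per source document, which is why fanning out to one Elasticsearch document per student needs a different mechanism.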
I am thinking of taking a different approach: use only direct-read-namespaces with an aggregation pipeline containing $unwind, since that works, and set exit-after-direct-reads and resume to true. Will this do the direct reads, exit Monstache and the Docker container, and then, once I restart the container, resume from the last saved timestamp? This would be the simplest for me to achieve. I could then restart the Monstache Docker container every time I update MongoDB. I expect updates on a daily basis, so it should not be a problem.
I made some changes and would need your advice. Here is my current setup and questions:
mongo-url = "mongodb://root-user:password@mongodb:27017"
elasticsearch-urls = ["http://elasticsearch:9200"]
gzip = false
elasticsearch-max-conns = 4
elasticsearch-max-seconds = 5
elasticsearch-max-bytes = 8000000
elasticsearch-max-docs = 1
elasticsearch-version = "6.5.0"
verbose = true
resume = false
exit-after-direct-reads = false
disable-change-events = true
direct-read-namespaces = ["test.teacher"]

[[mapping]]
namespace = "test.teacher"
index = "school"
type = "school"

[[pipeline]]
script = """
module.exports = function(ns, changeStream) {
  return [
    { $unwind: "$students" },
    { $project: {
        _id: '$students._id',
        teacherId: '$_id',
        id: '$students._id',
        birthdate: '$students.birthdate',
        firstname: '$students.firstname',
        lastname: '$students.lastname'
    } }
  ];
}
"""
My model: teacher is the main collection and it has a nested collection, students. The setup described above works, but only for direct-read-namespaces. Without using a Go plugin or JavaScript, what are my options to make change-stream-namespaces work? I tried using a view, but it looks like $unwind does not work with a view either. I am thinking of making it a cron job, but if I set exit-after-direct-reads, the container gets stuck restarting. I am also thinking of making students the main collection and using a DBRef for the teacher; this is my last option. What would be the best approach for real-time syncing?
I am using MongoDB 4.0.9, Elasticsearch 6.5 and Monstache 4.17.2. Everything runs from Docker images.
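For readers following along, this is roughly what the $unwind and $project stages in the pipeline above produce for one teacher document. It is a plain-JavaScript emulation rather than anything Monstache or MongoDB runs; the `unwindAndProject` helper and the sample data are hypothetical.

```javascript
// Emulates { $unwind: "$students" } followed by the $project stage
// from the pipeline above, for a single teacher document.
function unwindAndProject(teacher) {
  // $unwind emits nothing when the array is missing or empty (default behavior).
  return (teacher.students || []).map((student) => ({
    _id: student._id,
    teacherId: teacher._id,
    id: student._id,
    birthdate: student.birthdate,
    firstname: student.firstname,
    lastname: student.lastname,
  }));
}

// Hypothetical sample document.
const teacher = {
  _id: "t1",
  students: [
    { _id: "s1", birthdate: "2010-04-02", firstname: "Ana", lastname: "Kovac" },
    { _id: "s2", birthdate: "2011-09-15", firstname: "Ivo", lastname: "Horvat" },
  ],
};

const studentDocs = unwindAndProject(teacher);
console.log(JSON.stringify(studentDocs, null, 2));
```

Each student becomes its own document keyed by the student's `_id`, with `teacherId` pointing back to the parent, so one teacher maps to many index entries.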
Hi rwynn. I am having difficulties doing an aggregation using $unwind and $project to return a nested collection. My situation is similar to this: student is the parent object and it has subjects as nested objects; in Elasticsearch I have the index subjects, and in MongoDB the collection students. How can I map a nested collection to an Elasticsearch index?