richardwilly98 / elasticsearch-river-mongodb

MongoDB River Plugin for ElasticSearch

How to resume indexing after restarting elasticsearch service? #447

Open perannum opened 9 years ago

perannum commented 9 years ago

I have a collection of over 50 M documents which I want to index in Elasticsearch. However, the Elasticsearch service sometimes needs to be restarted, which causes the river import to stop. What's the workaround for this scenario? I am using ES 1.3.4 and MongoDB river 2.0.2.

richardwilly98 commented 9 years ago

The initial import reads the collection data directly. Once it has completed, the river switches to oplog.rs. So you should wait until the initial import has completed before restarting ES.

perannum commented 9 years ago

Thanks. I am encountering a very strange issue now. The docs count in Elasticsearch is increasing and decreasing continuously: it goes up by 1 and then down by 1 or 2. The overall count has also increased quite a lot, to more than the actual collection size. What could be the issue?

richardwilly98 commented 9 years ago

Once the initial import is completed, the river reads oplog.rs and applies the operations for the configured collection, so the count may increase and decrease until that replay is completed.

Do you still see document count changing once this phase is completed?
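To illustrate why the count can move up and down during this phase, here is a minimal sketch in Python. It is a simplified model and not the river's actual code: the index is a plain dict keyed by `_id`, the oplog entries are hypothetical and flattened (real oplog.rs entries carry more fields), and inserts and updates are both treated as an index-by-id.

```python
# Simplified model only: this is NOT the river's actual implementation.
# The index is a dict keyed by _id; oplog entries are hypothetical and
# flattened to the fields needed for the illustration.

def replay_oplog(index, oplog):
    """Apply oplog entries to `index` in order and return the document
    count observed after each entry."""
    counts = []
    for entry in oplog:
        op, doc_id = entry["op"], entry["_id"]
        if op in ("i", "u"):
            # Insert and update both become an index-by-id here, so
            # replaying an op for an already imported document does not
            # change the count.
            index[doc_id] = entry.get("doc", {})
        elif op == "d":
            # Deleting a document that is not present is a no-op.
            index.pop(doc_id, None)
        counts.append(len(index))
    return counts


# Snapshot left by the initial import (hypothetical ids).
index = {1: {}, 2: {}, 3: {}}

# Operations recorded in the oplog for the same collection.
oplog = [
    {"op": "i", "_id": 4, "doc": {}},
    {"op": "d", "_id": 2},
    {"op": "d", "_id": 4},
    {"op": "i", "_id": 5, "doc": {}},
    {"op": "u", "_id": 1, "doc": {"x": 1}},
]

counts = replay_oplog(index, oplog)
print(counts)  # [4, 3, 2, 3, 3]
```

In this model the count wanders while the oplog is replayed and only settles once the last entry has been applied.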

perannum commented 9 years ago

Yes. The count has increased to about 65 M documents, whereas my collection contains some 54 M documents. The docs count is increasing and decreasing by 1 or 2; it seems like it's stuck. [screenshot from 2015-01-06 17 22 14] These are the counts ES is showing. Note that the docs count is 54,574,622 or 54,574,621 or 54,574,620, and it keeps cycling like that while the count in the index keeps increasing.

I am also using the Marvel plugin.
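One way to tell an oscillating count from one that is actually growing is to sample the `_count` endpoint a few times and compare the spread with the net change. The sketch below assumes Elasticsearch is reachable at `http://localhost:9200` and uses a hypothetical index name `mydb`; neither value comes from this thread, so adjust both.

```python
import json
import time
import urllib.request


def fetch_count(base_url, index_name):
    """Return the document count reported by GET /<index>/_count."""
    url = f"{base_url}/{index_name}/_count"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["count"]


def sample_counts(base_url, index_name, n=10, delay=5.0):
    """Collect n count samples, waiting `delay` seconds between them."""
    samples = []
    for i in range(n):
        samples.append(fetch_count(base_url, index_name))
        if i < n - 1:
            time.sleep(delay)
    return samples


def summarize(samples):
    """Summarize how much the count moved over the sampling window."""
    return {
        "min": min(samples),
        "max": max(samples),
        "spread": max(samples) - min(samples),
        "net_change": samples[-1] - samples[0],
    }


# Example call (assumed URL and hypothetical index name):
# print(summarize(sample_counts("http://localhost:9200", "mydb")))
```

A small spread with a net change near zero suggests the count is just oscillating, while a net change that keeps growing across runs suggests documents are still being added.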