Version 1.6.0 introduces a way to fully reindex ElasticSearch. Currently, the develop branch handles reindexing via MongoDB migrations: whenever the ES structure changes, a new MongoDB migration has to be created to trigger the reindex.
The problem is that a single version can end up with multiple reindex migrations, even though running the reindex once would be enough. Writing a reindex migration each time is also avoidable work.
Improvement proposal
Run the ElasticSearch reindexing every time run-migrations.sh runs. The reindexing is fairly fast and deployments are fairly infrequent, so even if it sometimes runs without a real need, it won't noticeably affect the deployments.
The risk is that reindexing gets slow as the database grows. In my tests, ~1700 records took ~6 seconds. Extrapolating linearly, 1 million records would take about 3529 seconds, which is almost 1 hour. That is a long time, but quite manageable if it happens only a few times a year at most.
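The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes reindex time scales linearly with record count, which the single ~1700-record measurement does not prove, and the function name is made up for illustration.

```python
def estimate_reindex_seconds(
    records: int,
    sample_records: int = 1700,   # measured sample size from the test run
    sample_seconds: float = 6.0,  # measured duration for that sample
) -> float:
    """Linearly extrapolate reindex duration from one measured sample.

    Assumption: time grows proportionally with the number of records.
    """
    return records / sample_records * sample_seconds


seconds = estimate_reindex_seconds(1_000_000)
print(f"{seconds:.0f} s, about {seconds / 60:.0f} min")  # 3529 s, about 59 min
```

If real reindex time grows faster than linearly (for example because of bulk-request limits or memory pressure), the actual figure would be higher, so it is worth re-measuring with a larger data set before relying on it.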
If that becomes a problem, the reindexing could be sharded to run in parallel, for example across 10 threads, since the records and ElasticSearch documents are self-contained.
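A minimal sketch of what such sharding could look like, assuming each record can be indexed independently of the others. The names `shard`, `reindex_in_parallel` and `index_record` are hypothetical and not part of the existing code base; `index_record` stands in for whatever function writes one record to ElasticSearch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def shard(items: Sequence[str], shard_count: int) -> list[list[str]]:
    """Split items into shard_count non-overlapping groups of similar size."""
    return [list(items[i::shard_count]) for i in range(shard_count)]


def reindex_in_parallel(
    record_ids: Sequence[str],
    index_record: Callable[[str], None],  # hypothetical per-record indexer
    shard_count: int = 10,
) -> int:
    """Index every record once, one worker thread per shard.

    Returns the number of records processed.
    """

    def run_shard(ids: list[str]) -> int:
        for record_id in ids:
            index_record(record_id)
        return len(ids)

    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        return sum(pool.map(run_shard, shard(record_ids, shard_count)))


# Usage with a stand-in indexer that only collects the ids it receives.
indexed: list[str] = []
total = reindex_in_parallel([f"record-{n}" for n in range(25)], indexed.append)
print(total, len(set(indexed)))
```

Threads fit here because the work is mostly waiting on network I/O. Whether 10 workers is the right number depends on how much write load the ElasticSearch cluster can absorb, so the shard count is left as a parameter.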
Dev tasks
[x] Remove the migrations that reindex ElasticSearch