Open jrust opened 7 years ago
I have the same issue. I think we need to change two things to speed this up: 1) drop the index before reindexing, both to delete ghost elements and because ES and Solr otherwise try to update each document, which is slower than adding one; 2) add a flush method, as you suggest.
I think we have the same issue in Solr.
I don't know this part of the code well, but you can try enabling batch loading.
A couple things I've noticed:
After upgrading to ES5 I found that re-indexing a relatively small graph got significantly slower. I traced it down to needing to set index.translog.durability to async. That speeds it up, but it does mean some loss of reliability if there's a problem while reindexing.
The ES recommendations on speeding up reindexing suggest indexing several documents at once. Currently the ElasticSearch reindex code does use the _bulk endpoint, but it sends a separate update index request for each vertex. Has there been any discussion of, or are there plans to, cut down on the number of requests by using this method during reindexing?
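To make the two points above concrete, here is a rough Python sketch of the request bodies involved. It only builds payloads and does not talk to a cluster. The helper names (translog_settings_body, bulk_body, chunked) and the sample documents are made up for illustration and are not part of the reindex code. It assumes the index (and, on ES 5, the type) is given in the request path, and it uses index actions rather than update actions, which may or may not suit the actual reindex logic.

```python
import json


def translog_settings_body(durability="async"):
    """JSON body for PUT /<index>/_settings to change translog durability."""
    return json.dumps({"index": {"translog": {"durability": durability}}})


def bulk_body(docs):
    """Build an NDJSON _bulk payload from (doc_id, source) pairs.

    Each document contributes an action line followed by a source line,
    and the payload must end with a newline.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Hypothetical vertex documents; real ones would come from the graph.
    vertices = [("v%d" % i, {"name": "vertex %d" % i}) for i in range(5)]
    print(translog_settings_body())
    for batch in chunked(vertices, 2):
        # One _bulk request per batch instead of one request per vertex.
        print(bulk_body(batch), end="")
```

With a batch size of N, the number of HTTP requests drops to roughly the vertex count divided by N. A suitable batch size depends on document size and cluster capacity, so it would need to be tuned.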