commonsearch / cosr-back

Backend of Common Search. Analyses webpages and sends them to the index.
https://about.commonsearch.org
Apache License 2.0

Optimize Elasticsearch indexing #3

Open sylvinus opened 8 years ago

sylvinus commented 8 years ago

Elasticsearch indexing is not a bottleneck at this point, but it could clearly be improved.

A first step could be to stop using the Python client's bulk helper and instead serialize the `_bulk` request body directly with ujson.
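As a rough illustration of that first step, here is a minimal sketch of building the newline-delimited `_bulk` body by hand. The function name `build_bulk_body`, the `(doc_id, source)` input shape, and the index name are assumptions made for this example, not code from cosr-back. It falls back to the standard `json` module when ujson is not installed, and older Elasticsearch versions would also expect a `_type` field in the action line.

```python
try:
    import ujson as json  # faster C serializer, if installed
except ImportError:
    import json  # standard-library fallback with the same dumps/loads calls


def build_bulk_body(docs, index_name):
    """Serialize (doc_id, source) pairs into an NDJSON _bulk request body.

    Hypothetical helper: each document becomes two lines, an "index"
    action line followed by the document source.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string could then be sent as the request body of the `_bulk` endpoint, skipping the per-document overhead of the helper. Whether this is actually faster would need to be measured.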

Batch sizes should also be added as config parameters so that they can be optimized by ops at index time.
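One possible shape for the configurable batch size, sketched under assumptions: the environment variable name `COSR_ES_BULK_BATCH_SIZE`, the default of 500, and the `iter_batches` helper are all hypothetical and only meant to show how ops could tune the value without a code change.

```python
import os

# Hypothetical setting name; the default of 500 is an arbitrary placeholder.
BULK_BATCH_SIZE = int(os.environ.get("COSR_ES_BULK_BATCH_SIZE", "500"))


def iter_batches(items, batch_size=BULK_BATCH_SIZE):
    """Yield lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    # Flush the final, possibly smaller, batch.
    if batch:
        yield batch
```

Each yielded batch would correspond to one `_bulk` request, so ops could raise or lower the setting depending on document size and cluster capacity.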