Open Ardiea opened 2 years ago
One option to consider is deploying an "ingest node" to handle the heavy lifting of mass re-index events so that search performance is not negatively impacted.
Is this related to https://sentry.io/organizations/mit-office-of-digital-learning/issues/3532770150/ ?
There was another batch of error messages and alerts on Feb 13th.
We can also keep a copy of every index on each of 3 nodes, which is the minimum number of nodes we ever have in our clusters. Note that number_of_replicas does not count the primary shard, so on 3 nodes that means number_of_replicas = 2; a value of 3 would leave one replica unassigned. With a full copy on every node, whichever node receives a search request can service it locally, rather than having to route it to another node 33% of the time, which adds another transport hop and does not make efficient use of the resources we pay for. I think we can do that here: https://github.com/mitodl/open-discussions/blob/master/search/indexing_api.py#L312 (see the dynamic index settings: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html#dynamic-index-settings)
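A rough sketch of the replica change, in case it helps the discussion. It assumes a 7.x-style elasticsearch-py client where `indices.put_settings(index=..., body=...)` is available; the helper names and the index name are hypothetical and are not taken from `indexing_api.py`. `number_of_replicas` counts copies in addition to the primary, so a full copy on each of N nodes needs N - 1 replicas.

```python
def replica_settings(node_count: int) -> dict:
    """Build a dynamic-settings body that puts one shard copy on every node.

    number_of_replicas excludes the primary, so N nodes need N - 1 replicas.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {"index": {"number_of_replicas": node_count - 1}}


def apply_replica_settings(client, index_name: str, node_count: int):
    """Apply the settings to a live index (hypothetical helper).

    `client` is assumed to be an elasticsearch-py 7.x Elasticsearch instance.
    number_of_replicas is a dynamic setting, so no reindex is required.
    """
    return client.indices.put_settings(
        index=index_name, body=replica_settings(node_count)
    )


# Settings body for our minimum cluster size of 3 nodes.
print(replica_settings(3))
```

Since the setting is dynamic, it could be applied to existing indices as well as set at index creation time.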
Per https://discuss.elastic.co/t/connection-pooling-in-python/189725/6, the client's connection pool size is configurable. I think we can try setting it in this dict: https://github.com/mitodl/open-discussions/blob/6e7a1081eee6f8515afc29ed230148e2741dd5f7/search/connection.py#L23-L30 Based on the statements in that thread, the pool ends up at either 10 or 30 connections, depending on whether we are "sniffing" the number of nodes.
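Here is a minimal sketch of what the connection settings might look like. It assumes the 7.x elasticsearch-py options `maxsize` (per-node urllib3 pool size, default 10), `sniff_on_start`, `sniff_on_connection_fail`, and `sniffer_timeout`; the function names, host name, and values are hypothetical and do not reflect what `connection.py` currently contains. The 10 vs 30 figure above would follow from 10 connections per node times 1 known node without sniffing, or 3 discovered nodes with it.

```python
def build_connection_kwargs(hosts, maxsize=10, sniff=False):
    """Build keyword arguments for an Elasticsearch client (hypothetical helper).

    maxsize is the pool size *per node*, not for the whole cluster.
    """
    kwargs = {"hosts": list(hosts), "maxsize": maxsize}
    if sniff:
        # With sniffing, the client discovers the other nodes in the cluster
        # and opens a separate pool of `maxsize` connections to each one.
        kwargs.update(
            sniff_on_start=True,
            sniff_on_connection_fail=True,
            sniffer_timeout=60,
        )
    return kwargs


def max_open_connections(maxsize, node_count):
    """Upper bound on open connections: one pool of `maxsize` per known node."""
    return maxsize * node_count


kwargs = build_connection_kwargs(["es.example.internal:9200"], maxsize=25, sniff=True)
print(kwargs)
print(max_open_connections(10, 1), max_open_connections(10, 3))
```

The resulting dict would be passed to whatever builds the client, so the pool size can be tuned without touching the call sites.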
A reindex event from OCW / Open Discussions can have a serious negative impact on the availability of OCW search. How can we reduce that impact?