wellcomecollection / platform-infrastructure

:building_construction: Infrastructure for the Wellcome Digital Platform

Investigate 5xx caused by long search queries #435

Closed jcateswellcome closed 6 months ago

jcateswellcome commented 6 months ago

Periodically we receive platform alerts for 5xx responses caused by searches for long titles. Before we decide how to minimise or stop these errors, it would be useful to have more insight into current usage patterns and limitations.

More background on this can be found here, with examples of searches that trigger errors: https://wellcome.slack.com/archives/C3TQSF63C/p1713776447397219

jcateswellcome commented 6 months ago

Related to epic: https://www.notion.so/wellcometrust/Search-relevance-improvements-2ea0f44a0e6e44b38fd3ec6e587b23c4?pvs=4

pollecuttn commented 6 months ago

Investigate not fully analysing queries past a certain number of tokens (number TBD).

Ignore questions above for now.

jamieparkinson commented 6 months ago

I think this will probably involve configuring a limit token count filter (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-limit-token-count-tokenfilter.html), quite possibly applied only at search time via a search analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html).
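
A rough sketch of what that could look like, as a Python dict for an index-creation request body. The `title` field, the `query_token_limit` and `title_search` names, and the cap of 50 tokens are all hypothetical placeholders (the real number is still TBD); only the `limit` filter type, its `max_token_count` parameter, and the `search_analyzer` mapping parameter come from the Elasticsearch docs linked above.

```python
import json

# Hypothetical cap: the thread leaves the actual number TBD.
MAX_QUERY_TOKENS = 50


def build_index_body(max_tokens: int = MAX_QUERY_TOKENS) -> dict:
    """Build settings and mappings that cap token analysis at search time only."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # "limit" token filter: tokens past max_token_count are dropped.
                    "query_token_limit": {
                        "type": "limit",
                        "max_token_count": max_tokens,
                    }
                },
                "analyzer": {
                    # Same tokenizer and lowercasing as the standard analyzer,
                    # plus the token cap.
                    "title_search": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "query_token_limit"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                # Hypothetical field name, for illustration only.
                "title": {
                    "type": "text",
                    # Documents are still fully analysed at index time.
                    "analyzer": "standard",
                    # Only queries go through the capped analyzer.
                    "search_analyzer": "title_search",
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
```

Keeping the cap in the search analyzer only means indexed documents are unaffected, so changing the limit later would not by itself require the indexed text to be re-analysed.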

kenoir commented 6 months ago

Closing in favour of https://github.com/wellcomecollection/platform/issues/5739