owid / owid-grapher

A platform for creating interactive data visualizations
https://ourworldindata.org
MIT License

fix(search): fix search highlighting of entities containing stop words #3718

Closed marcelgerber closed 3 days ago

marcelgerber commented 1 week ago

So, earlier today when we were talking about search, I assured Lars that "yes, we can now match all kinds of entities in search queries". I realized shortly thereafter that this wasn't the case.

The problem is that for countries like "Trinidad and Tobago" or "Saint Vincent and the Grenadines", Algolia removes stop words from the highlighted results, so the matching that relies on those highlights never picks up these entities.

I've now fixed this by first running the "dumber" extractRegionNamesFromSearchQuery, a first pass that matches country/region names purely based on the search query. Only after that does the other matching logic run, in order to also catch non-region matches like Salmon (farmed) or Africa (UN). This means that non-country entities that contain a stop word will not be matched (something like Salmon and tuna, maybe), but I think this is very much acceptable.

There's a big code comment now explaining the rationale for all this logic, hope that one mostly clears it up!
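To make the two-pass idea concrete, here is a minimal sketch, not the actual code from this PR. It assumes that the highlight-based pass only sees the words the search engine highlighted, and that stop words such as "and" are never among them. All names below (REGION_NAMES, extractRegionNamesSketch, matchEntitiesFromHighlights, matchEntities) are hypothetical, and the region list is a tiny stand-in for the real one.

```typescript
// Hypothetical stand-in for the full list of country/region names.
const REGION_NAMES: string[] = [
    "Trinidad and Tobago",
    "Saint Vincent and the Grenadines",
    "France",
]

// Split text into lowercase alphanumeric words, dropping punctuation.
function tokenize(text: string): string[] {
    return text.toLowerCase().match(/[a-z0-9]+/g) ?? []
}

// Pass 1 (the "dumber" pass): look for region names in the raw query,
// where stop words like "and" / "the" are still present.
// Plain substring matching is a simplification for this sketch.
function extractRegionNamesSketch(query: string): string[] {
    const normalizedQuery = query.toLowerCase()
    return REGION_NAMES.filter(
        (name) => normalizedQuery.indexOf(name.toLowerCase()) !== -1
    )
}

// Pass 2: match entities against highlighted words only.
// Assumption: an entity matches when every one of its words was highlighted,
// so an entity containing a stop word can never match here.
function matchEntitiesFromHighlights(
    availableEntities: string[],
    highlightedWords: string[]
): string[] {
    const highlighted = highlightedWords.map((w) => w.toLowerCase())
    return availableEntities.filter((entity) => {
        const words = tokenize(entity)
        return (
            words.length > 0 &&
            words.every((word) => highlighted.indexOf(word) !== -1)
        )
    })
}

// Run pass 1 first, then pass 2, and merge without duplicates.
function matchEntities(
    query: string,
    availableEntities: string[],
    highlightedWords: string[]
): string[] {
    const regions = extractRegionNamesSketch(query).filter(
        (name) => availableEntities.indexOf(name) !== -1
    )
    const others = matchEntitiesFromHighlights(
        availableEntities,
        highlightedWords
    ).filter((entity) => regions.indexOf(entity) === -1)
    return regions.concat(others)
}

const entities = [
    "Trinidad and Tobago",
    "Salmon (farmed)",
    "Salmon and tuna",
    "France",
]

// "and" is missing from the highlighted words, mimicking stop-word removal.
const matched = matchEntities(
    "trinidad and tobago salmon farmed",
    entities,
    ["trinidad", "tobago", "salmon", "farmed"]
)
console.log(matched)
```

In this sketch, "Trinidad and Tobago" is only found by the first pass, "Salmon (farmed)" only by the second, and a query for "salmon and tuna" matches nothing, which mirrors the trade-off described above.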

Before / After

[Screenshot: CleanShot 2024-06-18 at 18 04 41]

owidbot commented 1 week ago
Quick links (staging server): Site Admin Wizard

Login: ssh owid@staging-site-search-country-matching-stop-words

SVG tester:
Number of differences (default views): 0 ✅
Number of differences (all views): 0 ✅

Edited: 2024-06-18 17:13:59 UTC Execution time: 1.13 seconds