glacials / splits-io

a speedrunning data store and analysis engine
https://splits.io
GNU Affero General Public License v3.0

Search behaves weirdly #486

Open glacials opened 5 years ago

glacials commented 5 years ago

Many small search terms like "mar", "tro", "sunsh" yield 0 results, but adding more characters ("mari", "tron", "sunshi") gives games back. It can be misleading to show 0 results in these cases, as people might see 0 results in the autocomplete menu and just stop typing and assume the game doesn't exist on Splits I/O. At best we should show the results, but at minimum we should say "keep typing" or something.

Also, "mari" yields Paper Mario, Mario Golf, and Mario Party but nothing else, while "mario" yields all Mario games. This feels weird, unexpected, and hard to understand. It might be unrelated to the issue above.

BatedUrGonnaDie commented 5 years ago

Both of these are most likely because the search uses trigram similarity instead of just checking whether the entered string appears in the game's name. For a trigram match, the search term has to cover enough of the name to meet a certain similarity threshold before the game is included in the results. The shorter the game's name, the less you have to type for the threshold to be met, which is why "mari" yields some Mario games but not the ones with longer names.
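To make the threshold effect concrete, here is a rough Python sketch of trigram similarity. It is an approximation written for illustration, not the code Splits I/O or PostgreSQL runs: the padding rule (two leading spaces, one trailing space per word), the shared-over-total scoring, and the 0.3 cutoff are assumptions modelled on pg_trgm's documented defaults, and the function names are made up.

```python
import re

# Assumed cutoff, modelled on pg_trgm's documented default of 0.3.
THRESHOLD = 0.3


def trigrams(text):
    """Return the set of trigrams for text.

    Each lowercase alphanumeric word is padded with two leading spaces
    and one trailing space before being cut into 3-character pieces.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by total distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def matches(query, name, threshold=THRESHOLD):
    """True if the similarity reaches the (assumed) threshold."""
    return similarity(query, name) >= threshold


if __name__ == "__main__":
    for query in ("mar", "mari", "mario"):
        for name in ("Paper Mario", "Super Mario Sunshine"):
            print(query, "|", name, "|", round(similarity(query, name), 3))
```

Under these assumptions, "mari" shares 4 trigrams with "Paper Mario" out of 13 distinct ones (about 0.31, just over the cutoff), while against "Super Mario Sunshine" the same 4 shared trigrams are spread over 20 (0.2), so the longer name drops out until more characters are typed.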

glacials commented 5 years ago

Thanks for the explanation, that makes sense. What benefits does trigram give us vs. normal string comparison? Is it possible to have the best of both worlds?

BatedUrGonnaDie commented 5 years ago

The main benefit is that trigram matching lets you make typos in your search and still find the results you are looking for.

We should be able to get both through pg_search by using tsearch with the prefix option set to true. This will match partial words (by prefix) instead of just whole words.

https://github.com/Casecommons/pg_search#prefix-postgresql-84-and-newer-only A little below this is where it explains trigram as well, and then searching using multiple styles below that.
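As a toy illustration of the "multiple styles" idea, the Python sketch below accepts a name if either a word-prefix check or a trigram similarity check passes. It does not use pg_search or PostgreSQL. The function names, the 0.3 threshold, and the trigram padding are all assumptions made for the example, and real tsearch behaviour (stemming, dictionaries, ranking) is not modelled.

```python
import re


def words(text):
    """Lowercase alphanumeric words in text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def prefix_match(query, name):
    """True if every query word is a prefix of some word in name."""
    query_words = words(query)
    name_words = words(name)
    return bool(query_words) and all(
        any(w.startswith(q) for w in name_words) for q in query_words
    )


def trigram_similarity(a, b):
    """Shared trigrams over total distinct trigrams (assumed padding)."""
    def grams(text):
        result = set()
        for word in words(text):
            padded = "  " + word + " "
            result.update(padded[i:i + 3] for i in range(len(padded) - 2))
        return result

    ta, tb = grams(a), grams(b)
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0


def search(query, names, threshold=0.3):
    """Keep names that match by prefix OR by trigram similarity."""
    return [
        n for n in names
        if prefix_match(query, n) or trigram_similarity(query, n) >= threshold
    ]


if __name__ == "__main__":
    games = ["Paper Mario", "Mario Golf", "Super Mario Sunshine", "Tron"]
    print(search("mar", games))    # short prefix still finds the Mario games
    print(search("sunsh", games))  # partial word found by the prefix check
    print(search("angelclaws", ["angelaclaws"]))  # typo found by trigram
```

In this sketch the prefix check covers short partial terms such as "mar", and the trigram check still catches a misspelling such as "angelclaws", which is the combination the thread is asking for.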

glacials commented 5 years ago

Another report of weirdness, this time the reverse of above: typing angelaclaws does not return the user angelaclaws (exact string match), but typing angelaclaw or fewer characters does.