oracc / oracc-search-front-end


Reorder search results #6

Open ageorgou opened 6 years ago

ageorgou commented 6 years ago

From the meeting on 1 May:

The sort order in the search table is to be changed so that hyphenated entries are displayed last. The s-variants should also be ordered s, s., s^ (and similarly for the t-variants).

Essentially:

The last point could be clarified a bit. For example, does AB-C come after ADE? In other words, does any "-" make the word appear at the end, or is it just a character sorted after z?
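To make the two readings concrete, here is a minimal sketch of both. The function names are made up for illustration, and the sketch assumes plain ASCII words; it is not what the front end currently does.

```typescript
// Reading A: any word containing "-" sorts after every word without one.
function compareHyphenatedLast(a: string, b: string): number {
  const aHyphen = a.indexOf("-") !== -1;
  const bHyphen = b.indexOf("-") !== -1;
  if (aHyphen !== bHyphen) {
    return aHyphen ? 1 : -1;
  }
  // Same group: fall back to plain code-point order.
  return a < b ? -1 : a > b ? 1 : 0;
}

// Reading B: "-" is just a character that ranks after "z".
// Assumption: ASCII input only, so "{" (which follows every ASCII letter)
// can stand in for "-" when building the sort key.
function compareHyphenAfterZ(a: string, b: string): number {
  const key = (s: string): string => s.replace(/-/g, "{");
  const ka = key(a);
  const kb = key(b);
  return ka < kb ? -1 : ka > kb ? 1 : 0;
}

const sampleWords: string[] = ["AB-C", "ADE", "ABZ"];
const orderA: string[] = sampleWords.slice().sort(compareHyphenatedLast);
const orderB: string[] = sampleWords.slice().sort(compareHyphenAfterZ);
// orderA: ABZ, ADE, AB-C  (AB-C comes after ADE)
// orderB: ABZ, AB-C, ADE  (AB-C comes before ADE)
```

So under reading A the answer to "does AB-C come after ADE?" is yes, and under reading B it is no.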

ageorgou commented 6 years ago

Some notes from looking into how we can do this without coding from scratch:

Failing this, we can just build the string comparison from scratch based on our desired ordering.
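A rough sketch of what the from-scratch version could look like. It assumes the variants are written literally as the two-character sequences "s.", "s^", "t." and "t^", that comparison is case-insensitive, and that anything not in the table sorts after the listed letters; all names here are hypothetical and the real character set still needs to be agreed.

```typescript
// Collation units in the desired order. Assumption: variants are the literal
// two-character sequences "s.", "s^", "t.", "t^".
const UNITS: string[] = "abcdefghijklmnopqr".split("").concat(
  ["s", "s.", "s^", "t", "t.", "t^"],
  "uvwxyz".split("")
);

// Look-up table from unit to its position in the ordering.
const RANK: { [unit: string]: number } = {};
UNITS.forEach((unit, index) => {
  RANK[unit] = index;
});

function hasRank(unit: string): boolean {
  return Object.prototype.hasOwnProperty.call(RANK, unit);
}

// Split a word into units (preferring two-character units) and map to ranks.
function toRanks(word: string): number[] {
  const ranks: number[] = [];
  let i = 0;
  while (i < word.length) {
    const pair = word.slice(i, i + 2);
    if (pair.length === 2 && hasRank(pair)) {
      ranks.push(RANK[pair]);
      i += 2;
    } else {
      const single = word.charAt(i);
      // Unlisted characters rank after all listed units, by code unit.
      ranks.push(hasRank(single) ? RANK[single] : UNITS.length + single.charCodeAt(0));
      i += 1;
    }
  }
  return ranks;
}

// Compare two words unit by unit; a shorter word that is a prefix sorts first.
function compareCustom(a: string, b: string): number {
  const ra = toRanks(a.toLowerCase());
  const rb = toRanks(b.toLowerCase());
  const n = Math.min(ra.length, rb.length);
  for (let i = 0; i < n; i++) {
    if (ra[i] !== rb[i]) {
      return ra[i] - rb[i];
    }
  }
  return ra.length - rb.length;
}

const sortedSample: string[] = ["s^a", "ta", "s.a", "sb", "sa"].sort(compareCustom);
// sortedSample: sa, sb, s.a, s^a, ta
```

Note that a plain code-point sort would put "s.a" and "s^a" before "sa", which is why the table is needed. Hyphens are not in the table, so in this sketch they simply sort after "z" as a character.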

ageorgou commented 6 years ago

We need to check which non-ASCII characters we want to support; the Oracc docs include more characters than the ones mentioned above.

ageorgou commented 6 years ago

On the back-end side, the ICU plugin for Elasticsearch may be of interest.

ageorgou commented 6 years ago

Putting this here because it's surprisingly hard to search for: The Unicode default collation chart (DUCET - Default Unicode Collation Element Table): http://unicode.org/charts/collation/. This is the sorting currently chosen in the backend.

tim-band commented 3 weeks ago

So, what do we actually want now? We had plain ASCII searching, which was bad; now we're back to English collation. Perhaps we want to adjust it somewhat? Or maybe it's fine?