MukurtuCMS / mukurtucms

GNU General Public License v2.0

Make dictionary search work the same for alternate characters (e.g., āi shows results containing ai and vice versa) #100

Open KelleyCoda opened 7 years ago

KelleyCoda commented 7 years ago

This issue comes from Te Reo o Taranaki in New Zealand:

Can we refine search to include macronised vowels in results? E.g ākuanei will be found if user searches for akuanei & vice versa?

She adds:

Varying levels of macron application amongst students - need this search & find rule at a site level too.

When learning a non-English language, it is sometimes difficult to remember which alternative characters to use. I think search should be improved so that early learners can find words more easily.

KelleyCoda commented 7 years ago

Hi there, just hoping this caught your attention! It would mean a lot to Te Reo o Taranaki to hear whether or not this is something you plan to make possible. Do you know if it's possible to change this on a single site?

taylor-steve commented 7 years ago

Hi Kelley,

That is something we'd like to add, but it isn't scheduled for this next release. We have plans to further improve Unicode handling throughout the site, and this would probably be included as part of that effort.

For a one-off, I'd consider trying one of the transliteration/romanization modules that are available. You might need to create hidden duplicate fields (e.g., 'title_romanized') and index those, or create Search API aggregate fields that hold the romanized version of the field. I haven't had a chance to really evaluate what is out there right now, so I don't know how well any of them dovetail with our current Search API setup.
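As a rough illustration of the duplicate-field idea, here is a minimal sketch in plain Python. It is not Mukurtu, Drupal, or Search API code, and the function names are hypothetical. It assumes that stripping Unicode combining marks is an acceptable stand-in for whatever a transliteration module would do: both the stored title and the query are folded to a macron-free form, so either spelling finds the same entry.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks (such as macrons) removed."""
    # NFD splits "ā" into "a" plus U+0304 COMBINING MACRON.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every character that is a combining mark.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def build_index(titles):
    """Map each folded title (the hypothetical 'title_romanized' value)
    to the original titles that share it."""
    index = {}
    for title in titles:
        key = fold_diacritics(title).casefold()
        index.setdefault(key, []).append(title)
    return index


def search(index, query):
    """Fold the query the same way as the indexed titles, then look it up."""
    return index.get(fold_diacritics(query).casefold(), [])


index = build_index(["ākuanei", "whānau"])
print(search(index, "akuanei"))
print(search(index, "ākuanei"))
```

Both searches return the original, macronised title. Folding only the query would not be enough, because the indexed value has to be folded in the same way, which is why a separate romanized field is needed alongside the displayed one.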

Thanks.