Semantic-Org / Semantic-UI-React

The official Semantic-UI-React integration
https://react.semantic-ui.com
MIT License

Dropdown: remove diacritics on filter also converts national letters #2163

Closed: bastaware closed this 7 years ago

bastaware commented 7 years ago

https://github.com/Semantic-Org/Semantic-UI-React/pull/2021 introduced a filter that removes diacritics from the options in dropdowns. This has an unfortunate side effect: it also converts national characters (e.g. ø becomes o). Scandinavian users would never expect to search using o instead of ø, so it breaks the feature.

I suggest that deburr is made optional - and is false by default.

Steps

  1. Create a dropdown that contains the letter ø in one of its options.
  2. Search for 'ø' (the same goes for æ, å, ä, ö, etc.).

Expected Result

List the option that contains ø

Actual Result

Nothing is listed
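
The mismatch can be reproduced outside the component. The sketch below assumes the filter deburrs the option text but tests it against the raw, un-deburred query. `deburrLike` is a hypothetical and deliberately tiny stand-in for lodash's `_.deburr`, and `filterOptions` is an illustration, not the library's actual code.

```javascript
// Hypothetical stand-in for _.deburr: folds only the letters used in this example.
const FOLD = { 'ø': 'o', 'æ': 'ae', 'å': 'a', 'ä': 'a', 'ö': 'o' }
const deburrLike = (text) => text.replace(/[øæåäö]/g, (ch) => FOLD[ch])

// Escape regex metacharacters so the query is matched literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

// Assumed behaviour: the option text is deburred, the query is not.
const filterOptions = (options, query) => {
  const re = new RegExp(escapeRegExp(query), 'i')
  return options.filter((opt) => re.test(deburrLike(opt.text)))
}

const options = [
  { value: 'soren', text: 'Søren' },
  { value: 'olga', text: 'Olga' },
]

// 'Søren' is compared as 'Soren', so the query 'ø' can never match it.
const byNationalLetter = filterOptions(options, 'ø')
// The query 'o' matches both 'Soren' (deburred) and 'Olga'.
const byAsciiLetter = filterOptions(options, 'o')

console.log(byNationalLetter.length, byAsciiLetter.length) // 0 2
```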

Version

0.73.1

Testcase

[Fork, update, and replace this pen to show the bug]: http://codepen.io/levithomason/pen/ZpBaJX

levithomason commented 7 years ago

I'm in favor of making this opt-in and defaulting to false for better international support.

patrikmolsson commented 7 years ago

@levithomason Can we not instead deburr the input string as well? I.e. when typing ö, it gets deburred to o.

karlludwigweise commented 7 years ago

@patrikmolsson This is not an option, as it will result in weird search results. Example:

You have a Dropdown with brand names. Options include "Søren" and "Sören". If you search for "Sö" you will also be offered "Søren".

patrikmolsson commented 7 years ago

@karlludwigweise In my opinion that would actually be the expected search result...

For example, I'm Swedish, and if I wanted to find "Søren" I would type "Sö", since I do not have ø on my keyboard.

Let me do a brief check of how other search engines handle this :)

patrikmolsson commented 7 years ago

As I understand from this article, accent folding seems to be disabled by default in ElasticSearch, Whoosh, and Solr, as suggested earlier in this thread.

However, the author of that article does suggest this for his own implementation: "This means that, for example, "café" & "māori" are treated as "cafe" and "maori" respectively, and that searches for either the accented or non-accented versions will both turn up the same results."

To make this library easier for developers, I would therefore suggest that we deburr both the input string and the options, and not make it opt-in.

karlludwigweise commented 7 years ago

Keep in mind that we are not talking about a search field that sends a query to ElasticSearch. It's a Dropdown with distinct values.

I do agree that it's great for you to find both results, as you may have a hard time typing ø quickly. In my opinion, it just should not be the international standard behaviour. It should be something you have to make a decision about as a developer/product owner.

Making it the default will be like black-box magic. A possible pain for users and developers.

I do expect search engines to evaluate whether I could have meant "Søren" when typing "Soren". I might not expect that from a searchable dropdown; that simply depends on the application you are in.

patrikmolsson commented 7 years ago

@karlludwigweise I agree with you that it might cause some confusion and pain.

Maybe these are two different issues:

  1. Should the deburr be opt-in, and
  2. If deburr is used, should we deburr both input and options (i.e. both inputs "cafe" and "café" return both options "cafe" and "café")

Does that make sense?

karlludwigweise commented 7 years ago

They are. And I would vote a YES for both... ;)

patrikmolsson commented 7 years ago

Haha, I would vote yes for the 2nd issue, but I'm still a bit unsure about the first (maybe it should be opt-out instead), although that issue is not that important to me. :)

bastaware commented 7 years ago

Forcing deburr on the input would be really weird in Scandinavia (and other places as well): ø is a completely different letter from o. It can't be compared with é and e, which are the same letter.

Wikipedia: The Scandinavian languages, by contrast, treat the characters with diacritics ä, ö and å as new and separate letters of the alphabet, and sort them after z. Usually ä is sorted as equal to æ (ash) and ö is sorted as equal to ø (o-slash). Also, aa, when used as an alternative spelling to å, is sorted as such. Other letters modified by diacritics are treated as variants of the underlying letter, with the exception that ü is frequently sorted as y.
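
Unicode normalization draws a related, though not identical, line. As a rough illustration (not part of the library): under NFD, é, ö and å decompose into a base letter plus a combining mark, while ø and æ have no canonical decomposition at all. Stripping combining marks therefore never turns ø into o; that mapping has to come from an explicit lookup table like the one `_.deburr` uses. Note that Unicode still decomposes ö and å, even though Swedish treats them as separate letters, so this is not a linguistic authority. The helper name `stripCombiningMarks` is made up for this sketch.

```javascript
// Decompose to NFD, then remove combining diacritical marks (U+0300 to U+036F).
const stripCombiningMarks = (text) =>
  text.normalize('NFD').replace(/[\u0300-\u036f]/g, '')

const folded = ['é', 'ö', 'å', 'ø', 'æ'].map(stripCombiningMarks)

// é, ö and å lose their marks; ø and æ pass through unchanged.
console.log(folded.join(' ')) // e o a ø æ
```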

levithomason commented 7 years ago

Really appreciate the care in the conversation here. I feel the proper decision is to respect internationalization and keep diacritics by default.

I'd imagine opting into deburring would apply to all aspects of the Dropdown. I can imagine the bug reports already if someone opts in to deburr and it isn't applied to some aspect of behavior. This is also easier to understand: it is either on or off. It is never on except/unless/some condition...

The minimal corner cases are covered as users can ultimately pass their own search function and map their options as necessary.

Temporary Workaround

I usually list workarounds to bugs, but it looks like I didn't have time on the first shot. You can pass your own search function and opt to not _.deburr:

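import _ from 'lodash'
import { Dropdown } from 'semantic-ui-react'

// Custom search: case-insensitive match on the option text, without _.deburr
const handleSearch = (options, query) => {
  const re = new RegExp(_.escapeRegExp(query), 'i')
  return options.filter(opt => re.test(opt.text))
}

<Dropdown search={handleSearch} selection options={options} />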

It looks like we don't have an example documented for this, PR for that also very welcome 😊.

patrikmolsson commented 7 years ago

@levithomason I can help out with this!

Just to be clear, would the PR include:

levithomason commented 7 years ago

✅ Make deburr opt-in (should we use deburr, or maybe fuzzy as prop?)
✅ When having deburr, the input string should also be deburred.
✅ Add proper documentation
✅ Add example

If you'd like to include two examples, one for using a custom search function as I've shown and a second for this new opt-in deburr functionality, that'd be superb.
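
As a rough sketch of what the opt-in behaviour listed above might look like: one flag, off by default, which when enabled is applied to both the query and the option text. The `deburr` option name follows the discussion, but its final form is not settled here. `searchOptions` and `deburrLike` are hypothetical names, and `deburrLike` covers only a handful of letters, unlike lodash's `_.deburr`.

```javascript
// Hypothetical stand-in for _.deburr: strips combining marks, then folds a few
// letters that have no Unicode decomposition.
const FOLD = { 'ø': 'o', 'Ø': 'O', 'æ': 'ae', 'Æ': 'Ae' }
const deburrLike = (text) =>
  text
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .replace(/[øØæÆ]/g, (ch) => FOLD[ch])

// Escape regex metacharacters so the query is matched literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

// With deburr off (the default), strings are compared as typed.
// With deburr on, the query and the option text are both folded.
const searchOptions = (options, query, { deburr = false } = {}) => {
  const fold = deburr ? deburrLike : (text) => text
  const re = new RegExp(escapeRegExp(fold(query)), 'i')
  return options.filter((opt) => re.test(fold(opt.text)))
}

const options = [{ text: 'Søren' }, { text: 'Sören' }, { text: 'Café' }]
const texts = (result) => result.map((opt) => opt.text)

const strict = texts(searchOptions(options, 'Sö'))
const folded = texts(searchOptions(options, 'Sö', { deburr: true }))
const cafe = texts(searchOptions(options, 'cafe', { deburr: true }))

console.log(strict.join(', ')) // Sören
console.log(folded.join(', ')) // Søren, Sören
console.log(cafe.join(', '))   // Café
```

With the flag off, "Sö" finds only "Sören", which is the behaviour Scandinavian users expect. With it on, the same query also returns "Søren", which is the trade-off discussed earlier in the thread.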