Change sorting behavior of titles with diacritics

zotero / styles-repo

Zotero styles page

http://www.zotero.org/styles

Change sorting behavior of titles with diacritics #21

Closed rmzelle closed 8 years ago

rmzelle commented 8 years ago

The current sort order seems a little confusing. I assume people expect "Česká" to sort under "C" and "Российский" under "R".

tnajdek commented 8 years ago

I've implemented a partial fix for this (in a PR), and it will correctly put "Česká" under C. However, "Российский" begins with the Cyrillic letter Er, which sorts after all Latin letters. I'd argue this is correct behaviour; the alternative is to transliterate Cyrillic into Latin for sorting purposes.
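As a rough illustration of the diacritic-insensitive ordering described here (not the code from the PR, which is not shown in this thread), a minimal Python sketch could build the sort key by decomposing each title and dropping combining marks. The function name `sort_key` and the sample titles are assumptions made up for the example.

```python
import unicodedata


def sort_key(title: str) -> str:
    """Hypothetical sort key: the title with combining marks removed, case-folded."""
    # NFD splits "Č" into "C" plus a combining caron, so the mark can be dropped.
    decomposed = unicodedata.normalize("NFD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Sample titles invented for illustration only.
titles = ["Российский журнал", "Zeitschrift", "Česká literatura", "Cell", "Dance"]
ordered = sorted(titles, key=sort_key)
print(ordered)
```

With this key, "Česká literatura" lands among the C entries, while the Cyrillic title still sorts after every Latin one because its code points are higher. Note that the key also folds "й" into "и", which a proper collation library would treat more carefully.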

rmzelle commented 8 years ago

> However, "Российский" begins with the Cyrillic letter Er, which sorts after all Latin letters. I'd argue this is correct behaviour; the alternative is to transliterate Cyrillic into Latin for sorting purposes.

@avram, can I ask your opinion on what constitutes the best sorting behavior as our Slavic expert?

(@tnajdek, I think ignoring diacritics for sorting is the more important issue, so thanks for doing that already!)

avram commented 8 years ago

This is a hairy area -- there are standard collations (http://www.unicode.org/reports/tr10/) that cover all of this, and we should be able to lean on them directly from PHP (http://php.net/manual/en/collator.create.php). My concern with transliterating the Cyrillic is that it might hurt searchability for Russian users, and it might force them to have a transliterated style name in the otherwise Russian Zotero UI, which would be annoying.

rmzelle commented 8 years ago

> My concern with transliterating the Cyrillic is that it might hurt searchability for Russian users and it might force them to have a transliterated style name in the otherwise Russian Zotero UI

As I understand it, the proposal is to just use the transliterated style title for sorting. The name that is displayed (and searched against) would stay Cyrillic.

Or is it accepted practice to sort Cyrillic characters after Latin, in which case we can just leave things as is?
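For reference, the sort-only transliteration idea could look roughly like the Python sketch below: the transliterated string is used only as the sort key, and the list still holds the original Cyrillic titles. The `TRANSLIT` table is a simplified, hypothetical mapping for Russian letters (not a standard romanization scheme), and the function name and sample titles are likewise assumptions.

```python
# Hypothetical, simplified Russian-to-Latin table used only to build sort keys.
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def translit_sort_key(title: str) -> str:
    """Return a Latin-only key for sorting; the title itself is never modified."""
    return "".join(TRANSLIT.get(ch, ch) for ch in title.casefold())


# Sample titles invented for illustration only.
styles = ["Zeitschrift", "Российский журнал", "Nature"]
by_translit = sorted(styles, key=translit_sort_key)
print(by_translit)
```

Here "Российский журнал" would sort under R, between "Nature" and "Zeitschrift", while display and search would keep using the unmodified Cyrillic title.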

avram commented 8 years ago

It is accepted to have Cyrillic sort after Latin letters, and I don't see that as problematic.

rmzelle commented 8 years ago

Okay, thanks!

tnajdek commented 8 years ago

Transliteration would only be done for sorting purposes; it wouldn't affect how the style title is displayed. That being said, I agree with @avram: it seems to be accepted practice to sort Cyrillic after Latin.