t-tk / upmendex-package

Source/Document distribution of upmendex --- multilingual index processor

In Hungarian documents, the initial words ö, ő, ü, ű are wrongly grouped #10

Closed: hair-splitter closed this issue 4 months ago

hair-splitter commented 4 months ago

The following example does not work properly:

doc.tex:

\documentclass{article}
\usepackage{fontspec}
\usepackage[hungarian]{babel}
\usepackage[noautomatic]{imakeidx}
\makeindex

\begin{document}
~
\index{oxxx}\index{Oxxx}
\index{óxxx}\index{Óxxx}
\index{oyyy}\index{Oyyy}
\index{óyyy}\index{Óyyy}

\index{öxxx}\index{Öxxx}
\index{őxxx}\index{Őxxx}
\index{öyyy}\index{Öyyy}
\index{őyyy}\index{Őyyy}

\index{uxxx}\index{Uxxx}
\index{úxxx}\index{Úxxx}
\index{uyyy}\index{Uyyy}
\index{úyyy}\index{Úyyy}

\index{üxxx}\index{Üxxx}
\index{űxxx}\index{Űxxx}
\index{üyyy}\index{Üyyy}
\index{űyyy}\index{Űyyy}

\printindex
\end{document}

user.ist:

icu_locale "hu"
lethead_flag 1
lethead_prefix "{\\bfseries "
lethead_suffix "}\\nopagebreak\n"

The compilation steps:

lualatex doc.tex
upmendex -s user.ist doc.idx
lualatex doc.tex

The result:

[image: doc]

Entries beginning with ö, ő, ü, ű are grouped incorrectly.

The expected result would be:

[image: doc-expected]

Is this a mistake, or am I doing something wrong?

I use TeX Live 2024 on Windows 11.

sgolovan commented 4 months ago

I can suggest the following workaround for the grouping: add strength:primary to the list of collation attributes in user.ist:

icu_attributes "strength:primary"

Note that this attribute also alters the final order: it puts uppercase letters before lowercase ones. Another ICU attribute, case-first:lower-first, doesn't seem to help.
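
Combined with the style file from the report, the workaround would give a user.ist along these lines. This is only a sketch: the one added line is icu_attributes, and everything else is copied unchanged from the original file.

```
icu_locale "hu"
icu_attributes "strength:primary"
lethead_flag 1
lethead_prefix "{\\bfseries "
lethead_suffix "}\\nopagebreak\n"
```

To see the effect, rerun upmendex -s user.ist doc.idx and then lualatex doc.tex, as in the compilation steps above.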

hair-splitter commented 4 months ago

This is fantastic. This gives an even better result than what I originally wanted. Thank you very much for your help!

t-tk commented 4 months ago

Thank you for your report. I think I have fixed the Hungarian issue in https://github.com/t-tk/upmendex-package/commit/d95a719859647b5fdc7c279f95810f86129e60dc, and I committed the fix to TeX Live svn as r71719.

hair-splitter commented 4 months ago

Thank you very much. I updated my TeX Live system today, but upmendex hasn't changed, so I haven't been able to try the new version yet.

t-tk commented 4 months ago

The official release of TeX Live is scheduled once a year, and the next one will be available in March 2025. I do not know which environment would let you test the fix earliest.

hair-splitter commented 4 months ago

In addition to the annual reinstallation of TeX Live, I also run the following mid-year update every month, which usually brings in a lot of new package updates:

tlmgr update -self -all -reinstall-forcibly-removed

I thought that upmendex would be updated during the year, like any other package. (For example, dvipdfm.exe was last updated on 2024-07-01 in TeX Live 2024.)