reutenauer / polyglossia

An alternative to Babel for XeLaTeX and LuaLaTeX
http://www.ctan.org/pkg/polyglossia
MIT License

improve error message for missing non-latin-fonts #571

Open u-fischer opened 1 year ago

u-fischer commented 1 year ago

When using a language with a non-Latin script, one normally has to set up a language font. This often must be done for all three families (roman, sans serif, and typewriter). But polyglossia's error message when this is missing for a family is quite misleading:

\documentclass{article}

\usepackage{polyglossia} 
\setmainlanguage{english}
\setotherlanguage{arabic}
\newfontfamily\arabicfont[Script=Arabic]{Amiri}

%\newfontfamily\arabicfontsf[Script=Arabic]{Amiri}
\begin{document}
text \textarabic{رمانية}

\sffamily \textarabic{رمانية}
\end{document}

gives

! Package polyglossia Error: The current latin font Amiri(0) does not contain the "Arabic" script!
(polyglossia)                Please define \arabicfont with \newfontfamily command.

This is quite confusing for users, see e.g. https://tex.stackexchange.com/a/665027/2388 and https://tex.stackexchange.com/q/665326/2388.
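
For reference, a minimal sketch of the setup that avoids the error in the example above. It assumes Amiri is acceptable for all three families; Amiri is not monospaced, so the typewriter assignment is purely illustrative, and any installed font that covers the Arabic script could be substituted.

```latex
\documentclass{article}

\usepackage{polyglossia}
\setmainlanguage{english}
\setotherlanguage{arabic}
% One font per family: roman, sans serif, typewriter.
\newfontfamily\arabicfont[Script=Arabic]{Amiri}
\newfontfamily\arabicfontsf[Script=Arabic]{Amiri}
% Assumption: reusing Amiri for typewriter, for illustration only.
\newfontfamily\arabicfonttt[Script=Arabic]{Amiri}

\begin{document}
text \textarabic{رمانية}

\sffamily \textarabic{رمانية}

\ttfamily \textarabic{رمانية}
\end{document}
```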

jspitz commented 1 year ago

Better now?

Udi-Fogiel commented 7 months ago

I think it still can be improved. Consider the following

\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{arabic}
\setotherlanguage{hebrew}
\setmainfont{Amiri}
\begin{document}
Test

\selectlanguage{hebrew}
Test
\end{document}

Which gives the following error

! Package polyglossia Error: The current latin roman font does not contain the 
"Hebrew" script!
(polyglossia)                Please define \hebrewfont with \newfontfamily command.

I'm not sure why we need to distinguish only between Latin and non-Latin scripts in this manner, or whether we should give an error or a warning, but in this example Amiri is definitely not a Latin font.
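
For completeness, a sketch of the document with the font definition the error message asks for. The font name `David CLM` is an assumption (it requires the Culmus fonts to be installed); any font that covers the Hebrew script should work in its place.

```latex
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{arabic}
\setotherlanguage{hebrew}
\setmainfont{Amiri}
% Assumption: David CLM is installed; replace with any Hebrew-capable font.
\newfontfamily\hebrewfont[Script=Hebrew]{David CLM}
\begin{document}
Test

\selectlanguage{hebrew}
Test
\end{document}
```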

Udi-Fogiel commented 7 months ago

I don't understand a couple of things in \xpg@fontsetup@nonlatin.

The test at https://github.com/reutenauer/polyglossia/blob/55de651eeba716611736fa6cc2c17607c93772b6/tex/polyglossia.sty#L819 is always false, since the comparison is of the strings themselves, not their values. But why should we care in the first place whether the script tag and the language tag are identical?

https://github.com/reutenauer/polyglossia/blob/55de651eeba716611736fa6cc2c17607c93772b6/tex/polyglossia.sty#L820 Why is \rmlatin used in the setup of a non-Latin font?

Compare the following test from the non-Latin setup https://github.com/reutenauer/polyglossia/blob/55de651eeba716611736fa6cc2c17607c93772b6/tex/polyglossia.sty#L821 with the one from the Latin setup https://github.com/reutenauer/polyglossia/blob/55de651eeba716611736fa6cc2c17607c93772b6/tex/polyglossia.sty#L804. Why do we check for the existence of \<langname>font in the Latin case, but for \<script>font in the non-Latin case?
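
To illustrate the first point in general terms: the sketch below is not the polyglossia code at the linked line, and the macro names `\scripttag` and `\langtag` are hypothetical stand-ins. It assumes the etoolbox package and shows how a string test that does not expand its arguments differs from one that compares macro contents.

```latex
\documentclass{article}
\usepackage{etoolbox}
% Hypothetical stand-ins, not polyglossia internals.
\newcommand*\scripttag{arabic}
\newcommand*\langtag{arabic}
\begin{document}
% \ifstrequal does not expand its arguments, so it compares the
% control sequence names themselves: this takes the false branch.
\ifstrequal{\scripttag}{\langtag}{equal}{different}

% \ifdefstrequal compares the replacement texts of the two macros:
% this takes the true branch.
\ifdefstrequal{\scripttag}{\langtag}{equal}{different}
\end{document}
```

With two distinct macro names, the first test can never succeed regardless of what the macros contain, which is the behaviour described above.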

jspitz commented 7 months ago

The Test

https://github.com/reutenauer/polyglossia/blob/55de651eeba716611736fa6cc2c17607c93772b6/tex/polyglossia.sty#L819

is always false, since the comparison is of the strings themselves, not their values. But why should we care in the first place whether the script tag and the language tag are identical?

I suppose because we have already checked for the existence of \<langname>font in the enclosing test (we are now in its "false" branch), so this is probably just a way to leave the condition early (but it's old code, so I can only guess).

https://github.com/reutenauer/polyglossia/blob/55de651eeba716611736fa6cc2c17607c93772b6/tex/polyglossia.sty#L820

Why is \rmlatin used in the setup of a non-Latin font?

As a fallback since no other font is being defined?

Why do we check for the existence of \<langname>font in the Latin case, but for \<script>font in the non-Latin case?

Note that in the second case, we check for both, \<langname>font and \<script>font (if the former is not defined). I think it does not make sense to check for \<script>font in the latin case, as probably nobody will use \latinfont to refer to the Latin script (and not the Latin language).

jspitz commented 7 months ago

I think it still can be improved. Consider the following

\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{arabic}
\setotherlanguage{hebrew}
\setmainfont{Amiri}
\begin{document}
Test

\selectlanguage{hebrew}
Test
\end{document}

Which gives the following error

! Package polyglossia Error: The current latin roman font does not contain the 
"Hebrew" script!
(polyglossia)                Please define \hebrewfont with \newfontfamily command.

AFAICS the more appropriate wording would be

The current main roman font does not contain the 
"Hebrew" script!

No?

jspitz commented 7 months ago

Please try again after c72e957283b030. The error message now reads:

! Package polyglossia Error: The current main roman font, Amiri, does not conta
in the "Hebrew" script!
(polyglossia)                Please define \hebrewfont with \newfontfamily comm
and.

jspitz commented 7 months ago

The latest attempt didn't report correctly for non-rm fonts. Fixed in ba3b2fdf2fdbe.

Udi-Fogiel commented 6 months ago

The message is much better now, although I'm not sure this is the best approach. IIRC regular expressions are really slow in LaTeX.

jspitz commented 6 months ago

The regex is only used if this particular error occurs, so I think this is bearable (but feel free to improve it, of course).

Udi-Fogiel commented 1 month ago

The message can be improved with LuaTeX

\documentclass{article}

\usepackage{polyglossia}
\setmainlanguage{arabic}
\setmainfont{DavidCLM}

\begin{document}
Test
\end{document}

! Package polyglossia Error: The current main roman font, name, does not
(polyglossia)                contain the "Arabic" script!
(polyglossia)                Please define \arabicfont with \newfontfamily
(polyglossia)                command

Let's do that after the next release, though.