latex3 / babel

The multilingual framework to localize LaTeX, LuaLaTeX and XeLaTeX
https://latex3.github.io/babel/
LaTeX Project Public License v1.3c

Bold small caps with Lualatex #92

Closed jbezos closed 1 year ago

jbezos commented 4 years ago

See https://tex.stackexchange.com/questions/558712/bold-small-caps-with-lualatex .

\documentclass{article}

\usepackage[italian]{babel}

% \babelfont{rm}{EB Garamond}           % Works
\babelfont[italian]{rm}{EB Garamond}    % Doesn't work

\begin{document}

Quel \textbf{ramo del \textsc{lago} di Como}, che volge a \textbf{\textsc{mezzogiorno}}

\end{document}

It seems to happen when there is no \babelfont{rm}{...} (ie, a ‘global’ declaration for all languages).

EDIT (Findings). By selecting the main language (which forces the actual definition of the font) just before \init@series@setup, it works, with:

\toks@\expandafter{\init@series@setup}
\edef\init@series@setup{%
  \noexpand\selectlanguage{\bbl@main@language}\the\toks@}
jbezos commented 3 years ago

The problem is related to the new \init@series@setup, which triggers a change in the default bf from b to bx if the ‘current font’ is in a hardcoded list of families. One of these families is the default font lmr (the full list is cmr,cmss,cmtt,lcmss,lcmtt,lmr,lmss,lmtt). If the font when the document starts is another one (eg, CMU Serif or Arial), it works as expected. So, now the question is why LaTeX or fontspec (I'm not sure) is doing this change at this point.

FrankMittelbach commented 3 years ago

So, now the question is why LaTeX or fontspec (I'm not sure) is doing this change at this point.

LaTeX. This happens because for the historical CM-based fonts LaTeX always used bx for the bold series, so a document using them as its document font initializes the series to the traditional default. Otherwise four decades of documents would change their appearance if reprocessed.
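A minimal sketch for observing the series that \bfseries selects with the default document font. The helper \showseries is a hypothetical name introduced only for this illustration; it writes the current NFSS family and series to the log.

```latex
\documentclass{article}

\makeatletter
% Hypothetical helper: report the current NFSS family and series in the log.
\newcommand\showseries{\typeout{family=\f@family, series=\f@series}}
\makeatother

\begin{document}

\showseries            % before switching to bold
{\bfseries\showseries  % after switching to bold
 bold text}

\end{document}
```

Running the same file after selecting a non-CM/LM document font should make it possible to compare which bold series is chosen in each case.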

jbezos commented 3 years ago

At last I was able to reproduce this behaviour without babel:

\documentclass{article}

\usepackage{fontspec}

\begin{document}

\renewfontfamily\rmfamily{EB Garamond}
% \def\rmdefault{EBGaramond(0)}   % Wrong if not commented out
\rmfamily

\textbf{\bfseries ramo del \textsc{lago} di Como}

\end{document}

Formerly things like this worked.

FrankMittelbach commented 3 years ago

Sorry, fail to see what is wrong in your example output. If I run it verbatim through luatex (only adding \showoutput) I get

....\hbox(0.0+0.0)x15.0, direction TLT
....\TU/EBGaramond(0)/b/n/10 r
....\TU/EBGaramond(0)/b/n/10 a
....\TU/EBGaramond(0)/b/n/10 m
....\TU/EBGaramond(0)/b/n/10 o
....\glue(\spaceskip) 2.37 plus 1.185 minus 0.79
....\TU/EBGaramond(0)/b/n/10 d
....\kern-0.16 (font)
....\TU/EBGaramond(0)/b/n/10 e
....\TU/EBGaramond(0)/b/n/10 l
....\kern0.0 (italic)
....\glue 2.37 plus 1.185 minus 0.79
....\TU/EBGaramond(0)/b/sc/10 󰆇
....\TU/EBGaramond(0)/b/sc/10 󰅌
....\kern-0.12 (font)
....\TU/EBGaramond(0)/b/sc/10 󰅯
....\TU/EBGaramond(0)/b/sc/10 󰆕
....\kern0.0 (italic)
....\glue(\spaceskip) 2.37 plus 1.185 minus 0.79
....\TU/EBGaramond(0)/b/n/10 d
....\TU/EBGaramond(0)/b/n/10 i
....\glue(\spaceskip) 2.37 plus 1.185 minus 0.79
....\TU/EBGaramond(0)/b/n/10 C
....\TU/EBGaramond(0)/b/n/10 o
....\TU/EBGaramond(0)/b/n/10 m
....\TU/EBGaramond(0)/b/n/10 o

Can't see anything wrong with that, and it looks like what I would expect (even though I don't know why there is \bfseries inside \textbf :-) ), and personally I consider font changes like that in mid document as bad.

Am I missing something or are you using an old format? (or some old fontspec?)

jbezos commented 3 years ago

This is what I'm getting:

[Two screenshots of the output: ebg-com and ebg-uncom]

FrankMittelbach commented 3 years ago

Ok, sorry my fault, I did some experiments and there was a change in my version after all.

But the problem boils down to this: to "renew" something is always dangerous. LaTeX has 3 meta families, "rm", "sf" and "tt". In fontspec you set them via \setmainfont etc. The \newfontfamily was meant to provide additional font family commands but not to alter the 3 meta families. So doing \renewfontfamily\rmfamily overwrites \rmfamily with a simple bit of code for selecting families but not setting up anything needed for the meta families. The kernel def of \rmfamily is

> \rmfamily=robust macro:
->\protect \rmfamily  .

> \rmfamily =\long macro:
->\not@math@alphabet \rmfamily \mathrm \prepare@family@series@update {rm}\rmdef
ault \UseHook {rmfamily}\selectfont .
<argument> \rmfamily  

l.8 \ShowCommand\rmfamily

After the \renewfontfamily that becomes just a simple font switch missing out the font series adjustment for the meta family

\prepare@family@series@update {rm}\rmdefault 

which is responsible to detect \rmdefault changes and adjust things (which is causing your difference).

If you use this new definition of \rmfamily but you do not change \rmdefault, then the default is still "lmr", so when \bfseries is executed it compares the current family (EBGaramond(0)) with lmr and concludes that this is not the main document font, so it doesn't use the bold series for "rm" but uses the default bf series from \bfdefault (which is "b").

But if you additionally set \rmdefault to Garamond then it concludes that this is the document rm font and so applies the bold series for rm, which is still "bx", hence the difference. So with

 \def\rmdefault{EBGaramond(0)}   % Wrong if not commented out

you finally change the document font but you don't adjust the rest of the machinery (what is done by \setmainfont)

If you also add

\DeclareFontSeriesDefault[rm]{bf}{b}

then your example works, but it is still wrong in the sense that, for example, the hooks that should be run by \rmfamily (and which are needed for Japanese and Chinese) are no longer part of the definition due to the renew overwrite.

Basically \rmfamily, \sffamily or \ttfamily should only be changed through \setmainfont and the like but not with \renewfontfamily which is really only there for changing families defined with \newfontfamily --- I think fontspec doc should explicitly state this @wspr !

(Mis)using \renewfontfamily for them may have worked in the past (more or less) because the 3 meta families did less, but it no longer does, and it was always wrong because even back then there was some special code in them that got dropped by the overwrite.
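A minimal sketch of the division of labour described above: \setmainfont for the "rm" meta family, and \newfontfamily only for an additional family. The command name \displayfont and the choice of Noto Sans are assumptions made for illustration, and both fonts must be installed.

```latex
% !TEX program = lualatex
\documentclass{article}

\usepackage{fontspec}

% Sets up the rm meta family, leaving the kernel definition of \rmfamily intact.
\setmainfont{EB Garamond}

% An additional family; \displayfont is a hypothetical name.
\newfontfamily\displayfont{Noto Sans}

\begin{document}

\textbf{ramo del \textsc{lago} di Como}

{\displayfont Text in the additional family}

\end{document}
```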

jbezos commented 3 years ago

In a multilingual document this is a severe limitation. I'll investigate how to overcome it.

FrankMittelbach commented 3 years ago

use \setmainfont instead of \renewfontfamily if you alter \rmfamily ? and similar for \sffamily or \ttfamily

u-fischer commented 3 years ago

I can't test now, but imho using setmainfont all the time would be quite slow and constantly create new nfss font.

FrankMittelbach commented 3 years ago

As far as I can see, switching fonts back and forth via \setmainfont ends up with the same EBGaramond(0). And the number of lines in tracingall in \setmainfont (on repeated use) compared to \renewfontfamily is comparable (high, as usual in fontspec).

So fontspec is clever enough not to do that (and I think the machinery when using \renewfontfamily is the same so that would then happen there too). But on the whole it would be much more efficient to have the fonts declared in the preamble (for different languages) and when changing using NFSS interfaces instead of a fontspec call which in itself is rather costly.
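A sketch of the "declare in the preamble, switch via NFSS" approach mentioned above, assuming fontspec's NFSSFamily option is used to give the family a predictable name. The names \garamondfont and ebgaramond are chosen here for illustration only.

```latex
% !TEX program = lualatex
\documentclass{article}

\usepackage{fontspec}

% Declare the font once in the preamble under a fixed NFSS family name.
% Both \garamondfont and the family name "ebgaramond" are assumptions.
\newfontfamily\garamondfont{EB Garamond}[NFSSFamily=ebgaramond]

\begin{document}

Text in the default document font.

% Later switches use plain NFSS commands rather than another fontspec call.
{\fontfamily{ebgaramond}\selectfont Text in EB Garamond.}

\end{document}
```

The point of the fixed name is that the switch inside the document no longer depends on the automatically generated family name (such as EBGaramond(0)).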

jbezos commented 3 years ago

In a quick test I just mapped \TU/EBGaramond(0)/bx/sc/10 -> \TU/EBGaramond(0)/b/sc/10 and it worked!
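The thread does not show how this mapping was done. One way such a substitution could be written with standard NFSS declarations is sketched below; the fixed family name ebg (set through fontspec's NFSSFamily option) is an assumption used to avoid relying on the generated name EBGaramond(0).

```latex
% !TEX program = lualatex
\documentclass{article}

\usepackage{fontspec}

% "ebg" is an assumed family name, fixed so it can be referred to below.
\setmainfont{EB Garamond}[NFSSFamily=ebg]

% Silently substitute bx/sc by b/sc in the TU encoding.
\DeclareFontShape{TU}{ebg}{bx}{sc}{<->ssub * ebg/b/sc}{}

\begin{document}

{\fontseries{bx}\fontshape{sc}\selectfont lago}

\end{document}
```

As the later comments point out, a substitution like this only papers over the request for "bx"; it does not repair an overwritten \rmfamily.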

jbezos commented 3 years ago

I don't think the solution is \setmainfont, but to make sure each font uses the ‘correct’ bold identifier, which is not always a family property, but a font property.

FrankMittelbach commented 3 years ago

On 16.10.20 at 14:45, Javier Bezos wrote:

In a quick test I just mapped \TU/EBGaramond(0)/bx/sc/10 -> \TU/EBGaramond(0)/b/sc/10 and it worked!

sure that gives you the needed substitution in that case but \rmfamily is still broken if you overwrite its definition. This \renewfontfamily is not really better than using \renewcommand on something and replace it with something not doing the same afterwards. That is why I'm saying it was meant for changing newly defined fontfamilies (done by \newfontfamily)

jbezos commented 3 years ago

I think I've found a bug: EDIT Maybe not. The last name for a font wins, and very likely \textsc is defining \TU/EBGaramond(0)/bx/sc/10 as \TU/EBGaramond(0)/bx/n/10.

\documentclass{article}

\usepackage{fontspec}

\def\pf{ (\expandafter\string\the\font) }

\begin{document}

\renewfontfamily\rmfamily{EB Garamond}
\def\rmdefault{EBGaramond(0)}
\rmfamily

\textbf{\bfseries\pf ramo del \textsc{lago} di Como\pf}

\end{document}

The last \pf shows \TU/EBGaramond(0)/bx/sc/10.

jbezos commented 3 years ago

I don't think this issue is directly related to how \rmfamily is (re)defined, but about how bold is handled. Apparently, both \TU/EBGaramond(0)/b/n/10 and \TU/EBGaramond(0)/bx/n/10 are defined, and so are \TU/EBGaramond(0)/b/it/10 and \TU/EBGaramond(0)/bx/it/10, for example, and also \TU/EBGaramond(0)/b/sc/10, but not \TU/EBGaramond(0)/bx/sc/10. So, bold + italic works, but not bold + smallcaps. I'm not still sure, but it's like fontspec had missed this combination. What do you think, @wspr ?

FrankMittelbach commented 3 years ago

Javier, in my opinion the bug is that you overwrite \rmfamily with something totally different, and that is not supported, as I explained. As we say in Germany, "aus Falschem folgt Beliebiges" (from a false premise anything follows): add another \pf inside \textsc and you will see that it is bx/sc there too.

A simpler example

\documentclass{article}

%\usepackage{fontspec}

\def\pf{ (\expandafter\string\the\font) \expandafter\meaning\the\font}

\begin{document}

A \pf

\fontshape{notfound}\selectfont

B \pf

\fontseries{m}\selectfont

C \pf
\end{document}

But in fact, as that font is not found, normal Garamond is used, and that changes the label under which TeX reports the external font ... this is a nuisance and sometimes really confusing, but nothing I ever found being reliably fixable.

jbezos commented 3 years ago

Yes, I edited my post. And actually when I realized why this name was shown, then also realized \TU/EBGaramond(0)/bx/sc/10 was not defined. See the next post ( https://github.com/latex3/babel/issues/92#issuecomment-710073291 ).

FrankMittelbach commented 3 years ago

I don't think this issue is directly related to how \rmfamily is (re)defined, but about how bold is handled. Apparently, both \TU/EBGaramond(0)/b/n/10 and \TU/EBGaramond(0)/bx/n/10 are defined, and so are \TU/EBGaramond(0)/b/it/10 and \TU/EBGaramond(0)/bx/it/10, for example, and also \TU/EBGaramond(0)/b/sc/10, but not \TU/EBGaramond(0)/bx/sc/10. So, bold + italic works, but not bold + smallcaps. I'm not still sure, but it's like fontspec had missed this combination. What do you think, @wspr ?

In the early days of NFSS font substitutions were a sensible approach, as they were only necessary for b/bx in a few fonts. But with modern fonts with many more faces this is rather getting out of hand, and you are likely to overwrite an existing fontface with some fake substitution. Which is why I think that fontspec's setup is already now doing too much there.

This is why high-level series commands now look at what meta family they are in (rm/sf/tt) and then pick up the appropriate bold or "md" for that family. This allows using different bolds next to each other if that is appropriate for the fonts used (e.g. combining different bold levels with each other). None of that is possible with substitutions.

FrankMittelbach commented 3 years ago

I don't think the solution is \setmainfont, but to make sure each font uses the ‘correct’ bold identifier, which is not always a family property, but a font property.

again: part of the correct solution is not to break \rmfamily by replacing it with something that stops LaTeX from using the correct bold identifier, and I disagree that this is a fontface and not a font family identifier --- the substitutions are there if the family has a strange setup so that the bold changes within the family, e.g., if everything is using "bx" in the family except for the sc shape that only exists in "b", or a similar strange setup.

jbezos commented 3 years ago

This is why high-level series commands now look at what meta family they are in (rm/sf/tt) and then pick up the appropriate bold or "md" for that family.

A multilingual document has several ‘main’ fonts, and something like \normalfont must select the correct font for the current language, as for example the default one in German, Amiri in Arabic and FreeSerif in Thai. So, \rmfamily cannot be a single font. We must deal somehow with conceptually single families based on different ‘physical’ fonts.

That's basically what babel is trying to do, and pretty well, I think 😉, because in most cases there are no problems. For the usual combinations md, md+it, bf, bf+it it currently works, and also for some others. But bf+sc sometimes fails (not always - if there is a generic \babelfont{rm}{...} it works as expected in all tests I've done).
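A sketch of the setup reported above as working, namely a generic \babelfont{rm}{...} declaration alongside the language-specific one. It simply combines the two lines from the original test file; no behaviour beyond what is reported in this thread is claimed.

```latex
% !TEX program = lualatex
\documentclass{article}

\usepackage[italian]{babel}

\babelfont{rm}{EB Garamond}           % generic declaration for all languages
\babelfont[italian]{rm}{EB Garamond}  % language-specific declaration

\begin{document}

Quel \textbf{ramo del \textsc{lago} di Como}, che volge a
\textbf{\textsc{mezzogiorno}}

\end{document}
```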

There are alternatives, of course. An option would be to convert each meta family in a sort of ‘font collection’ inside LaTeX itself (I'm not sure how to do it), or with the help of combo fonts (in lua, but not in xe), or any other thing.

By the way, babel doesn't touch \rmfamily. I used it in the minimal non-babel example just for convenience. Sorry for not being clear about this point, but I didn't think the discussion would deviate towards this macro.

jbezos commented 3 years ago

I disagree that this is a fontface and not a font family identifier

Well, something we disagree on 🙂.

FrankMittelbach commented 3 years ago

A multilingual document has several ‘main’ fonts, and something like \normalfont must select the correct font for the current language, as for example the default one in German, Amiri in Arabic and FreeSerif in Thai. So, \rmfamily cannot be a single font. We must deal somehow with conceptually single families based on different ‘physical’ fonts.

I never said that it should be a single font. \rmfamily is not a font; it selects a font family under certain conditions, and I fully understand that in a multilingual setting it needs to be able to change with language. What I'm objecting to is overwriting the definition of \rmfamily (which is code to select a font family and arrange for its use) with something which is just a font call (which is what \newfontfamily provides).

For example, in Japanese environments \rmfamily automatically sets up a matching Kanji font beside the "roman" Latin font.

By the way, babel doesn't touch \rmfamily. I used it in the minimal non-babel example just for convenience. Sorry for not being clear about this point, but I didn't think the discussion would deviate towards this macro.

Then it would be interesting to see what babel does, because if \rmfamily is not touched and all you do is alter \rmdefault back and forth, then in the newer NFSS all you need to do is a little bit more, namely

  • also alter the "bf" series value to go with that new \rmdefault

but that only works if \rmfamily has its code intact.

A possible alternative is to work similar to the Japanese and add some additional code to \rmfamily through the hook mechanism that does special setups based on current language.
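A minimal sketch of attaching code to \rmfamily through its hook (the rmfamily hook appears in the kernel definition shown earlier in this thread). The hook body here only writes tracing information to the log; what a language-specific setup would actually have to do is not specified in the discussion, so that part is left open.

```latex
\documentclass{article}

\usepackage[italian]{babel}

\makeatletter
% Runs every time \rmfamily is executed. A real setup would place
% language-dependent font adjustments here instead of tracing output.
\AddToHook{rmfamily}{%
  \typeout{rmfamily hook: language=\languagename, family=\f@family}}
\makeatother

\begin{document}

\rmfamily Quel ramo del lago di Como

\end{document}
```

This only works while \rmfamily keeps its kernel definition; after a \renewfontfamily\rmfamily the \UseHook call is gone and the hook never fires.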

FrankMittelbach commented 3 years ago

I disagree that this is a fontface and not a font family identifier

Well, something we disagree on 🙂.

well show me a font family where this is not true (there are some but not many and for those substitution is the right way).

For example, in the earlier example the problem only arises because Garamond really only has "b", so there should never be a request for "bx"; it only works most of the time because fontspec, for historical reasons, provides such bx->b substitutions.

But what happens if you use Noto semibold?

jbezos commented 3 years ago
  • but also alter the "bf" series value to go with that new \rmdefault [snip] A possible alternative is to work similar to the Japanese and add some additional code to \rmfamily through the hook mechanism that does special setups based on current language.

I'm doing some testing with those ideas and the results are promising (actually, it works!). It seems this is what I was looking for, particularly the first point, which is the key point, namely, now the series must be (re)set when the fonts are switched (but I have to study what happens when the user changes the default setup). Thank you. 🎉🚀

FrankMittelbach commented 3 years ago

Javier

I'm doing some testing with those ideas and the results are promising (actually, it works!). It seems this is what I was looking for, particularly the first point, which is the key point, namely, now the series must be (re)set when the fonts are switched (but I have to study what happens when the user changes the default setup). Thank you. 🎉🚀

The idea is that the extended setup (including the hooks added in various places) should support multilingual or multi-script typesetting. It might be that there is still something missing, and if so one needs to check what, and how best to accommodate it (or whether a slightly different approach at your end fixes it).

So take a look (also perhaps what the Japanese have done) and what the hooks can do for your use cases or if there is something missing.

If you end up using a lot of \DeclareFontSeriesDefault[rm]{bf}{b} type of calls, then perhaps that needs a speedy interface not involving optional argument parsing. But I suspect in the end we want a data structure (table) with metafamily/language -> NFSS attributes (and I guess in some way you already have that) and then have a fast way to query that table and set up NFSS fast and easy.
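A rough sketch of what such a table could look like, using one control sequence per language/metafamily pair. The macros \DeclareLanguageBold and \UseLanguageBold and the internal names \langbf@... are hypothetical, invented here for illustration; they are not part of babel, fontspec or the kernel, and the sketch still goes through \DeclareFontSeriesDefault rather than a faster interface.

```latex
\documentclass{article}

\makeatletter
% Hypothetical table entry: \langbf@<language>@<metafamily> holds the series.
\newcommand\DeclareLanguageBold[3]{% #1 language, #2 rm/sf/tt, #3 series
  \@namedef{langbf@#1@#2}{#3}}

% Look up an entry and, if it exists, pass its value on to the NFSS.
% The \edef expands the stored value before the declaration is executed.
\newcommand\UseLanguageBold[2]{% #1 language, #2 rm/sf/tt
  \@ifundefined{langbf@#1@#2}{}{%
    \begingroup
    \edef\x{\endgroup
      \noexpand\DeclareFontSeriesDefault[#2]{bf}%
        {\csname langbf@#1@#2\endcsname}}%
    \x}}
\makeatother

\DeclareLanguageBold{italian}{rm}{b}
\DeclareLanguageBold{italian}{sf}{sb}

% Called once in the preamble here; a real implementation would
% presumably query the table whenever the language changes.
\UseLanguageBold{italian}{rm}

\begin{document}

\textbf{ramo del lago di Como}

\end{document}
```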

jbezos commented 3 years ago

After many afternoons and hours trying to fix this issue, reading ‘source2e’ several times trying to understand the logic behind the new way to deal with bfseries (10 pages, with tricks based on \@empty's inside macros), and about 15 attempts, I've decided to give up 😖.

Any help will be greatly appreciated. In the meanwhile, I’ll document it as ‘known issue’, with perhaps some workaround.

FrankMittelbach commented 3 years ago

ok I'll give it a try at some point (no promise though)

jbezos commented 3 years ago

Thank you. Maybe I'm too perfectionist and I'm trying to do too much, in an attempt to consider every possible case.

FrankMittelbach commented 3 years ago

Try this (but adjust maybe -- see comment)

% !TEX program = lualatex

% some tracing to better understand what babel does ...
%% ----------------
\makeatletter
\def\showit#1{\string#1 = \expandafter\detokenize\expandafter{#1}!^^J}
\AddToHook{expand@font@defaults}{%
  \typeout{%
    ======= expand@font@defaults =====  \on@line^^J%
    \showit\rmdef@ult
    \showit\bfseries@rm
    \showit\bfdef@ult
    \showit\mddef@ult
}}

\AddToHook{bfseries}{%
  \typeout{%
    ======= bfseries =====  \on@line^^J%
    \showit\f@series
}}
\makeatother
%% ----------------

\documentclass{article}

\usepackage[italian]{babel}
%\usepackage{trace,structuredlog}

% Instead of changing \rmfamily to load a font and then changing it back,
% use a tmp cs to load the font (all we want to know is the internal name it gets).
% Finally adjust \bfseries@xx based on the type. This could be done better, as somewhere
% the first mandatory arg to \babelfont is known, so it could just be
%
%  \expandafter\gdef\csname bfseries@<something>\endcsname{b}
%

\makeatletter
\def\bbl@fontspec@set#1#2#3#4{% eg \bbl@rmdflt@lang fnt-opt fnt-nme \xxfamily
  \let\bbl@tempe\bbl@mapselect
  \let\bbl@mapselect\relax
%% ----------------
%%  \let\bbl@temp@fam#4%       eg, '\rmfamily', to be restored below
%%  \let#4\@empty      %       Make sure \renewfontfamily is valid
%%
  \let\bbl@temp@fam\@empty%     Make sure \renewfontfamily is valid 
%%  
%% ----------------
  \bbl@exp{%
    \let\\\bbl@temp@pfam\<\bbl@stripslash#4\space>% eg, '\rmfamily '
    \<keys_if_exist:nnF>{fontspec-opentype}{Script/\bbl@cl{sname}}%
      {\\\newfontscript{\bbl@cl{sname}}{\bbl@cl{sotf}}}%
    \<keys_if_exist:nnF>{fontspec-opentype}{Language/\bbl@cl{lname}}%
      {\\\newfontlanguage{\bbl@cl{lname}}{\bbl@cl{lotf}}}%
%% ----------------
%%    \\\renewfontfamily\\#4%
%%
    \\\renewfontfamily\\\bbl@temp@fam
%%    
%% ----------------
      [\bbl@cs{lsys@\languagename},#2]}{#3}% ie \bbl@exp{..}{#3}
  \begingroup
     \bbl@temp@fam
     \xdef#1{\f@family}%     eg, \bbl@rmdflt@lang{FreeSerif(0)}
  \endgroup
%%  \let#4\bbl@temp@fam
%% ----------------
  \ifx#4\rmfamily
    \gdef\bfseries@rm{b}%   babel seems to run all this in a group
  \else\ifx#4\sffamily
    \gdef\bfseries@sf{b}%
  \else\ifx#4\ttfamily
    \gdef\bfseries@tt{b}%
  \else
    \ERRORshouldnothappen
  \fi
  \fi
  \fi
%% ----------------
  \bbl@exp{\let\<\bbl@stripslash#4\space>}\bbl@temp@pfam
  \let\bbl@mapselect\bbl@tempe}%
\makeatother

% \babelfont{rm}{EB Garamond}           % Works
\babelfont[italian]{rm}{EB Garamond}    % Doesn't work

\begin{document}

  Quel \textbf{ramo del
    \textsc{Small Caps} di Como},
    che volge \textbf{a \textsc{mezzogiorno}}

\end{document}

However, in some sense hardwiring "b" or "\bfdefault" is insufficient.

You can, for example, do

\setsansfont{Noto Sans}
   [ FontFace = {sb}{n}{Noto Sans SemiBold} ]
\DeclareFontSeriesDefault[sf]{bf}{sb}  % semibold

which means \sffamily will use the semibold sans (while \rmfamily will still use normal bold). For \babelfont there is no way to say that Italian sans should use semibold; it is now hardwired to "b".

However, I don't think you should extend the syntax right now. This needs some careful thought what is actually best. @wspr 's fontspec is not optimal here either given the new NFSS setup possibilities I would say.

jbezos commented 3 years ago

However, in some sense hardwiring "b" or "\bfdefault" is insufficient.

Definitely.

However, I don't think you should extend the syntax right now.

Agreed.

FrankMittelbach commented 3 years ago

However, in some sense hardwiring "b" or "\bfdefault" is insufficient.

Definitely.

that's what fontspec does too, and it dates back to the days when anything else was not sensible. So for now it is, I think, the appropriate solution, but between @wspr, you and myself we should circle back on that at some point to see how to make this work best for multilingual documents with more complex font setups.