latex3 / babel

The babel system for LaTeX, LuaLaTeX and XeLaTeX
LaTeX Project Public License v1.3c

Hyphenation error with Tibetan and lualatex #261

Closed frederik-elwert closed 9 months ago

frederik-elwert commented 10 months ago

I have a multilingual document that contains some Tibetan passages. When compiling with lualatex, this document breaks with a fatal error. The log hints at a babel issue.

Interestingly, the document compiles fine with lualatex on my Ubuntu 22.04 host (which ships TeX Live 2022/dev/Debian), but it breaks in an Ubuntu 22.04 Docker container that installs TeX Live 2023 via tlmgr. So the issue might have been introduced between these versions.
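Since the report hinges on a version difference, it may help to record exactly which babel release each environment loads. The following is a minimal sketch, assuming nothing beyond the LaTeX kernel and babel: `\listfiles` makes LaTeX print a file list with version dates at the end of the log, so the logs from the two installations can be compared directly.

```latex
% Version check: compile with lualatex in both environments and
% compare the "*File List*" section at the end of each .log file.
\listfiles
\documentclass{article}
\usepackage[english]{babel}
\begin{document}
Version check.
\end{document}
```

This only identifies the versions involved; it does not by itself show which change caused the failure.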

Also, the document compiles fine with xelatex under both versions (although the line does not break nicely there, which is a separate issue).

The issue seems to be tied to the first space in the paragraph marked "breaks" (the one following གཏབ།་): if I remove it, the document also compiles under TeX Live 2023. The second space, before the closing ༎།, seems to have no effect.

Minimal example:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{Noto Serif}
\setsansfont{Noto Sans}
\usepackage{babel}
\babelprovide[main,import]{english}
\babelfont{rm}[Language=Default]{Noto Serif}
\babelprovide[import]{tibetan}
\babelfont[tibetan]{rm}[Language=Default]{Noto Serif Tibetan}
\begin{document}

\begin{otherlanguage}{tibetan}

% works
འཁོར་ལོ་བདེ་མཆོག་ཏེ་ལོ་པའི།་ཕྱག་རྒྱ་ཆེན་པོའི་ས་བཅད་འདི།་རང་བྱུང་རོལ་པའི་རྡོ་རྗེས་གཏབ།་ཀུན་གྱིས་ཕྱག་ཆེན་རྟོགས་པར་ཤོག་བཀྲ་ཤིས་དཔལ་འབར་འཛམ་གླིང་རྒྱན་དུ་ཤོག ༎།

% breaks
འཁོར་ལོ་བདེ་མཆོག་ཏེ་ལོ་པའི།་ཕྱག་རྒྱ་ཆེན་པོའི་ས་བཅད་འདི།་རང་བྱུང་རོལ་པའི་རྡོ་རྗེས་གཏབ།་ ཀུན་གྱིས་ཕྱག་ཆེན་རྟོགས་པར་ཤོག་བཀྲ་ཤིས་དཔལ་འབར་འཛམ་གླིང་རྒྱན་དུ་ཤོག ༎།

\end{otherlanguage}

\end{document}

Error message with TeX Live 2023:

This is LuaHBTeX, Version 1.17.0 (TeX Live 2023) 
 restricted system commands enabled.
(/articles/test_minimal_breaks.tex
LaTeX2e <2022-11-01> patch level 1
 L3 programming layer <2023-05-22>
(/opt/texlive/texdir/texmf-dist/tex/latex/base/article.cls
Document Class: article 2022/07/02 v1.4n Standard LaTeX document class
(/opt/texlive/texdir/texmf-dist/tex/latex/base/size10.clo))
(/opt/texlive/texdir/texmf-dist/tex/latex/fontspec/fontspec.sty
(/opt/texlive/texdir/texmf-dist/tex/latex/l3packages/xparse/xparse.sty
(/opt/texlive/texdir/texmf-dist/tex/latex/l3kernel/expl3.sty
(/opt/texlive/texdir/texmf-dist/tex/latex/l3backend/l3backend-luatex.def)))
(/opt/texlive/texdir/texmf-dist/tex/latex/fontspec/fontspec-luatex.sty
(/opt/texlive/texdir/texmf-dist/tex/latex/base/fontenc.sty)
(/opt/texlive/texdir/texmf-dist/tex/latex/fontspec/fontspec.cfg)))
(/opt/texlive/texdir/texmf-dist/tex/generic/babel/babel.sty
(/opt/texlive/texdir/texmf-dist/tex/generic/babel/luababel.def)
(/opt/texlive/texdir/texmf-dist/tex/generic/babel/luababel.def)
(/opt/texlive/texdir/texmf-dist/tex/generic/babel/nil.ldf))
(/opt/texlive/texdir/texmf-dist/tex/generic/babel/locale/en/babel-english.tex)
(/opt/texlive/texdir/texmf-dist/tex/generic/babel/locale/bo/babel-tibetan.tex)
(/opt/texlive/texdir/texmf-dist/tex/generic/babel/locale/bo/babel-tibetan.tex

Package babel Warning: Tibetan line breaking and justification are tentative.
(babel)                They might not work as expected and their behavior
(babel)                could change in the future. Feel free to contribute.
(babel)                Reported on input line 19.

) (./test_minimal_breaks.aux

Package babel Info: The following font families will use the default
(babel)             settings for all or some languages:
(babel)             * \sffamily = NotoSans(0)
(babel)               NotoSans:mode=node;script=latn;language=dflt;+tlig;
(babel)             * \ttfamily = lmtt
(babel)               [lmmono10-regular]:
(babel)             There is nothing intrinsically wrong with it, but
(babel)             'babel' will no set Script and Language, which could
(babel)             be relevant in some languages. If your document uses
(babel)             these families, consider redefining them with \babelfont.
(babel)             Reported on input line 3.

) (/opt/texlive/texdir/texmf-dist/tex/latex/base/ts1cmr.fd)
Overfull \hbox (10.2201pt too wide) in paragraph at lines 15--16
[]\TU/NotoSerifTibetan(0)/m/n/10 འཁོར ་ ལོ ་ བདེ ་ མ
ཆོག ་ ཏེ ་ ལོ ་ པའི། ་ ཕ󰘆ག ་ རྒྱ 
་ ཆེན ་ པ󰒤འ󰒞 ་ ས ་ བཅད ་ འདི། ་ 
རང ་ བ󰘆󰒻ང ་ རོལ ་ པའི ་ ར󰕲ོ ་ ར
󰔒ེས ་ གཏབ། ་ ཀ󰒻ན ་ ག󰘆ིས ་

warning  (hyphenation): bad specification: ...texdir/texmf-dist/tex/generic/bab
el/babel-transforms.lua:341: You cannot set field char in a node of type glue
.
<argument> ...ype:D \tex_hskip:D \c_zero_dim \fi: \tex_par:D 
                                                  \hook_use:n {para/after}\@...

l.19 

 4199 words of node memory still in use:
   6 hlist, 1 rule, 2 local_par, 1 dir, 128 glue, 3 kern, 59 penalty, 380 glyph
, 17 attribute, 48 glue_spec, 10 attribute_list, 1 temp, 3 write nodes
   avail lists: 2:1,4:1,5:7
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on test_minimal_breaks.log.
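The key line in the log is "You cannot set field char in a node of type glue": Lua code in babel-transforms.lua tried to assign a character to a node that is a glue (space) rather than a glyph. The sketch below is not babel's code; it is a hypothetical stand-alone illustration of that class of LuaTeX error, using only `node.new`, `pcall`, `texio.write_nl` and `node.free`. The variable names are made up for this example.

```latex
% Compile with lualatex. Illustrates the class of error seen in the log;
% this is NOT taken from babel-transforms.lua.
\documentclass{article}
\begin{document}
\directlua{
  local g = node.new("glue")   -- a glue node, like an interword space
  -- Glue nodes have no 'char' field, so this assignment raises a Lua error.
  -- pcall catches it so the run can continue and the message can be printed.
  local ok, err = pcall(function() g.char = 0x0F0B end)
  texio.write_nl("assignment ok: " .. tostring(ok))
  texio.write_nl("message: " .. tostring(err))
  node.free(g)
}
Glue nodes have no char field.
\end{document}
```

In the failing document no such protection is in place around the assignment, so the error surfaces during hyphenation and the run is aborted.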

jbezos commented 10 months ago

As a workaround, try replacing the following two lines (43-44) in babel-tibetan.tex:

  \babelprehyphenation{tibetan}{^^^^0f0b([^ ^^^^0f0d^^^^0f0e])}%
    { {insert, penalty=10000}, {insert, space=\bbl@tempe, data=1}, {},

with

  \babelprehyphenation{tibetan}{{a}^^^^0f0b([^ ^^^^0f0d^^^^0f0e])}%
    { {}, {insert, penalty=10000}, {insert, space=\bbl@tempe, data=1}, {},

Edit: See the following answer.

jbezos commented 10 months ago

🤔 Not quite correct. Try replacing:

  \babelprehyphenation{tibetan}{^^^^0f0b([^ ^^^^0f0d^^^^0f0e])}%
    { {insert, penalty=10000}, {insert, space=\bbl@tempe, data=1}, {},
      {insert, space=\bbl@tempe, data=1}, {string = {1}} }}

with

  \babelprehyphenation{tibetan}{^^^^0f0b[^ ^^^^0f0d^^^^0f0e]}%
    { {insert, penalty=10000}, {insert, space=\bbl@tempe, data=1}, {},
      {insert, space=\bbl@tempe, data=1}, {} }}
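For readers unfamiliar with the notation, here is a small sketch of how a `\babelprehyphenation` rule is put together, adapted from the style of example given in the babel manual and applied to `english` purely for illustration (the rule itself is an assumption and has nothing to do with Tibetan). The second argument is a Lua pattern; the third is a list with one entry per matched character. As far as I understand the manual, `string = {1...}` writes text derived from the first capture into that position, `remove` deletes the matched character, and an empty `{}` leaves it untouched. That is the difference between the two versions above: the replacement drops the capture and ends with `{}` instead of `{string = {1}}`.

```latex
% Compile with lualatex. Illustrative rule only, not part of babel-tibetan.tex.
\documentclass{article}
\usepackage[english]{babel}
% Pattern: 's' or 'z' (captured as 1) followed by 'h'.
% Replacement: first position  -> capture 1 mapped s->š, z->ž;
%              second position -> removed.
\babelprehyphenation{english}{([sz])h}
  { string = {1|sz|šž}, remove }
\begin{document}
sh zh
\end{document}
```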