Closed by JulienPalard 2 years ago
Thanks for your report. Confirmed on my own build of TeX Live 2020/dev from r51250 on darwin. I guess this is a problem in pTeX, not in pLaTeX.
Consider the following plain pTeX source (test.tex):
\message{ſ}\x
ſ\bye
Compiling this source shows on the terminal:
$ ptex test
This is pTeX, Version 3.14159265-p3.8.2 (utf8.euc) (TeX Live 2020/dev) (preloaded format=ptex)
restricted \write18 enabled.
(./test.tex 顛
! Undefined control sequence.
l.1 \message{^^c5^^bf}\x
? x
The first "ſ" is converted to "顛", and the second "ſ" is converted to "^^c5^^bf". Internally pTeX (more precisely, the built-in library named "ptexenc") converts UTF-8 inputs to EUC-JP or Shift-JIS, so there might be some problem in that conversion.
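The two outputs above are consistent with one reading, offered here only as an illustration and not as a confirmed diagnosis of ptexenc: the UTF-8 encoding of "ſ" is the byte pair C5 BF (matching the `^^c5^^bf` in the log), and that same byte pair, if read as EUC-JP, decodes to "顛". The sketch below uses only Python's standard codecs; the variable names are hypothetical and nothing in it comes from the pTeX sources.

```python
# Illustration only: shows how the bytes of "ſ" look under two encodings.
# This is NOT code from pTeX or ptexenc; variable names are made up.

long_s = "\u017f"  # ſ, U+017F LATIN SMALL LETTER LONG S

# Step 1: the UTF-8 encoding of ſ is two bytes.
utf8_bytes = long_s.encode("utf-8")
print(utf8_bytes.hex())  # c5bf

# Step 2: reinterpret those same two bytes as EUC-JP text.
misread = utf8_bytes.decode("euc_jp")
print(misread)  # 顛
```

If this reading is right, the first "ſ" was passed through as raw bytes and later treated as an EUC-JP character, while the second was left as two unconverted 8-bit bytes.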
Anyway, we will discuss the problem in texjporg/tex-jp-build#80; that repository provides the upstream source of pTeX and ptexenc.
Using:
(from Debian Buster)
with the following test.tex file:
(sha1 starting with b2fa881)
Running:
platex -kanji=utf8 -recorder test.tex
gives me:
Which I doubly don't understand:
顛 is NOT U+C4CF.
And neither character is in my file: my file contains only ASCII and ſ (U+017F LATIN SMALL LETTER LONG S).
So, what did I get wrong? For some context, I'm trying to build the Japanese translation of the CPython re module documentation, which contains both Japanese characters AND ſ in an example.
It works with:
from Ubuntu bionic though.