Switch on UTF8 encoding

dbeyer commented 6 years ago

Since ACMART is used internationally by many people with accents in their names and affiliations, it might be a good idea to switch on utf8 input encoding by default, relieving authors of one additional config line in each of their main.tex files. Is it possible to include the following in the acmart class file? \RequirePackage[utf8]{inputenc}

krono commented 6 years ago

Or even

  \RequirePackage[utf8]{inputenx}
  \input{ix-utf8enc.dfu}

borisveytsman commented 6 years ago

Adding this line will make life marginally better for the authors who use UTF-8 input. However, it will make life tremendously harder for the authors who do not: for example, those who just add an epigraph in Greek and use LGR.

krono commented 6 years ago

Isn't LGR a font encoding, not an input encoding?

(I have a class where precisely because of that issue I use this:

\RequirePackage[LGR,OT1,LY1,T1]{fontenc}
\RequirePackage[utf8]{inputenx}
\input{ix-utf8enc.dfu}
\RequirePackage{alphabeta}

)

borisveytsman commented 6 years ago

Yes, you are right.

OK, suppose you want a Russian citation and use koi8.

krono commented 6 years ago

Hm, is Russian support really so bad with UTF-8 in LaTeX?

borisveytsman commented 6 years ago

Many people use koi8 and cp1251 by tradition.

krono commented 6 years ago

\documentclass[sigconf,russian,english]{acmart}

\RequirePackage[LGR,OT1,LY1,T2A,T1]{fontenc}
\RequirePackage[utf8]{inputenx}
\input{ix-utf8enc.dfu}
\RequirePackage{alphabeta}
\usepackage{babel}
\begin{document}

\foreignlanguage{russian}{я не знаю}
\end{document}

This seems to work for short passages.

krono commented 6 years ago

Also, shouldn't we advocate using only UTF-8? I personally think it is worthwhile.

borisveytsman commented 6 years ago

We have thousands of authors. I am not sure we are in a position to forcefully advocate input encodings: do we want a long holy war among some passionate ones? I would rather give as much freedom to authors as I can.

krono commented 6 years ago

That's true. But with this reasoning, acmart also would have to consider XeTeX and LuaTeX, or even Biblatex instead of BibTeX, right?

borisveytsman commented 6 years ago

Absolutely. This is in my plans.

krono commented 6 years ago

Ping me if I can help :)

borisveytsman commented 6 years ago

Thanks!

zackw commented 6 years ago

Please note that neither inputenc nor inputenx sets the space factor of Unicode close curly quotes correctly. (\sfcode’ is 0, but \sfcode” is 1000. Both should be zero.) Therefore, if you do take the Unicode input plunge, please also add

% inputenc doesn't set the space factor of Unicode close curly quote
% correctly.  (In current versions, the \sfcode of ’ is 0, but the
% \sfcode of ” is still 1000.  Both should be zero.)
\AtBeginDocument{%
  \sfcode\csname\encodingdefault\string\textquotedblright\endcsname=0%
  \sfcode\csname\encodingdefault\string\textquoteright\endcsname=0%
}

to an appropriate place in the class file.
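
One way to inspect the values that this patch targets is a small test document that writes the current space factor codes to the log. The following is only a sketch under stated assumptions: it uses the article class with T1 font encoding rather than acmart, and it relies on the same \csname construction as the snippet above to name the glyph slots. The wording of the log messages is illustrative.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% \encodingdefault expands to T1 here, so each \csname ... \endcsname below
% names the encoding-specific token for the quote glyph, exactly as in the
% \AtBeginDocument patch above.
\typeout{sfcode of the closing single quote slot:
  \the\sfcode\csname\encodingdefault\string\textquoteright\endcsname}
\typeout{sfcode of the closing double quote slot:
  \the\sfcode\csname\encodingdefault\string\textquotedblright\endcsname}
Some text so that a page is produced.
\end{document}
```

A space factor code of 0 leaves the space factor unchanged, so a closing quote that follows a full stop does not cancel the wider sentence-ending space. A code of 1000 resets it, which is the behaviour the patch is meant to remove.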

borisveytsman commented 6 years ago

Shouldn't we press the upstream authors (the L3 team for inputenx) to make changes rather than work around bugs? Could you write to latex-team@latex-project.org?

zackw commented 6 years ago

I wrote them a note and cc:ed you.

rionda commented 2 years ago

Is this question still relevant, given that UTF-8 has been the default input encoding in LaTeX since 2018 (see https://tug.org/TUGboat/tb39-1/tb121ltnews28.pdf)?

borisveytsman commented 1 week ago

This is now obsolete due to changes in the LaTeX kernel.
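
As an illustration of what that kernel change means for authors, a file saved as UTF-8 should compile without any input encoding line. This is a minimal sketch, assuming pdfLaTeX with a LaTeX release from April 2018 or later; the names in the body are made up for the example.

```latex
% Save this file as UTF-8. No \usepackage[utf8]{inputenc} is needed on
% LaTeX releases from 2018-04-01 onwards; on older releases, add that line
% after \documentclass.
\documentclass[sigconf]{acmart}
\begin{document}
Zoë Müller, Université de Montréal
\end{document}
```

Authors who keep their sources in a legacy encoding are not locked out by the new default: declaring that encoding through inputenc in the preamble, for example \usepackage[koi8-r]{inputenc}, still overrides it.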

borisveytsman / acmart

Switch on UTF8 encoding #242