angerhang / mockDoc

Comments about Chinese support #11

Closed jinbozz closed 9 years ago

jinbozz commented 9 years ago

Surprisingly, Chinese text seems to work perfectly well with LaTeXML.

test.tex:

\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
    这是一些简体汉字。% These are some simplified Chinese words, used in mainland China & Singapore.
    這是一些繁體漢字。% These are some traditional Chinese words, used in Taiwan, Hongkong & Macau.
\end{document}

Here \usepackage[utf8]{inputenc} is important for getting the right characters.

test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?latexml searchpaths="/home/la_stravaganza/repos/mockDoc/ChineseSupport"?>
<?latexml class="article"?>
<?latexml package="inputenc" options="utf8"?>
<?latexml RelaxNGSchema="LaTeXML"?>
<document xmlns="http://dlmf.nist.gov/LaTeXML">
  <resource src="LaTeXML.css" type="text/css"/>
  <resource src="ltx-article.css" type="text/css"/>
  <para xml:id="p1">
    <p>这是一些简体汉字。<!-- %These are some simplified Chinese words, used in mainland China & Singapore. -->這是一些繁體漢字。<!-- %These are some traditional Chinese words, used in Taiwan, Hongkong & Macau. --></p>
  </para>
</document>

However, this tex file does not work at all with pdfLaTeX.

To make pdfLaTeX work, more packages and configuration are needed. I did that successfully before on Windows; tomorrow I will try it on Linux.

So that means we can't have a single tex file for both pdfLaTeX and LaTeXML if Chinese characters are involved. Fortunately, the differences are small: only some minor changes at the beginning of the file.

Here is the beginning of a tex file I wrote before:

\documentclass[UTF8,11pt]{article}
\usepackage{ctex}
\begin{document}
...
\end{document}

where ctex stands for Chinese TeX.
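
If a single source file is still wanted, one possible approach is to branch in the preamble on which processor is running. The sketch below is untested against this repository. It assumes that latexml.sty, which ships with LaTeXML and provides the \iflatexml conditional, is on the TeX search path, and that the non-LaTeXML branch is compiled with an engine for which ctex is already set up.

```latex
\documentclass{article}
% Assumption: latexml.sty from the LaTeXML distribution can be found by TeX.
\usepackage{latexml}
\iflatexml
  % LaTeXML only needs to know the input encoding.
  \usepackage[utf8]{inputenc}
\else
  % A regular TeX engine loads ctex for fonts and Chinese typesetting.
  \usepackage[UTF8]{ctex}
\fi
\begin{document}
这是一些简体汉字。
這是一些繁體漢字。
\end{document}
```

This keeps the body identical for both tools and confines the difference to a few preamble lines.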

kohlhase commented 9 years ago

Adding @dginev to the conversation. I think that this is an interesting observation that really belongs on the LaTeXML issue tracker under the heading "make Chinese LaTeX work". My take on this would be that the canonical way (and any non-canonical way) of writing Chinese LaTeX should just work in LaTeXML.

At first sight, it seems that if we make a ctex.sty.ltxml package that just calls \usepackage[utf8]{inputenc}, then your example above should just work. But ctex.sty.ltxml is probably more complex than that, and most probably impossible for Bruce/Deyan/me to write, since we cannot read the documentation at http://mirrors.ctan.org/language/chinese/ctex/ctex.pdf .

I am assigning the issue back to Jinbo. Please open an issue at https://github.com/brucemiller/LaTeXML where you describe the situation in detail, copy @dginev and me into the conversation, and offer your help with the Chinese part. Getting Chinese LaTeXML to work would be a very welcome side effect of your work for me.

kohlhase commented 9 years ago

Oh, and I think that we will probably be able to proceed with the Chinese sTeX entries by following the route of writing our private ctex.sty.ltxml binding for #8.

Actually, we should move the issues about Chinese smglom over to the sTeX repos at https://github.com/KWARC/sTeX/issues . Would you please do that? Then we can discuss specifics there.

dginev commented 9 years ago

I think @La-Stravaganza said things work just fine in LaTeXML; it is just clunky to get it right in the super old pdflatex. If you want native Unicode in TeX, consider using xelatex instead of pdflatex. That is what I compiled my MSc thesis with.
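
As a rough illustration of the xelatex route, a minimal document might look like the sketch below. It uses the xeCJK package rather than ctex; the font name is only an assumption and has to be replaced with a CJK font that is actually installed on the system.

```latex
% Compile with: xelatex test.tex
% xelatex reads UTF-8 input natively, so inputenc is not needed here.
\documentclass{article}
\usepackage{xeCJK}
% Assumption: this font is installed; substitute any available CJK font.
\setCJKmainfont{Noto Serif CJK SC}
\begin{document}
这是一些简体汉字。
這是一些繁體漢字。
\end{document}
```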

jinbozz commented 9 years ago

Michael, @dginev was right. I got xelatex to support Chinese with the proper settings, whereas pdflatex failed badly. You can check the generated PDF and XML in /ChineseSupport.

jinbozz commented 9 years ago

test.xml.log:

(Loading /usr/local/share/perl/5.18.2/LaTeXML/Package/TeX.pool.ltxml...
(Loading /usr/local/share/perl/5.18.2/LaTeXML/Package/eTeX.pool.ltxml... 0.00 sec)
(Loading /usr/local/share/perl/5.18.2/LaTeXML/Package/pdfTeX.pool.ltxml... 0.01 sec)
0.14 sec)
latexmlc (LaTeXML version 0.8.0)
processing started Wed Feb 11 20:22:26 2015

(Digesting TeX test...
(Processing content /home/la_stravaganza/repos/mockDoc/ChineseSupport/test.tex...
(Loading /usr/local/share/perl/5.18.2/LaTeXML/Package/LaTeX.pool.ltxml... 0.13 sec)
(Loading /usr/local/share/perl/5.18.2/LaTeXML/Package/article.cls.ltxml... 0.02 sec)
(Loading /usr/local/share/perl/5.18.2/LaTeXML/Package/inputenc.sty.ltxml...
(Loading /usr/local/share/perl/5.18.2/LaTeXML/Package/utf8.def.ltxml... 0.00 sec)
0.01 sec)
Error:missing_file:ctex Can't find binding for package ctex
    at /home/la_stravaganza/repos/mockDoc/ChineseSupport/test.tex; line 4 col 6
    search paths are /home/la_stravaganza/repos/mockDoc/ChineseSupport
    Next token is T_CS[\begin]
    In \usepackage OptionalSemiverbatim Semi... from LaTeX.pool.ltxml line 543
    <= Core::Stomach[@0x156e1c0]
0.21 sec)
0.22 sec)
(Building...
(Loading compiled schema /usr/local/share/perl/5.18.2/LaTeXML/resources/RelaxNG/LaTeXML.model... 0.01 sec).
0.05 sec)
(Rewriting... 0.00 sec)
(Finalizing... 0.00 sec)
Conversion complete: 1 error; 1 missing file[ctex.sty].
Status:conversion:2

I think a ctex.sty.ltxml binding can fix this problem.

jinbozz commented 9 years ago

About how to write ctex.sty.ltxml, I think I need to talk to you. Maybe tomorrow?

kohlhase commented 9 years ago

I think that ctex.sty does more than just turn on Unicode. It does things that babel would normally do, e.g. customize \today, and there are variant classes as well. It would be good if LaTeXML supported them too. I have no problem with requiring xelatex for PDF generation.
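
To make the \today point concrete, here is a hand-written sketch of the kind of localization meant. It is not ctex's actual implementation, only an illustration of a babel-style redefinition that a binding would also have to reproduce; the font name is again an assumption.

```latex
% Compile with xelatex. Illustrative only, not taken from ctex.sty.
\documentclass{article}
\usepackage{xeCJK}
% Assumption: this font is installed; substitute any available CJK font.
\setCJKmainfont{Noto Serif CJK SC}
% Print the date in year-month-day order with Chinese unit characters.
\renewcommand{\today}{\the\year 年\the\month 月\the\day 日}
\begin{document}
\today
\end{document}
```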

kohlhase commented 9 years ago

@dginev I think that you misunderstand me when you say "just use xelatex". I am happy to do that (not that I write much Chinese LaTeX), but if there is a way of writing Chinese LaTeX with ctex.sty (and the respective classes), it would be very good if LaTeXML were to support it (hence the request for ctex.sty support). And the fact that we have Hang and Jinbo, who understand Chinese and (soon) LaTeXML, seems a resource worth exploiting.