michal-h21 / make4ht

Build system for tex4ht

make4ht does not generate utf8 title tag #123

Closed linchengcs closed 1 year ago

linchengcs commented 1 year ago

For example:

    \documentclass{article}
    \title{中文}
    \begin{document}
    \maketitle
    bla bla
    \end{document}

Compiled with make4ht -ux file.tex, it generates an empty title tag in the HTML file.

michal-h21 commented 1 year ago

TeX4ht in XeTeX mode needs character blocks outside Basic Latin to be declared. CJK languages are declared automatically when you use, for example, the xeCJK package. You can also use the \xeuniuseblock command in a configuration file:

    \Preamble{xhtml}
    \xeuniuseblock{Chinese}
    \begin{document}
    \EndPreamble

But the support is still limited: it doesn't handle Unicode characters inside macros, and \title stores its argument in a macro internally. So it is better to use LuaTeX instead (with the -l option), because it doesn't have these limitations.
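
For context on the remark about macros: in the LaTeX kernel, \title typesets nothing itself. It only saves its argument in the internal macro \@title, which \maketitle reads later (and, presumably, so does whatever fills the HTML title tag). The following is a simplified sketch of that mechanism, not the exact kernel source, and it is meant for a Unicode engine such as XeLaTeX or LuaLaTeX:

```latex
\documentclass{article}
\makeatletter
% Simplified form of the kernel definition: \title only stores its
% argument in the macro \@title; nothing is typeset at this point.
\renewcommand*{\title}[1]{\gdef\@title{#1}}
\makeatother
\title{中文}   % the Chinese characters now live inside \@title
\author{}
\date{}
\begin{document}
\maketitle     % \@title is expanded here
\end{document}
```

Text typed directly in the document body does not take this detour through a macro, which may be why body text and the title behave differently.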

linchengcs commented 1 year ago

Thank you for the response, michal-h21!

I tried both solutions, but I still have the following problems. The first solution uses the configuration file you provided. My test file, test.tex, is:

    \documentclass{article}
    \usepackage{xeCJK}
    \title{中文}
    \date{}
    \author{}
    \begin{document}
    \maketitle
    中文
    \end{document}

The configuration file, make4ht.config, is the one you provided:

    \Preamble{xhtml}
    \xeuniuseblock{Chinese}
    \begin{document}
    \EndPreamble

I compile with make4ht -c make4ht.config -ux test.tex, but the title tag is still empty.

The second option uses LuaTeX. The test.tex file is the same:

    \documentclass{article}
    \usepackage{xeCJK}
    \title{中文}
    \date{}
    \author{}
    \begin{document}
    \maketitle
    中文
    \end{document}

I compiled it with make4ht -lx test.tex, but there are errors:

    [STATUS]  make4ht: Conversion started
    [STATUS]  make4ht: Input file: test.tex
    [ERROR]   htlatex: Compilation errors in the htlatex run
    [ERROR]   htlatex: Filename Line    Message
    [ERROR]   htlatex: ./test.tex   11   You can't use `\relax' after \the.
    [ERROR]   htlatex: /usr/share/texlive/texmf-dist/tex/luatex/luatexja/luatexja-core.sty  180  Package luatexja Error: DVI output is not supported in LuaTeX-ja.
    [ERROR]   htlatex: Compilation errors in the htlatex run
    [ERROR]   htlatex: Filename Line    Message
    [ERROR]   htlatex: ./test.tex   11   You can't use `\relax' after \the.
    [ERROR]   htlatex: /usr/share/texlive/texmf-dist/tex/luatex/luatexja/luatexja-core.sty  180  Package luatexja Error: DVI output is not supported in LuaTeX-ja.
    [ERROR]   htlatex: Compilation errors in the htlatex run
    [ERROR]   htlatex: Filename Line    Message
    [ERROR]   htlatex: ./test.tex   11   You can't use `\relax' after \the.
    [ERROR]   htlatex: /usr/share/texlive/texmf-dist/tex/luatex/luatexja/luatexja-core.sty  180  Package luatexja Error: DVI output is not supported in LuaTeX-ja.
    [STATUS]  make4ht: Conversion finished

This test.tex file compiles fine with the command lualatex test.tex, but not with luatex test.tex.

Any further suggestions, please?

linchengcs commented 1 year ago

I found that the title tags for the section and subsection HTML files are generated, but the one for the top page is not.
Below is the test.tex file:

    \documentclass{article}
    \usepackage{xeCJK}

    \title{中文}
    \date{}
    \author{}

    \begin{document}
    \maketitle
    中文

    \section{中文}  % this title is generated
    中文

    \subsection{中文} % this title is generated
    中文

    \end{document}

I compile with make4ht -ux test.tex "html5, 3". Three HTML files are generated: test.html, testse1.html, and testsu1.html. The first has an empty title tag, and the other two have the correct title tag.

michal-h21 commented 1 year ago

Section titles are handled differently from titles set with \title, so it is possible for them to work. But I got your example to work. I only had to modify the configuration file for xeCJK. Save this file as xecjk-hooks.4ht:

    % xecjk-hooks.4ht (2023-06-06-13:39), generated from tex4ht-4ht.tex
    % Copyright 2020 TeX Users Group
    %
    % This work may be distributed and/or modified under the
    % conditions of the LaTeX Project Public License, either
    % version 1.3c of this license or (at your option) any
    % later version. The latest version of this license is in
    %   http://www.latex-project.org/lppl.txt
    % and version 1.3c or later is part of all distributions
    % of LaTeX version 2005/12/01 or later.
    %
    % This work has the LPPL maintenance status "maintained".
    %
    % The Current Maintainer of this work
    % is the TeX4ht Project <http://tug.org/tex4ht>.
    %
    % If you modify this program, changing the
    % version identification would be appreciated.
    \immediate\write-1{version 2023-06-06-13:39}

    \:dontusepackage{xeCJK}
    \:AtEndOfPackage{%
      \RequirePackage{fontspec}
    }
    \DeclareDocumentCommand\setCJKmainfont{o m o}{}
    \let\setCJKsansfont\setCJKmainfont
    \let\setCJKmonofont\setCJKmainfont

    \DeclareDocumentCommand\setCJKfamilyfont {m o m }{}
    \DeclareDocumentCommand\newCJKfontfamily {o m o m}{\expandafter\gdef\csname #2\endcsname{\relax}}

    \DeclareDocumentCommand\xeCJKsetup{m}{}
    % }
    \AtBeginDocument{%
      \ifdefined\xeuniuseblock%
      \xeuniuseblock{CJK}%
      \fi%
    }

I could compile it using

    $ make4ht -l test.tex
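
To illustrate what the stubs in the hooks file imply, here is a minimal sketch of a document that also selects a CJK font. It assumes that xecjk-hooks.4ht from above is saved next to the document. The font name is a hypothetical placeholder: because the hooks file defines \setCJKmainfont with an empty body, the name should simply be ignored in the TeX4ht run.

```latex
\documentclass{article}
\usepackage{xeCJK}
% With the hooks file in place, \setCJKmainfont takes its arguments
% (optional, mandatory, optional) and does nothing with them, so this
% font name (an assumption; any name would do) is not looked up.
\setCJKmainfont{Noto Serif CJK SC}
\title{中文}
\date{}
\author{}
\begin{document}
\maketitle
中文
\end{document}
```

The same holds for \setCJKsansfont and \setCJKmonofont, which the hooks file aliases to \setCJKmainfont. A regular PDF build with XeLaTeX does not use the hooks file and would still need the font to be installed.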

linchengcs commented 1 year ago

It works! Thank you, michal-h21!

linchengcs commented 1 year ago

closing this issue...