sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/

Unicode math rendering in epub? #1005

Open shimizukawa opened 9 years ago

shimizukawa commented 9 years ago

I want to use the epub builder with math.

I found that mathjax is currently not supported by the epub builder, so I switched to pngmath.

But it doesn't render the Unicode Greek letters that I like to use with mathjax. Is there a workaround?

I tried to make the epub builder use xelatex to build the formula images, but I couldn't find where latex is called in the source code.


shimizukawa commented 9 years ago

From Boris Kheyfets on 2012-09-15 11:27:37+00:00

The investigation led me to the pngmath extension.

In response to the document with this:

{{{ Math test α: :math:`α^2`. }}}

With pngmath enabled {{{make html}}} gives:

{{{
! Undefined control sequence.
\u-default-945 #1->\textalpha

l.13 $α ^2$
1 ) (see the transcript file for additional information)
Output written on math.dvi (1 page, 208 bytes).
Transcript written on math.log.
}}}

But one can modify the {{{DOC_HEAD}}} of {{{pngmath.py}}} to contain also:

{{{
\usepackage{newunicodechar}
\newunicodechar{α}{\alpha}
\newunicodechar{β}{\beta}
\newunicodechar{γ}{\gamma}
\newunicodechar{δ}{\delta}
\newunicodechar{ε}{\varepsilon}
\newunicodechar{ζ}{\zeta}
\newunicodechar{η}{\eta}
\newunicodechar{θ}{\theta}
\newunicodechar{ι}{\iota}
\newunicodechar{κ}{\kappa}
\newunicodechar{ϰ}{\varkappa}
\newunicodechar{λ}{\lambda}
\newunicodechar{μ}{\mu}
\newunicodechar{ν}{\nu}
\newunicodechar{ξ}{\xi}
\newunicodechar{π}{\pi}
\newunicodechar{ϱ}{\varrho}
\newunicodechar{ρ}{\rho}
\newunicodechar{σ}{\sigma}
\newunicodechar{τ}{\tau}
\newunicodechar{υ}{\upsilon}
\newunicodechar{φ}{\varphi}
\newunicodechar{ϕ}{\phi}
\newunicodechar{χ}{\chi}
\newunicodechar{ψ}{\psi}
\newunicodechar{ω}{\omega}
}}}
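
Such a list does not have to be typed by hand. As a minimal sketch (the GREEK_TO_MACRO table and the newunicodechar_preamble helper are hypothetical names, not part of Sphinx or pngmath, and only a few letters are listed), the preamble lines could be generated from a mapping:

```python
# Hypothetical helper: builds \newunicodechar lines from a character -> macro table.
# Only a few letters are listed here; extend the table as needed.
GREEK_TO_MACRO = {
    "α": r"\alpha",
    "β": r"\beta",
    "γ": r"\gamma",
    "ε": r"\varepsilon",
    "υ": r"\upsilon",
}


def newunicodechar_preamble(mapping):
    """Return LaTeX preamble text declaring each character via newunicodechar."""
    lines = [r"\usepackage{newunicodechar}"]
    for char, macro in mapping.items():
        lines.append(r"\newunicodechar{%s}{%s}" % (char, macro))
    return "\n".join(lines)


print(newunicodechar_preamble(GREEK_TO_MACRO))
```

The resulting text would still have to be pasted into DOC_HEAD or an equivalent preamble by hand.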

Then one can get a little further during compilation:

{{{
Exception occurred:
  File "/usr/lib/python2.7/dist-packages/sphinx/builders/html.py", line 419, in write_doc
    self.docwriter.write(doctree, destination)
  File "/usr/local/lib/python2.7/dist-packages/docutils/writers/__init__.py", line 80, in write
    self.translate()
  File "/usr/lib/python2.7/dist-packages/sphinx/writers/html.py", line 38, in translate
    self.document.walkabout(visitor)
  File "/usr/local/lib/python2.7/dist-packages/docutils/nodes.py", line 173, in walkabout
    if child.walkabout(visitor):
  File "/usr/local/lib/python2.7/dist-packages/docutils/nodes.py", line 173, in walkabout
    if child.walkabout(visitor):
  File "/usr/local/lib/python2.7/dist-packages/docutils/nodes.py", line 173, in walkabout
    if child.walkabout(visitor):
  File "/usr/local/lib/python2.7/dist-packages/docutils/nodes.py", line 165, in walkabout
    visitor.dispatch_visit(self)
  File "/usr/local/lib/python2.7/dist-packages/docutils/nodes.py", line 1611, in dispatch_visit
    return method(node)
  File "/usr/lib/python2.7/dist-packages/sphinx/ext/pngmath.py", line 217, in html_visit_math
    fname, depth = render_math(self, '$'+node['latex']+'$')
  File "/usr/lib/python2.7/dist-packages/sphinx/ext/pngmath.py", line 121, in render_math
    latex += (use_preview and DOC_BODY_PREVIEW or DOC_BODY) % math
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 223: ordinal not in range(128)
}}}

So now the problem seems to be somewhere in docutils being unable to process UTF-8.
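
A note on the error itself: the byte 0xce in the message is the first byte of the UTF-8 encoding of α, and the last frame of the traceback is in pngmath.py rather than in docutils. One plausible reading, offered here as an assumption, is that Python 2 implicitly decoded the byte string holding the modified DOC_HEAD as ASCII when it was combined with a unicode string. The sketch below reproduces the same kind of failure explicitly in Python 3; it does not call Sphinx, and the variable names are illustrative.

```python
# Illustrative only: decode UTF-8 bytes with the ASCII codec, which is what
# Python 2 did implicitly when a byte string was combined with a unicode string.
head = r"\newunicodechar{α}{\alpha}".encode("utf-8")

try:
    head.decode("ascii")
except UnicodeDecodeError as exc:
    failing_byte = head[exc.start]
    print("cannot decode byte", hex(failing_byte), "at position", exc.start)

# Decoding with the matching codec succeeds.
print(head.decode("utf-8"))
```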

shibukawa commented 7 years ago

pngmath is an old extension; imgmath is its replacement. See #3625 for the latest status of this issue.

tk0miya commented 7 years ago

It seems sphinx.ext.imgmath has the same problem. I think this is not resolved yet.

Note: #3625 is a different issue from this one.

tk0miya commented 7 years ago

@jfbu could you give us your advice on generating an equation image from TeX markup? Currently the imgmath extension uses a combination of latex and dvipng (or dvisvgm).

It generates the following .tex code internally:

\documentclass[12pt]{article}
\usepackage[utf8x]{inputenc}
\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{amsfonts}
\usepackage{anyfontsize}
\usepackage{bm}
\pagestyle{empty}
\begin{document}
\fontsize{%d}{%d}\selectfont %s
\end{document}

As commented above, the newunicodechar package certainly resolves this situation. But I feel the character map is a bit ugly. Is there a better way to compile UTF-8 equations to images?
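
For reference, the three placeholders in the \fontsize line of the template are filled with Python % formatting. The sketch below mirrors the expression visible in the render_math context of the patch later in this thread (font size, then 1.2 times the font size, then the math source). The template is abbreviated here, fill_body is a hypothetical name, and the default size of 12 is an assumed value.

```python
# Abbreviated copy of the body part of the template quoted above.
DOC_BODY = r"""
\begin{document}
\fontsize{%d}{%d}\selectfont %s
\end{document}
"""


def fill_body(math, font_size=12):
    """Fill the template: font size, 1.2 x font size (rounded), math source."""
    return DOC_BODY % (font_size, int(round(font_size * 1.2)), math)


print(fill_body("$α^2$"))
```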

tk0miya commented 7 years ago

Note: I succeeded in rendering an equation containing the "α" character to an image with the following patch:

diff --git a/sphinx/ext/imgmath.py b/sphinx/ext/imgmath.py
index d57502fc3..0d36c120b 100644
--- a/sphinx/ext/imgmath.py
+++ b/sphinx/ext/imgmath.py
@@ -43,7 +43,7 @@ class MathExtError(SphinxError):

 DOC_HEAD = r'''
 \documentclass[12pt]{article}
-\usepackage[utf8x]{inputenc}
+\usepackage[utf8]{inputenc}
 \usepackage{amsmath}
 \usepackage{amsthm}
 \usepackage{amssymb}
@@ -51,6 +51,8 @@ DOC_HEAD = r'''
 \usepackage{anyfontsize}
 \usepackage{bm}
 \pagestyle{empty}
+\usepackage{newunicodechar}
+\newunicodechar{α}{\alpha}
 '''

 DOC_BODY = r'''
@@ -92,7 +94,7 @@ def render_math(self, math):

     font_size = self.builder.config.imgmath_font_size
     use_preview = self.builder.config.imgmath_use_preview
-    latex = DOC_HEAD + self.builder.config.imgmath_latex_preamble
+    latex = DOC_HEAD.decode('utf-8') + self.builder.config.imgmath_latex_preamble
     latex += (use_preview and DOC_BODY_PREVIEW or DOC_BODY) % (
         font_size, int(round(font_size * 1.2)), math)
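
The patch above edits Sphinx itself. Since the diff shows imgmath_latex_preamble being appended right after DOC_HEAD, a project might try the same declarations from its own conf.py instead. This is an untested sketch: whether it works depends on the inputenc option used by the installed Sphinx version (newunicodechar may not cooperate with utf8x), and under Python 2 the unicode handling touched by the patch could still matter.

```python
# conf.py (sketch; see the assumptions stated above)
extensions = ["sphinx.ext.imgmath"]

# Appended after Sphinx's built-in preamble when each formula is rendered.
imgmath_latex_preamble = r"""
\usepackage{newunicodechar}
\newunicodechar{α}{\alpha}
\newunicodechar{β}{\beta}
"""
```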

jfbu commented 7 years ago

If the matter is Greek letters, one can achieve that by loading package alphabeta:

% -*- coding: utf-8; -*-
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{alphabeta}
\begin{document}

\[ \tan ß = \frac{\sin ß}{\cos ß}\]

\[ μλ = δχ \]

\thispagestyle{empty}
\end{document}

produces via latex + dvipng this image: [image: tempgreek1]

(I use dvipng -Ttight -D150 foo.dvi and option -bgTransparent for transparent images)
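
As a rough sketch, that two-step pipeline could be driven from Python with subprocess. The dvipng flags are copied verbatim from the comment above; the -interaction=nonstopmode flag for latex, the helper names, and the assumption that the commands run in the directory containing the .tex file are all illustrative choices.

```python
import subprocess
from pathlib import Path


def build_commands(tex_file, transparent=False):
    """Return the latex and dvipng command lines for one .tex file."""
    dvi_file = str(Path(tex_file).with_suffix(".dvi"))
    latex_cmd = ["latex", "-interaction=nonstopmode", tex_file]
    dvipng_cmd = ["dvipng", "-Ttight", "-D150"]
    if transparent:
        dvipng_cmd.append("-bgTransparent")
    dvipng_cmd.append(dvi_file)
    return [latex_cmd, dvipng_cmd]


def render(tex_file, transparent=False):
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    for cmd in build_commands(tex_file, transparent):
        subprocess.run(cmd, check=True)
```

Calling render("foo.tex", transparent=True) would require a TeX installation that provides both latex and dvipng.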

Otherwise, for more complete Unicode support, I guess one should go to either xetex with mathspec (but then dvipng cannot be used, although ghostscript can convert from pdf to png), xetex with unicode-math, or lualatex with unicode-math. With the latter solution, a dvi can be produced, which can then be processed by dvipng or other tools. But I am not very familiar with unicode-math. It brings a lot of overhead because it does many things.

I will look for alternatives; personally I am so used to macros that I was never tempted by Unicode literals in math mode.

jfbu commented 7 years ago

Apologies, I used the wrong Unicode letter for beta in my previous comment. It should be:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{alphabeta}
\begin{document}
\[ \tan β = \frac{\sin β}{\cos β}\]
\[ μλ = δχ \]
\end{document}

and produces [image: tempgreek-1]

But this method handles only Greek letters. Also, it is currently not compatible with utf8x (again the problem of \DeclareUnicodeCharacter not having the same interface with utf8 as with utf8x).

jfbu commented 7 years ago

With the alphabeta package and pdflatex, one should drop the escaping of Greek letters done in https://github.com/sphinx-doc/sphinx/blob/master/sphinx/util/texescape.py. Indeed, in text mode the Greek letters will then be rendered by the LGR-encoded Computer Modern font (upright letters). In math, Sphinx does no escaping anyhow. Then alphabeta maps the Greek letters to the corresponding macros.

When using xelatex or lualatex, one still should not do any escaping of Greek letters. The user will need to use a font that has Greek letters, that's all. For use in math mode, the package unicode-math should be loaded, as it does the needed work. This package is for using an OpenType math font with xelatex or lualatex, but as part of this it does the mapping from Greek letters to the traditional Greek symbols of TeX.

When using platex I don't know.
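
The advice in this comment can be condensed into a small lookup. This is only a sketch of the reasoning above, not Sphinx code: math_preamble is a hypothetical helper, and platex is deliberately left unhandled since no recommendation is given for it.

```python
def math_preamble(engine):
    """Return preamble lines that let Greek Unicode letters work in math mode."""
    if engine == "pdflatex":
        # alphabeta maps Greek letters to the corresponding macros.
        return [r"\usepackage[utf8]{inputenc}", r"\usepackage{alphabeta}"]
    if engine in ("xelatex", "lualatex"):
        # unicode-math does the mapping as part of its OpenType math font support.
        return [r"\usepackage{unicode-math}"]
    raise ValueError("no recommendation for engine %r" % engine)
```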