theasder opened this issue 6 years ago
See my answer here: https://stackoverflow.com/a/49582428/2372611
The problem is that Jupyter uses the xelatex command to compile the LaTeX (to support Unicode, I think). But there is no need for xelatex here: the generated file can be compiled directly with latex or pdflatex with Unicode support. I think the generated file does not have the configuration xelatex needs to handle Unicode characters.
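For reference, compiling the generated file with pdflatex usually only needs the classic UTF-8 preamble lines (a sketch, not the exact preamble nbconvert emits):

```latex
% Sketch: preamble lines that let pdflatex handle UTF-8 input.
% Since the April 2018 LaTeX release, utf8 is already the default
% input encoding, so the inputenc line is optional there.
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
```

This only helps for characters the chosen 8-bit fonts actually contain, which is the coverage problem discussed below.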
Can someone please produce a minimal example notebook.ipynb and/or provide a copy of the LaTeX output from:
jupyter nbconvert --to latex notebook.ipynb
If I have a file to work with, then I can investigate this.
I believe this is relevant. The April 2018 release of LaTeX defaults to utf-8 encoding.
If I can get a file and replicate the issue, I may be able to solve this.
I investigated the problem a bit, and the main issue does not seem to stem from UTF-8 not being recognized correctly. The actual problem is that the main font does not have the corresponding glyphs.
Jupyter uses the mathpazo package to load URW Palladio, but that font does not cover many scripts. Using DejaVu Sans instead, which covers a wide range of Unicode scripts, fixed the problem for me (it still does not cover cases like RTL languages, but that's another problem).
The problem is that DejaVu Sans is not exactly pretty, and this would affect all documents, even those that don't use non-Latin scripts.
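The DejaVu workaround can be sketched as a small fontspec preamble (a sketch for a xelatex run; the exact font names assume the DejaVu family is installed and visible to fontspec):

```latex
% Sketch: swap the document fonts for DejaVu under xelatex.
\usepackage{fontspec}
\setmainfont{DejaVu Sans}      % wide Unicode coverage for body text
\setmonofont{DejaVu Sans Mono} % covers Greek/Cyrillic in code cells too
```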
A possible solution seems to be the ucharclasses package, which allows defining separate fonts for different Unicode blocks. That way, the main (Latin) font could be left as it is, only specifying fallback fonts for other scripts.
The Noto fonts might be a viable set of fonts for non-latin blocks.
I just spent some time exploring ucharclasses, and I believe that I can make this work.
If I added the following:
\usepackage[Latin,Greek]{ucharclasses}
\usepackage{fontspec}
\newfontfamily{\mynormal}{Latin Modern Roman}
\setDefaultTransitions{\mynormal}{}
\newfontfamily{\mygreek}{Courier New}
\setTransitionsForGreek{\mygreek}{}
to the bottom of the preamble (it messes with section titles if I put \usepackage{fontspec} any earlier), then symbols like θα work properly. I see no reason why this wouldn't work for other Unicode blocks. We just need to agree on fonts for the different blocks. I would also like to figure out how to store the default font instead of overriding it.
This will also only work in XeLaTeX, so it would be smart to wrap it in something like
\ifdefined\XeLaTeXonlycommand
...
\fi
This way it is still possible to compile the latex file using pdflatex.
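One way to make that conditional concrete (a sketch; \XeLaTeXonlycommand above is a placeholder, and \XeTeXrevision is a XeTeX-only primitive, so this test needs no extra package — the iftex package's \ifXeTeX would work as well):

```latex
% Sketch: only load the ucharclasses setup when compiling with XeLaTeX,
% so the same .tex file still compiles under pdflatex.
\ifdefined\XeTeXrevision
  \usepackage[Latin,Greek]{ucharclasses}
  \usepackage{fontspec}
  \newfontfamily{\mynormal}{Latin Modern Roman}
  \setDefaultTransitions{\mynormal}{}
  \newfontfamily{\mygreek}{Courier New}
  \setTransitionsForGreek{\mygreek}{}
\fi
```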
CC @mpacer
I just noticed an issue with this solution. \setDefaultTransitions{\mynormal}{} will switch to a non-monospaced font for any Latin characters, including inside cell inputs/outputs. This could be changed to \setDefaultTransitions{\ifcell\somemonofont\else\mynormal\fi}{}, but then every single verbatim environment would need to be wrapped with \celltrue … \cellfalse, where cell is defined by \newif\ifcell.
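That conditional could be sketched like this (\somemonofont is a placeholder from the comment above, stood in here by a hypothetical \newfontfamily declaration; assumes XeLaTeX plus the ucharclasses setup):

```latex
% Sketch of the per-cell font switch.
\newif\ifcell  % defines \ifcell, \celltrue, \cellfalse (initially false)
\newfontfamily{\somemonofont}{Latin Modern Mono} % hypothetical mono choice
\newfontfamily{\mynormal}{Latin Modern Roman}
\setDefaultTransitions{\ifcell\somemonofont\else\mynormal\fi}{}

% Every generated verbatim block would then need wrapping:
\celltrue
\begin{verbatim}
print("code cell")
\end{verbatim}
\cellfalse
```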
When using xelatex (or lualatex), the preamble correctly loads the unicode-math package, but that is the only font setting it has. The OpenType fonts loaded by unicode-math are good for the math part, but are limited in other respects; in particular, Latin Modern Mono does not include Greek or Cyrillic.
If we replace \usepackage{unicode-math} with \usepackage[default]{fontsetup}, we get all the goodness of unicode-math (because fontsetup loads unicode-math), but we also get the full CM Unicode fonts for all of the text (including monospaced), which include Greek and Cyrillic.
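As a preamble change, that suggestion is just the following (a sketch; fontsetup requires xelatex or lualatex):

```latex
% Before: only the math fonts are Unicode-aware.
% \usepackage{unicode-math}

% After: fontsetup loads unicode-math itself and also sets the text
% fonts (including the monospaced one) to the CM Unicode family,
% which covers Greek and Cyrillic.
\usepackage[default]{fontsetup}
```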
See my StackExchange answer for more detail.
Hey there,
I have Jupyter Notebook from Anaconda on Ubuntu 16.04 with XeTeX installed. I tried to convert a notebook to PDF: it successfully rendered all English words and formulas, but it ignored some UTF-8 symbols. Some logs here:
It generated a valid tex file, but with no multilanguage support in it.