jgm / pandoc

Universal markup converter
https://pandoc.org

Error in the LaTeX to DOCX converter {\color{incolor}In [{\color{incolor}}]:} #1907

Closed alkhwarizmi closed 6 years ago

alkhwarizmi commented 9 years ago

After converting a LaTeX document (generated automatically by the IPython notebook exporter), the LaTeX sections containing the original Python code appear to be left unconverted in the output.

The smallest code snippet I found in the document is the following:

{\color{incolor}In [{\color{incolor}}]:}

I think the problem could be that the filter does not yet have an infrastructure to manage colors. Until color support is implemented, everyone using pandoc to convert IPython notebooks to docx documents would be better off getting the code in black with a different (fixed-width) style.

I can provide larger code snippets for testing if required.

Also, I have a LaTeX equation which is not translated correctly in the docx output. I cannot see why the formula is translated incorrectly. Should I file a separate bug for that?

Thank you for your work, Pandoc is great!

mpickering commented 9 years ago

Can you please provide a slightly bigger code snippet? Just to check, your workflow is IPython -> LaTeX -> Pandoc -> Docx?

alkhwarizmi commented 9 years ago

Thank you mpickering!

I confirm that my workflow is IPython notebook (.ipynb) -> ipython nbconvert -> LaTeX document (.tex) -> Pandoc -> Docx document (.docx).

I prepared a smaller version of my document, and I am attaching the LaTeX source and the docx result. I have annotated the docx file to mark the problems, summarized below:

1) The first equation does not read correctly
2) The python code is not readable
3) The picture is not inserted

P.S.: I am sending you the material via email, as I don't understand how to attach things on this platform.

alkhwarizmi commented 9 years ago

I forgot to mention some of the technical details.

The exact command lines given to produce the output are:

ipython nbconvert DebugConversion.ipynb --to latex --post PDF
pandoc -f latex -t docx -o DebugConversion_flatex.docx DebugConversion.tex

The program versions are as follows:

ipython nbconvert --version
2.2.0

pandoc --version
pandoc 1.13.1
Compiled with texmath 0.8, highlighting-kate 0.5.8.5.
Syntax highlighting is supported for the following languages:
actionscript, ada, apache, asn1, asp, awk, bash, bibtex, boo, c, changelog, clojure, cmake, coffee, coldfusion, commonlisp, cpp, cs, css, curry, d, diff, djangotemplate, doxygen, doxygenlua, dtd, eiffel, email, erlang, fortran, fsharp, gcc, gnuassembler, go, haskell, haxe, html, ini, isocpp, java, javadoc, javascript, json, jsp, julia, latex, lex, literatecurry, literatehaskell, lua, makefile, mandoc, markdown, matlab, maxima, metafont, mips, modelines, modula2, modula3, monobasic, nasm, noweb, objectivec, objectivecpp, ocaml, octave, pascal, perl, php, pike, postscript, prolog, pure, python, r, relaxngcompact, restructuredtext, rhtml, roff, ruby, rust, scala, scheme, sci, sed, sgml, sql, sqlmysql, sqlpostgresql, tcl, texinfo, verilog, vhdl, xml, xorg, xslt, xul, yacc, yaml
Default user data directory: C:\Users\asoppelsa\AppData\Roaming\pandoc
Copyright (C) 2006-2014 John MacFarlane
Web: http://johnmacfarlane.net/pandoc
This is free software; see the source for copying conditions. There is no warranty, not even for merchantability or fitness for a particular purpose.

I am running everything on Windows 7.

Cheers

mpickering commented 9 years ago

I didn't receive an email, could you please paste the latex source in a gist and post the link here?

alkhwarizmi commented 9 years ago

Dear Matthew,

I will try again to send the material via e-mail. I will also do what you ask.

Cheers

On 2 February 2015 at 20:39, Matthew Pickering notifications@github.com wrote:

I didn't receive an email, could you please paste the latex source in a gist https://gist.github.com/ and post the link here?

alkhwarizmi commented 9 years ago

Here is the gist link:

https://gist.github.com/alkhwarizmi/80cb7ba3524c61765c32

And here is a google drive link to get the pdf and docx document as well:

https://drive.google.com/folderview?id=0B9GtA85Qi8QYX3pkQkl1ekhhN1U&usp=sharing

Hope it's enough. Thank you!

jgm commented 9 years ago

On closer examination: the issue is not with \color but with the use of

 \begin{Verbatim}[commandchars=\\\{\}]

which is a special fancyvrb environment that allows commands to be used inside a verbatim environment. Pandoc can't really handle this, because it doesn't allow formatting inside code blocks, and doesn't purport to handle fancyvrb environments.
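
As a possible workaround for the limitation described above, the .tex file could be pre-processed before it is handed to pandoc. The following is only a minimal sketch, not something pandoc or nbconvert provides: the function name strip_fancyvrb_commands and the regular expressions are made up for illustration. It assumes the only markup inside the Verbatim blocks is \color{...}, non-nested \PY{token}{text} and grouping braces, and it does not handle other escapes the exporter may emit.

```python
import re

# The fancyvrb opening line quoted above, and its matching close.
OPEN = r"\begin{Verbatim}[commandchars=\\\{\}]"
CLOSE = r"\end{Verbatim}"
BLOCK = re.compile(re.escape(OPEN) + r"(.*?)" + re.escape(CLOSE), re.DOTALL)

# Assumed markup inside the block (hypothetical, simplified patterns).
COLOR = re.compile(r"\\color\{[^{}]*\}")               # \color{name}
PY_TOKEN = re.compile(r"\\PY\{[^{}]*\}\{([^{}]*)\}")   # \PY{token}{text}
GROUP_BRACE = re.compile(r"(?<!\\)[{}]")               # unescaped { or }


def _plain(match):
    """Turn one fancyvrb block into a plain verbatim block."""
    body = COLOR.sub("", match.group(1))      # drop colour switches
    body = PY_TOKEN.sub(r"\1", body)          # keep only the token text
    body = GROUP_BRACE.sub("", body)          # drop leftover grouping braces
    return "\\begin{verbatim}" + body + "\\end{verbatim}"


def strip_fancyvrb_commands(tex):
    """Rewrite every commandchars Verbatim block in tex as plain verbatim."""
    return BLOCK.sub(_plain, tex)


if __name__ == "__main__":
    sample = (
        OPEN + "\n"
        + r"{\color{incolor}In [{\color{incolor}}]:} \PY{k}{print}\PY{p}{(}\PY{l+m+mi}{1}\PY{p}{)}"
        + "\n" + CLOSE
    )
    print(strip_fancyvrb_commands(sample))
```

The cleaned text could then be written to a new .tex file and converted with the same pandoc -f latex -t docx command as before; the code would come out uncoloured, which matches the fallback requested in the original report.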

DRuffer commented 9 years ago

Hmm, I seem to be getting this same issue, and according to http://pandoc.org/README.html, you do purport to handle fancyvrb, under the "Creating a PDF" section. Since I'm using MikTeX on Win10, I'm not sure how to accomplish the other suggestion I found to use xelatex.

So, where does this issue stand at this point?

jgm commented 9 years ago

Pandoc uses fancyvrb on the output side (creating latex/pdf). This issue concerns latex on the input side.

DRuffer commented 9 years ago

I'm using ipython notebook, so I'm not sure I care what "side" the problem is on. I just want a working solution!

mb21 commented 6 years ago

I was just about to close this, as it's a really old issue. But when I run the above gist, pandoc 2.0.5 consumes more and more memory until it's killed. The problematic part is the following snippet; pipe it to pandoc -f latex -t native to reproduce:

\documentclass{article}

\def\PY@reset{\let\PY@it=\relax \let\PY@bf=\relax%
    \let\PY@ul=\relax \let\PY@tc=\relax%
    \let\PY@bc=\relax \let\PY@ff=\relax}
\def\PY@tok#1{\csname PY@tok@#1\endcsname}
\def\PY@toks#1+{\ifx\relax#1\empty\else%
    \PY@tok{#1}\expandafter\PY@toks\fi}
\def\PY@do#1{\PY@bc{\PY@tc{\PY@ul{%
    \PY@it{\PY@bf{\PY@ff{#1}}}}}}}
\def\PY#1#2{\PY@reset\PY@toks#1+\relax+\PY@do{#2}}
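
When reproducing a runaway conversion like this one, it may help to run the command under a time limit so it is stopped before it exhausts memory. The sketch below is only an illustration: run_with_timeout and PANDOC_CMD are hypothetical names, the 30-second default is an arbitrary assumption, and a timeout bounds run time rather than memory use directly.

```python
import subprocess
import sys

# The command from the comment above; assumes pandoc is on PATH.
PANDOC_CMD = ["pandoc", "-f", "latex", "-t", "native"]


def run_with_timeout(cmd, stdin_text, timeout_s=30.0):
    """Run cmd with stdin_text on stdin.

    Returns the captured stdout, or None if the process did not finish
    within timeout_s seconds (subprocess.run kills it in that case).
    """
    try:
        result = subprocess.run(
            cmd,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout
```

With the snippet saved to a string, run_with_timeout(PANDOC_CMD, snippet) would return the native output on a version that parses it, and None on a version that keeps running past the limit.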

mb21 commented 6 years ago

The snippet above parses successfully now...