Closed pharpend closed 9 years ago
It would help if you gave the version of pandoc you're using, and a complete input sample we could use to reproduce this.
+++ Peter Harpending [Apr 24 15 22:03 ]:
I believe this is a resurfacing of #1866 .
$ pandoc lysa.ltx -f latex -t epub -o lysa.epub
pandoc: Error at "input" (line 409, column 39):
unexpected "{"
expecting letter or lf new-line
\ch{Answers to the exercises}
^
I get the same thing when trying to convert to HTML or Markdown.
Oops, sorry.
Pandoc 1.13.2.1
There is a whole suite of imported files in the git repo, but the root file is:
https://github.com/learnyou/lysa/blob/master/en/book/lysa.ltx
I can't reproduce this with the dev version (which will be 1.14), so I suspect this has already been fixed. You might try compiling pandoc from source.
I can, albeit with a different error:
pandoc lysa.ltx -f latex -t epub -o lysa.epub
pandoc:
Error at "5-more-sets.ltx" (line 59, column 65):
unexpected "}"
expecting "[", "{", "\\", "=" or digit
\addbibresource{lysa.bib}
Here is the bibliography file: https://github.com/learnyou/lysa/blob/master/en/book/bibliographies/lysa.bib
And here is the latex file about which pandoc is complaining: https://github.com/learnyou/lysa/blob/master/en/book/chapters/5-more-sets.ltx
I can also confirm that it compiles fine with 1.14 on Mac & Windows 8.1.
@pharpend I cannot find \addbibresource{lysa.bib} in https://github.com/learnyou/lysa/blob/master/en/book/chapters/5-more-sets.ltx.
@pharpend 5-more-sets.ltx by itself compiles fine; the error is coming when it is included as part of lysa.ltx. I'll investigate further later.
@nkalvi Yes, I should have been more clear.
I'm on Arch Linux, if that matters.
I'd like to point out that this is not valid LaTeX, so Pandoc is fine. Error messages do print the wrong line if the error happens in an included file, though, so we'll have to look into that.
As for problem at hand, consider https://github.com/learnyou/lysa/blob/master/en/book/lysa.ltx#L246
Here, a \newcommand is defined without arguments, but obviously later used with a single argument. This is not how \newcommand is supposed to work. It should be \let, but I think Pandoc does not support \let expressions atm. An easy fix for this is as follows:
\newcommand{\inclgraph}[1]{\includegraphics[width=0.8\textwidth]{#1}}
P.S. Bear in mind that #1866 is fixed in master (a.k.a. 1.14), but not in 1.13.2.1, so the latter won't parse this regardless.
@lierdakil Ah stupid me and my Haskell background. I'll fix that, see if this works.
BTW, my file uses a ton of \let declarations already, so if pandoc doesn't support \let, then that's another problem entirely.
Well, Pandoc will parse \let expressions as raw TeX commands (with the --parse-raw option), so writing a simple find-and-replace filter should not be too hard, if you're somewhat familiar with Haskell (math expressions are represented verbatim in the AST though, so some string manipulation will be necessary). You could probably also replace most if not all uses of \let with \newcommand, since \let is TeX anyway and is all but deprecated in LaTeX, afaik.
Bear in mind that Pandoc has limited support for LaTeX, so there may be other problems when converting somewhat complicated LaTeX documents (and since LaTeX is a Turing-complete language, full support would be extremely hard to implement or even fathom).
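As a rough illustration of the \let-to-\newcommand replacement suggested above, here is a minimal sketch. The alias \sect and its target \section are hypothetical stand-ins, not taken from the book's source. Note that the one-argument wrapper drops the starred and optional-argument forms that a \let alias would inherit, so it is not a drop-in replacement in every case.

```latex
\documentclass{article}

% TeX-primitive alias: \sect would get the current meaning of \section.
% (\sect is a hypothetical name used only for this sketch.)
% \let\sect\section

% LaTeX-level replacement: the argument is declared explicitly, in the
% same style as the \inclgraph fix suggested earlier in this thread.
\newcommand{\sect}[1]{\section{#1}}

\begin{document}
\sect{Answers to the exercises}
\end{document}
```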
IIRC, \let uses a somewhat different macro-expansion algorithm than \newcommand. Thus, there are some cases where I actually do want \let instead of \newcommand.
At least, that's what whoever added the first of those \let declarations told me.
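For readers unsure what the difference amounts to, here is a small sketch. The macro names \greeting, \snapshot and \viaalias are made up for illustration. \let copies the meaning a macro has at the moment of the assignment, while a \newcommand wrapper looks the macro up again every time it is used.

```latex
\documentclass{article}
\begin{document}
\newcommand{\greeting}{Hello}
\let\snapshot\greeting            % copies the meaning \greeting has right now
\newcommand{\viaalias}{\greeting} % looks \greeting up each time it is used
\renewcommand{\greeting}{Goodbye}

% \snapshot still carries the old definition; \viaalias sees the new one.
\snapshot{} and \viaalias
\end{document}
```

Compiled with LaTeX, the body typesets "Hello and Goodbye". This is the kind of case where swapping \let for \newcommand changes behavior.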
Okay, I improved my build script quite significantly, so testing this should be easier.
@lierdakil Even if I fix the \newcommand issue you pointed out, this happens:
$ ./lysabuild --clean
$ ./lysabuild --sandbox >& /dev/null
$ cd tmp
$ pandoc lysa.ltx -f latex -t epub -o lysa.epub
pandoc:
Error at "input" (line 409, column 39):
unexpected "{"
expecting letter or lf new-line
\tableofcontents
^
It appears to me that the reported location has nothing, or very little, to do with the actual error, seeing as line 409 is
\ch{Answers to the exercises}
and the \tableofcontents line doesn't have trailing whitespace.
livid@livid /tmp/test/lysa/en/book/tmp $ ~/work/pandoc/dist/build/pandoc/pandoc -v
pandoc 1.14
Compiled with texmath 0.8.1, highlighting-kate 0.5.14.
...
livid@livid /tmp/test/lysa/en/book/tmp $ git rev-parse HEAD
33624f9e1f779afd1e2a05e180556c2b07286b82
livid@livid /tmp/test/lysa/en/book/tmp $ TEXINPUTS="." ~/work/pandoc/dist/build/pandoc/pandoc -f latex -t epub3 -o /tmp/lysa.epub lysa.ltx
pandoc: Could not find media `{nq-bijection.png}', skipping...
pandoc: Could not find media `{nq-bijection-naive.png}', skipping...
pandoc: Could not find media `{nq-bijection-nolines.png}', skipping...
pandoc: Could not find media `{nz-bijection-joined.png}', skipping...
pandoc: Could not find media `{nz-bijection.png}', skipping...
pandoc: Could not find media `{x-squared-curve.png}', skipping...
pandoc: Could not find media `{VectorGraph2.png}', skipping...
pandoc: Could not find media `{VectorGraph1.png}', skipping...
Note that I have to define TEXINPUTS only because my system has it set. If TEXINPUTS is not set, pandoc defaults to ".".
Do confirm that you're using master pandoc, and not 1.13.2.1. As I said, 1.13.2.1 will not parse this regardless, due to #1866.
@lierdakil With regard to the files-not-found, that's my fault for not being more verbose. See my edits to the original post.
I was using 1.13.2, apparently. I installed pandoc from master earlier in the day, but some other package I was working on uninstalled that, and instead installed 1.13.2.
headdesk
Okay, now I get this error:
% pandoc lysa.ltx -f latex -t epub3 -o lysa.epub
pandoc: Could not find media `{nq-bijection.png}', skipping...
pandoc: Could not find media `{nq-bijection-naive.png}', skipping...
pandoc: Could not find media `{nq-bijection-nolines.png}', skipping...
pandoc: Could not find media `{nz-bijection-joined.png}', skipping...
pandoc: Could not find media `{nz-bijection.png}', skipping...
pandoc: Could not find media `{x-squared-curve.png}', skipping...
pandoc: Could not find media `{VectorGraph2.png}', skipping...
pandoc: Could not find media `{VectorGraph1.png}', skipping...
Now, that's odd, because all of those files are in the working directory.
Pandoc now makes a very crappy EPub, so I guess that's an improvement!
Ditto for HTML: it makes a crappy HTML file, which I guess is better than no HTML file.
I have to do some tetris with Hakyll, but I'll see if I can put the HTML file up on my website.
Hakyll doesn't support pandoc-1.14 (I tried), so I have to cabal install hakyll, then edit my website.
Could not find media errors are happening because pandoc is being somewhat stupid. The root of the problem is
\newcommand{\answergraph}[1]{\begin{center}\inclgraph{{#1}}\end{center}}
Note the extra curly braces around #1 -- pandoc parses this verbatim as {filename.png}, which is obviously not present in the current directory. Try removing them. You can \usepackage{grffile} if you want filenames with dots/spaces/etc.
Okay, that seems like another bug with pandoc, should I open a new issue?
Here are the HTML and EPub files pandoc is generating: http://learnyou.org/lysa-dist/ (see lysa.epub and lysa.ltx)
You are free to. That said, I'm unsure on how curly braces in arguments should be handled -- I'm no LaTeX expert.
BTW, you'll want to generate HTML with pandoc -s, since it adds head, html, etc.
Looking at your filenames, you don't need curly braces at all, at least not with pdflatex. So you can safely remove those for the time being.
\newcommand{\answergraph}[1]{\begin{center}\inclgraph{#1}\end{center}}
@pharpend @lierdakil Looking at http://pandoc.org/README.html#latex-macros:
\newcommand{\tuple}[1]{\langle #1 \rangle}
$\tuple{a, b, c}$
The macro needs to be surrounded by $ when used. Surrounding all the macros in 5-more-sets.ltx with $ eliminates all errors, and the EPUB is output as expected.
@nkalvi that's about the Markdown parser. The LaTeX parser is different.
Here is a sample of the HTML pandoc is generating: http://ix.io/i7a . Download that file, and view it in a web browser.
Here: http://learnyou.org/raw/lysa.html
It's not perfect, especially for tables.
The pandoc command I'm using is
pandoc lysa.ltx -f latex -s --mathjax -t html5 \
| sed 's,"//,"http://,' \
| sed 's,http:,https:,' \
> lysa.html
@lierdakil Thanks and welcome back :smile:
The syntax is applicable to LaTeX math too, right? (though the macro will not be expanded): http://pandoc.org/README.html#latex-macros
For output formats other than LaTeX, pandoc will parse LaTeX \newcommand and \renewcommand definitions and apply the resulting macros to all LaTeX math. So, for example, the following will work in all output formats, not just LaTeX:
@nkalvi No no, that file is for Markdown input. We are talking about LaTeX input.
@pharpend, well, with tables it's a given, since pandoc doesn't know how to parse the tabu environment. You can see the list of all supported block-level environments in https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Readers/LaTeX.hs#L1019
You may also want to take a look at inlineEnvironments, blockCommands and inlineCommands (near lines 279 and 413).
As I said, LaTeX support is very limited.
@lierdakil Thank you.
I'm not particularly willing to spend hours editing my book just to appease pandoc. (Rather, it seems more work needs to be done on Pandoc so it will work with stuff like my book).
I'm not in any rush to convert this to an EPub or anything, it would just be nice.
I would vote for leaving this issue open, and fixing pandoc incrementally until it can parse a rather large and complicated LaTeX document (such as my book) flawlessly.
@pharpend @lierdakil I'm sorry about the confusion.
@pharpend If you really want better LaTeX support, consider contributing. Since core team is shorthanded as is, LaTeX Reader is not very likely to dramatically improve anytime soon. At least, if you have any ideas on how to parse complicated LaTeX without basically reimplementing texlive in Haskell, please share those. Sorry if this sounds a little harsh/demanding/disrespectful, English is not my native language, so I struggle a bit with wording. No offence meant.
If you want to leave this issue open, please edit title and first post to reflect that this is now a feature request. Otherwise, it might end up closed as resolved/duplicate.
Anyway, good luck with your book.
P.S. BTW, I skimmed over Russian translation, and it seems a little bit... odd. Like it was translated by a non-native speaker. Just an FYI.
+++ Nikolay Yakimov [Apr 25 15 19:40 ]:
Could not find media errors are happening because pandoc is being somewhat stupid. Root of the problem is
\newcommand{\answergraph}[1]{\begin{center}\inclgraph{{#1}}\end{center}}
Note extra curly braces around #1 -- pandoc parses this verbatim as {filename.png}
This would be worth fixing.
+++ Peter Harpending [Apr 25 15 19:41 ]:
Okay, that seems like another bug with pandoc, should I open a new issue?
Yes.
+++ Peter Harpending [Apr 25 15 20:48 ]:
I would vote for leaving this issue open, and fixing pandoc incrementally until it can parse a rather large and complicated LaTeX document (such as my book) flawlessly.
No, it's more useful to have bug reports that are focused more precisely on small issues that can be fixed with a relatively small amount of work.
Note that pandoc will do best on latex documents that don't use lower-level tex primitives like \let, which for the most part it doesn't support, and that stick to basic latex + common packages like amsmath. You can't expect to use arbitrary packages and have pandoc know about them.
I think adding support for \let might not be too hard, so this is a good thing to add an issue for. Ditto the braces in filenames issue. But I think it's better to close big, vague issues.
One more thing on this: if you really want to be able to produce PDF and EPUB versions of your book, then I'd suggest writing it in pandoc's Markdown, which will give you perfect translations to both those formats. You could use pandoc to get a rough import, which would require manual fixing up.
This sacrifices some control, since LaTeX is much more expressive than pandoc's Markdown, but you can use filters to add little bits of expressive power when needed.
Thank you, @jgm
@jgm Though this issue is closed, I just want to understand the handling of \newcommand in Pandoc's LaTeX reader.
In Pandoc's LaTeX reader, a macro that expands to something requiring parameters is expected to be defined with those parameters. Failing to do so results in a parse error (as happened initially in this case). LaTeX doesn't seem to have this requirement. I use the example from this issue:
\documentclass{article}
\usepackage{graphicx}
\begin{document}
\thispagestyle{empty}
% The following will work fine in LaTeX, but will fail with a parse error in Pandoc
% since a parameter is expected (msg: unexpected "}" expecting "[", "{", "\\", "=" or digit)
\newcommand{\inclgraph}{\includegraphics[width=0.8\textwidth]}
\begin{figure}[ht]
\centering
\inclgraph{setminus.png}
\caption{Set subtraction}
\label{fig:setminus}
\end{figure}
\end{document}
If this is a requirement, it would be helpful to have it in the documentation.
@nkalvi, this looks like a bug, maybe open a separate issue with this example?
Opened one. Hope it is properly written.