jgm / pandoc

Universal markup converter
https://pandoc.org

pandoc 1.13.2 regression with \newcommand #1835

Closed klausman closed 9 years ago

klausman commented 9 years ago
$ cat simple.tex 
\newcommand{\separator}{\vspace{4em}}

one line

\separator

another

$ pandoc -f latex -t plain < simple.tex 
pandoc: 
Error at "source" (line 7, column 1):
unexpected '\n'
another

This worked just fine with 1.13.1 and 1.12.2.1.

1.13.1:

$ pandoc --version 
pandoc 1.13.1
Compiled with texmath 0.8.0.1, highlighting-kate 0.5.11.1.
Syntax highlighting is supported for the following languages:
[...]
$ pandoc -f latex -t plain < simple.tex 
one line

another

$

My first suspicion was that pandoc expects {} after a macro, but adding that just gives:

Error at "source" (line 5, column 21):
unexpected "{"
expecting letter or new-line
\separator{}
                ^

Note that the problem marker (caret) is also way off.

jgm commented 9 years ago

I have no idea why the behavior would be different between 1.13.1 and 1.13.2, as the LaTeX reader changes were minimal and the texmath version you used seems to be the same.

Some experimentation reveals that the problem arises when the contents of the expanded macro are something that pandoc would parse as block-level content, rather than inline-level.

So e.g.

\newcommand{\x}{\paragraph{hi}}

\x
pandoc: 
Error at "source" (line 4, column 1):
unexpected end of input

It doesn't matter if the macro has arguments or not.
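
To make the block-versus-inline distinction concrete, here is a toy model in Python. It is not pandoc's actual code (the reader is written in Haskell), and the command tables `BLOCK_COMMANDS` and `INLINE_COMMANDS` as well as the helpers `parse_macros` and `classify_expansion` are made up for illustration. It only sketches why a macro whose body is block-level content could trip a parser that expects inline content at the point of use.

```python
import re

# Hypothetical, heavily simplified command tables (an assumption for this
# sketch); pandoc's real reader knows far more commands.
BLOCK_COMMANDS = {"paragraph", "section", "vspace"}
INLINE_COMMANDS = {"emph", "textbf", "langle", "rangle"}

# Zero-argument definitions only, written on a single line.
NEWCOMMAND = re.compile(r"\\newcommand\{\\(\w+)\}\{(.*)\}\s*$")
FIRST_COMMAND = re.compile(r"\\(\w+)")


def parse_macros(source):
    """Collect zero-argument \\newcommand definitions, one per line."""
    macros = {}
    for line in source.splitlines():
        match = NEWCOMMAND.match(line.strip())
        if match:
            macros[match.group(1)] = match.group(2)
    return macros


def classify_expansion(body):
    """Return 'block', 'inline', or 'unknown' for a macro body."""
    match = FIRST_COMMAND.match(body.strip())
    if match is None:
        return "inline"  # plain text counts as inline content
    name = match.group(1)
    if name in BLOCK_COMMANDS:
        return "block"
    if name in INLINE_COMMANDS:
        return "inline"
    return "unknown"
```

Under these assumptions, both `\separator` from the original report and `\x` above classify as `block`, which matches the cases that fail.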

klausman commented 9 years ago

For unrelated reasons, I rebuilt my whole stack of pandoc-related Haskell libraries/binaries and this problem is gone now. No idea how I introduced it in the first place. Sorry for the noise.

ghost commented 9 years ago

Same issue is seen on Windows.

pandoc 1.13.2
Compiled with texmath 0.8.0.1, highlighting-kate 0.5.11.1.
Syntax highlighting is supported for the following languages:
    abc, actionscript, ada, agda, apache, asn1, asp, awk, bash, bibtex, boo, c,
    changelog, clojure, cmake, coffee, coldfusion, commonlisp, cpp, cs, css,
    curry, d, diff, djangotemplate, dockerfile, dot, doxygen, doxygenlua, dtd,
    eiffel, email, erlang, fasm, fortran, fsharp, gcc, glsl, gnuassembler, go,
    haskell, haxe, html, ini, isocpp, java, javadoc, javascript, json, jsp,
    julia, latex, lex, lilypond, literatecurry, literatehaskell, lua, m4,
    makefile, mandoc, markdown, mathematica, matlab, maxima, mediawiki,
    metafont, mips, modelines, modula2, modula3, monobasic, nasm, noweb,
    objectivec, objectivecpp, ocaml, octave, opencl, pascal, perl, php, pike,
    postscript, prolog, pure, python, r, relaxng, relaxngcompact, rest, rhtml,
    roff, ruby, rust, scala, scheme, sci, sed, sgml, sql, sqlmysql,
    sqlpostgresql, tcl, tcsh, texinfo, verilog, vhdl, xml, xorg, xslt, xul,
    yacc, yaml, zsh

klausman commented 9 years ago

@bechamp Note that rebuilding the whole stack of pandoc and its dependencies made the problem go away for me. Have you tried that?

lierdakil commented 9 years ago

I can repro on pandoc-1.13.2 and master (a.k.a. 1.13.3), so I think it's a valid bug. I suggest reopening.

ghost commented 9 years ago

@klausman I saw your workaround but did not try it. I worked around the issue in this particular case just by commenting out the offending \newcommands. I'm stuck using a Windows machine for this project and not really up for rebuilding at the moment, and Haskell is uncharted territory for me. It's not a major issue, but it is an issue when going from LaTeX to reST for porting specs to Sphinx. Otherwise, Pandoc is great.
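
For anyone wanting to script that workaround, a minimal sketch in Python follows. The function name `comment_out_definitions` is hypothetical, and the sketch assumes each `\newcommand` or `\renewcommand` definition fits on a single line. Uses of the macros are left in the text untouched, so whether the result converts cleanly still depends on the document.

```python
import re

# Matches a line that starts a \newcommand or \renewcommand definition.
DEFINITION = re.compile(r"^\s*\\(re)?newcommand\b")


def comment_out_definitions(source):
    """Prefix single-line macro definitions with a LaTeX comment marker."""
    lines = []
    for line in source.splitlines():
        if DEFINITION.match(line):
            lines.append("% " + line)  # turn the definition into a comment
        else:
            lines.append(line)  # everything else passes through unchanged
    return "\n".join(lines)
```

The output can be written to a temporary file and passed to pandoc in place of the original.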

nkalvi commented 9 years ago

Pardon me, but according to the README, don't you have to enclose it in '$...$'?

\newcommand{\separator}{\vspace{4em}}

one line

$\separator$

another

This is output as follows (pandoc -t plain newcommand.tex):

one line

${\vspace{4em}}$

another

https://github.com/jgm/pandoc/blob/e0d234e54d18a82a7c90aa3946f890140e200051/README

LaTeX macros

Extension: latex_macros

For output formats other than LaTeX, pandoc will parse LaTeX \newcommand and \renewcommand definitions and apply the resulting macros to all LaTeX math. So, for example, the following will work in all output formats, not just LaTeX:

\newcommand{\tuple}[1]{\langle #1 \rangle}

$\tuple{a, b, c}$

In LaTeX output, the \newcommand definition will simply be passed unchanged to the output.

lierdakil commented 9 years ago

@nkalvi no. It's the LaTeX reader we're talking about. We are not talking about markdown, like, at all.

nkalvi commented 9 years ago

@lierdakil Sorry, my test was based on the example in @klausman's first post (since the file had a .tex extension, it was processed by the LaTeX reader without an explicit -f latex).

From what I understand, the LaTeX reader will apply the macros only if they're enclosed in '$...$'.

nkalvi commented 9 years ago

BTW, I found this https://github.com/jgm/pandoc/issues/308 from @jgm

Command macros work everywhere since 1.10. (Not yet environment macros.) Closing this.

lierdakil commented 9 years ago

@nkalvi this shouldn't be the case. \newcommand is a valid LaTeX command, and the LaTeX reader should not throw its hands up on valid input now, should it? It doesn't have to expand anything. It should at least read it.

And as you quote, command macros should just plain work, and they have for a while now.

nkalvi commented 9 years ago

@lierdakil The LaTeX reader was complaining about \separator, not the \newcommand.

lierdakil commented 9 years ago

@nkalvi point still stands, it's perfectly valid LaTeX.

nkalvi commented 9 years ago

Yes, I agree that it is perfectly valid syntax. I think that's why (partially) #308 was raised.

I was just pointing out that, according to the docs, macro expansion is applied only in math context. Perhaps @jgm will clarify or remove this restriction.

lierdakil commented 9 years ago

Ok, git bisect points us to 58e4e4a608de489f993c36234048014d93b39116
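
For readers who want to repeat such a search, `git bisect run` can automate it with a script that reports each commit as good, bad, or untestable through its exit code. The sketch below is in Python and is only an illustration: the function names `check` and `verdict`, the `simple.tex` test file, and the assumption that pandoc exits non-zero on this parse error are mine, and building pandoc at each commit is left out entirely.

```python
import subprocess

# Exit codes understood by `git bisect run`:
# 0 marks the commit good, 125 asks git to skip it,
# and any other code from 1 to 127 marks it bad.
GOOD, BAD, SKIP = 0, 1, 125


def check(pandoc_cmd, tex_path="simple.tex"):
    """Run an already-built pandoc on the test file.

    Returns pandoc's exit status, or None when the binary or the
    test file cannot be found.
    """
    try:
        with open(tex_path, "rb") as handle:
            result = subprocess.run(
                pandoc_cmd + ["-f", "latex", "-t", "plain"],
                stdin=handle,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
    except FileNotFoundError:
        return None
    return result.returncode


def verdict(returncode):
    """Map one conversion attempt to a bisect exit code."""
    if returncode is None:
        return SKIP  # commit cannot be tested, let git try a neighbour
    return GOOD if returncode == 0 else BAD
```

Saved as a script that ends by exiting with `verdict(check([...]))` for the freshly built binary, this could be handed to `git bisect run` after marking one known-good and one known-bad commit.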

mpickering commented 9 years ago

I think this is #1866 ?

lierdakil commented 9 years ago

@mpickering Yes, it apparently is. I didn't notice it.

nkalvi commented 9 years ago

I don't know Haskell, but I'm wondering whether this is coming from parsing inline commands: https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Readers/LaTeX.hs#L500

Now I'll be quiet and watch :)