jgm / pandoc

Universal markup converter
https://pandoc.org

Parse fail on underscores in raw TeX #439

Closed by glutamate 12 years ago

glutamate commented 12 years ago

Pandoc 1.9.1.1 fails on this document

--Begin file test.md

Hello World

\begin{code}
isWeekend Sat = True
isWeekend Sun = True
isWeekend _ = False
\end{code}

--End file

tomn@tomn-desktop:~/tmp$ /usr/local/bin/pandoc --to=latex test.md
pandoc: Error: "source" (line 6, column 12): unexpected " " expecting "{" or "\"

This worked in pandoc 1.8.x. Is it something to do with the "improved handling of underscores" in 1.9?

michaelt commented 12 years ago

Not sure why the underscore is triggering anything, but isn't code a literate-Haskell-specific environment? It still works if you rename the file to test.lhs or use pandoc -f markdown+lhs.

A verbatim environment might be what you want, or rather, since it's markdown, a markdown code block.

Haskell v ML
============

Here is a bit of Haskell:

\begin{code}
main = putStrLn "Hello world!"
\end{code}

Here is a bit of SML:

\begin{verbatim}  
print "Hello world!\n";
\end{verbatim}

jgm commented 12 years ago

You need to use the latex+lhs reader to enable "code" as a literal environment. Otherwise it treats "code" as an unknown environment.

Try it with 'pandoc -f latex+lhs'.


glutamate commented 12 years ago

It is true that Pandoc treats different LaTeX environments differently in raw TeX mode. For instance, this parses:

Hello World

\begin{verbatim}
isWeekend _ = False
\end{verbatim}

but this results in a parse error:

Hello World

\begin{verbatm}
isWeekend _ = False
\end{verbatm}

And yet I could have defined a "verbatm" environment in my template.

Besides, there are legitimate reasons to put "\begin{code}" blocks in as raw TeX even when writing literate Haskell. For instance, there is no beamer+lhs writer, so if I want literate Haskell in beamer, I have to include the code blocks directly.

glutamate commented 12 years ago

Thanks John, adding -f solves the problem. I still think there is some undocumented behaviour here in the handling of raw TeX.

glutamate commented 12 years ago

And there is weird behaviour with underscores.

This is OK:

Hello World

\begin{verbatm}
isWeekend = False
\end{verbatm}

but this is not:

Hello World

\begin{verbatm}
isWeekend _ = False
\end{verbatm}

I get the parse error in the second case regardless of whether I run pandoc with -R.

jgm commented 12 years ago

Is this a typo? Try \begin{verbatim} instead of {verbatm}.


glutamate commented 12 years ago

No, it is not a typo. I am merely pointing out that occasionally you need custom LaTeX environments (for instance, code in beamer to be typeset by lhs2TeX, or environments defined in packages like fancyvrb and imported in the template). Instead of verbatm it could be \begin{foo}, etc. These all behave differently depending on whether there is an underscore in the raw TeX, but they didn't under Pandoc 1.8.

If I use the "verbatim" environment there is no parse error with underscores. I guess verbatim is a recognized environment.

jgm commented 12 years ago

+++ glutamate [Mar 08 12 11:15 ]:

> It is true that Pandoc treats different latex environments differently in the Raw TeX mode. For instance, this parses
>
> Hello World
>
> \begin{verbatim}
> isWeekend _ = False
> \end{verbatim}
>
> but this results in a parse error:
>
> Hello World
>
> \begin{verbatm}
> isWeekend _ = False
> \end{verbatm}
>
> And yet I could have defined a "verbatm" environment in my template.

Pandoc does need to parse literal and non-literal environments differently. For example:

\begin{myenv} $2+2=4$ \end{myenv}

will be parsed as including a math inline if non-literal, but not if literal. To make the distinction, pandoc uses a list of common literal environments. You're right that it won't know about custom literal environments. I don't see an easy way around this limitation.
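
To make the literal versus non-literal distinction concrete, here is a minimal Python sketch. It is not pandoc's code (pandoc is written in Haskell); the names `LITERAL_ENVIRONMENTS` and `parse_environment` are hypothetical, and the set contains only `verbatim` because that is the one environment this thread confirms is recognized.

```python
import re

# Hypothetical list for illustration; pandoc's real list of literal
# environments is not shown in this thread.
LITERAL_ENVIRONMENTS = {"verbatim"}

# Matches \begin{name} ... \end{name} with the same name on both ends.
ENV_RE = re.compile(
    r"\\begin\{(?P<name>[^}]+)\}(?P<body>.*?)\\end\{(?P=name)\}",
    re.DOTALL,
)
# Matches a simple $...$ inline math span.
MATH_RE = re.compile(r"\$([^$]+)\$")


def parse_environment(tex, literal_envs=LITERAL_ENVIRONMENTS):
    """Return (name, parts), where parts is a list of (kind, text) tuples."""
    match = ENV_RE.search(tex)
    if match is None:
        raise ValueError("no environment found")
    name = match.group("name")
    body = match.group("body").strip()

    if name in literal_envs:
        # Literal environment: the body is kept untouched.
        return name, [("literal", body)]

    # Non-literal environment: the body is interpreted, so $...$ is math.
    parts = []
    pos = 0
    for m in MATH_RE.finditer(body):
        if m.start() > pos:
            parts.append(("text", body[pos:m.start()]))
        parts.append(("math", m.group(1)))
        pos = m.end()
    if pos < len(body):
        parts.append(("text", body[pos:]))
    return name, parts
```

In this sketch a custom environment is treated as literal only if its name is passed in explicitly, for example `parse_environment(tex, literal_envs={"verbatim", "myenv"})`. That is exactly the knowledge a parser lacks for user-defined environments.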

> Besides, there are very legitimate reasons to put "\begin{code}" blocks as raw TeX even if doing literate Haskell. For instance, there is no beamer+lhs writer, so if I want literate haskell in beamer, I have to put the code blocks directly.

If that's your only example, then the correct response is to add a beamer+lhs output format, which could easily be done.

glutamate commented 12 years ago

Yes, the beamer / literate haskell example is the only case I can think about right now (or need in the foreseeable future).

Thanks John. I'll close the ticket.

jgm commented 12 years ago

+++ glutamate [Mar 08 12 13:34 ]:

> No it is not a typo. I am merely pointing out that occasionally you need custom latex environments. (For instance, code in beamer to be typeset by lhs2TeX; or using environments defined in packages like fancyvrb etc, imported in the template) Instead of verbatm it could be \begin{foo} etc. These all behave differently depending on whether there is an underscore in the raw TeX, but they didn't under Pandoc 1.8.

I think you must be mistaken here. The underscore wouldn't have caused a parse error under pandoc 1.8, because I had a 'catch-all' parser that just accepted illegal characters. But these environments wouldn't have been correctly parsed as literal environments; they would have been treated as unknown environments and their contents interpreted.
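
The difference between the two versions can be sketched as a toy scanner in Python. This is an illustration under assumptions, not pandoc's parser: `scan_raw_tex` and its `catch_all` flag are hypothetical names, and the rule that `_` must be followed by `{` or `\` is inferred only from the error message quoted in this thread.

```python
def scan_raw_tex(line, catch_all=False):
    """Scan one line of an unknown (non-literal) environment.

    Strict mode rejects "_" unless "{" or a backslash follows it.
    With catch_all=True the stray "_" is accepted as an ordinary token.
    """
    tokens = []
    for i, ch in enumerate(line):
        if ch != "_":
            tokens.append(("char", ch))
            continue
        nxt = line[i + 1] if i + 1 < len(line) else ""
        if nxt in ("{", "\\"):
            tokens.append(("subscript", ch))
        elif catch_all:
            # Lenient fallback: accept the character instead of failing.
            tokens.append(("other", ch))
        else:
            # 1-based column of the offending character after "_".
            raise ValueError("column %d: unexpected %r" % (i + 2, nxt))
    return tokens
```

Either way, the body is still being interpreted rather than kept verbatim; the catch-all only hides the failure, which matches the point that such environments were never parsed as literal.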

jgm commented 12 years ago

I've just added beamer+lhs as an output format.


glutamate commented 12 years ago

Thanks; it works!