jgm / pandoc

Universal markup converter
https://pandoc.org

Parse LaTeX Parallel Environment #4512

Open schrieveslaach opened 6 years ago

schrieveslaach commented 6 years ago

Pandoc should be able to parse Parallel environments, such as:

\begin{Parallel}{0.5\linewidth}{0.5\linewidth}
    \ParallelLText{Text on the left side}
    \ParallelRText{Text on the right side}
\end{Parallel}

pandoc -f latex -t html should produce the following content:

<div>
    <div style="width: 50%; float: left">Text on the left side</div>
    <div style="width: 50%; float: left">Text on the right side</div>
</div>

jgm commented 6 years ago

I wonder if this is too special-case. We can't support every last package on CTAN without things getting too complicated. And here, parsing into a Div structure with a style attribute would really only help for HTML output.

jgm commented 6 years ago

Perhaps what we need is the ability to tell pandoc to create Div or Span environments from certain latex commands and environments. Then you can handle them in a filter.

See #3145.

It's a bit tricky, though. Environments can take optional arguments, as in your example. Sometimes they have verbatim contents. How does pandoc know how to parse the thing?

Thinking out loud: What if the latex reader knew how to handle a few special macros, like

PandocDiv, PandocSpan, PandocCode, PandocCodeBlock

Then you could define a latex macro to convert your target latex environment into one of these, and pandoc could take it from there. E.g., using xparse:

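% xparse argument spec: {m m} declares the two mandatory width arguments.
\DeclareDocumentEnvironment{Parallel}{m m}
  {\begin{PandocDiv}[Parallel]}
  {\end{PandocDiv}}
% The command name passed to \newcommand needs its leading backslash.
\newcommand{\ParallelLText}[1]{\PandocSpan[ParallelLText]{#1}}
\newcommand{\ParallelRText}[1]{\PandocSpan[ParallelRText]{#1}}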

(Here I assume you aren't usepackage'ing parallel, or you'd need to renew the environment.) The idea is that you'd put this in your header to tell pandoc how to interpret this latex environment in Pandoc terms, and pandoc would give you a native pandoc structure, which you could manipulate with a filter.
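
To make the filter step concrete, here is a minimal sketch of a JSON filter in Python, using only the standard library. It assumes the hypothetical PandocSpan macro proposed above would produce Span elements carrying the classes ParallelLText and ParallelRText; nothing like that exists in pandoc today. The names add_style and run_filter are made up for illustration, and the 50% width is hard-coded because the macro definitions above discard the two width arguments.

```python
import json

# Classes that the hypothetical PandocSpan macro is assumed to attach.
SIDE_CLASSES = {"ParallelLText", "ParallelRText"}


def add_style(node):
    """Walk a pandoc JSON AST fragment and add a float style to matching Spans."""
    if isinstance(node, list):
        for child in node:
            add_style(child)
    elif isinstance(node, dict):
        if node.get("t") == "Span":
            # A Span is encoded as {"t": "Span", "c": [[id, classes, keyvals], inlines]}.
            attr, _inlines = node["c"]
            _ident, classes, keyvals = attr
            has_style = any(key == "style" for key, _ in keyvals)
            if SIDE_CLASSES & set(classes) and not has_style:
                keyvals.append(["style", "width: 50%; float: left"])
        for value in node.values():
            add_style(value)


def run_filter(stdin, stdout):
    """Read a pandoc JSON document, style the Parallel spans, and write it back."""
    doc = json.load(stdin)
    add_style(doc["blocks"])
    json.dump(doc, stdout)
```

Calling run_filter(sys.stdin, sys.stdout) at the end of a script (after import sys) would let it be passed to pandoc with --filter. Other writers could map the same classes to whatever layout mechanism suits their format instead of a CSS style.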

schrieveslaach commented 6 years ago

I tried to parse the Parallel environment (see PR) and it's almost done. With some guidance I might be able to finish it. Maybe we can use it, maybe not…

I'm not sure about the special pandoc environments, because they would not let me do anything more than I already can by declaring a custom LaTeX command in a separate file that defines the environment. For example, I'm not sure whether I would be able to add a style or anything else.

I imagine parsing these commands in pandoc (along with the style), so that the writers (e.g. OdtWriter) can use the CSS style information to create an appropriate document. For example, if the OdtWriter detects CSS floats, it generates a similar-looking ODT document, so that my PDF and ODT look similar.
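
As a rough illustration of the "detect CSS floats" step, here is a small Python sketch. Pandoc's writers are written in Haskell, so this is not writer code, only the logic such a check might follow. The function names parse_style and column_layout are hypothetical, and the sketch assumes the style strings look like the ones in the HTML example at the top of this issue.

```python
def parse_style(style):
    """Split a CSS style attribute such as 'width: 50%; float: left' into a dict."""
    declarations = {}
    for part in style.split(";"):
        name, sep, value = part.partition(":")
        if sep and name.strip():
            declarations[name.strip().lower()] = value.strip()
    return declarations


def column_layout(styles):
    """Return the widths of consecutive floated blocks, or None if any block
    lacks a float or a width, in which case no column layout is inferred."""
    widths = []
    for style in styles:
        declarations = parse_style(style)
        if declarations.get("float") not in ("left", "right"):
            return None
        if "width" not in declarations:
            return None
        widths.append(declarations["width"])
    return widths
```

A writer that gets a list of widths back could emit side-by-side columns in its own format, and fall back to ordinary sequential blocks when the result is None.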