fletcher / peg-multimarkdown

An implementation of MultiMarkdown in C, using a PEG grammar - a fork of jgm's peg-markdown. No longer under active development - see MMD 5.

Parsing in captions #14

Closed jcoulombe closed 13 years ago

jcoulombe commented 13 years ago

Captions are not parsed as MMD syntax, and every special character in them is escaped. As a result, the final caption always prints the raw content of the MMD source literally. It therefore seems impossible to put any formatting or, more importantly, links in captions.

$ cat caption.txt 
![animage][]

[animage]: image.png "some [#stuff][], other \cite{stuff}, and more <!--\cite{stuff}-->."

$ multimarkdown -t latex caption.txt 
\begin{figure}
\begin{center}
\includegraphics[keepaspectratio,width=\textwidth, height=.75\textheight]{image.png}
\end{center}
\caption{some [\#stuff][], other $\backslash$cite\{stuff\}, and more $<$!--$\backslash$cite\{stuff\}--$>$.}
\label{animage}
\end{figure}

fletcher commented 13 years ago

This one I have to think about. The idea of a caption is simply to have some basic text, not anything too complicated. Remember, it derives from the title attribute in HTML images, which I don't believe allows anything other than plain text.

To actually parse the code inside the caption would require a rather complicated change that I can almost guarantee won't happen.

Then, I'm left with the fact that I have to escape it, or I'll get all sorts of bug complaints about crashing XHTML or LaTeX because of special characters being unescaped inside the caption.

I'm leaning towards declaring this one "out of bounds" and suggesting that it's a good place to use hand-crafted HTML or LaTeX.

One possible exception: IF (and that's potentially a big IF) there were a way to pass a single \input{file} through the caption, so that you could keep that content in an external file for LaTeX, would that solve the issue? I don't foresee this being an XHTML problem; it is purely a LaTeX issue. (Which brings up the point that if something is too output-specific, e.g. LaTeX only and not XHTML, it might not belong in MMD.)

This might be better handled by a post-processing script that looks for $\backslash$input{file} inside a caption and swaps it for valid LaTeX code (if you can use \input inside a caption).

fletcher commented 13 years ago

I tried manually setting the caption of an image to:

\caption{\input{file}}

and then creating file.tex with the contents of what went into the caption, including an \autoref.

latexmk reported an error about an extra }, but after hitting return a few times it finished processing and generated the proper result.

So it seems like this should work, and might be a valid alternative. I just don't think this is going to make it in as a core feature of MMD 3. But you can easily create a script to run on the tex file to reformat:

$\backslash$input{file}

into:

\input{file}

F-
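
A minimal sketch of the post-processing step described above, written in C to match the project. The function name `unescape_input` is hypothetical and not part of peg-multimarkdown. It assumes the braces are escaped as well, as in the `\cite\{stuff\}` output shown earlier in this thread, so the pattern it looks for is `$\backslash$input\{file\}` rather than `$\backslash$input{file}`. It also assumes the file name contains no characters that the LaTeX exporter would escape (an underscore, for example), and it rewrites every match on the line, not only those inside `\caption{...}`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of peg-multimarkdown): returns a newly
 * allocated copy of `line` in which every escaped
 *     $\backslash$input\{NAME\}
 * is rewritten as
 *     \input{NAME}
 * The caller frees the result. Returns NULL if allocation fails. */
char *unescape_input(const char *line)
{
    static const char open[]  = "$\\backslash$input\\{";
    static const char close[] = "\\}";
    const size_t open_len  = sizeof open - 1;
    const size_t close_len = sizeof close - 1;

    /* Each rewrite shortens the text, so the input length is enough. */
    char *out = malloc(strlen(line) + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    const char *src = line;
    for (;;) {
        const char *start = strstr(src, open);
        const char *end = start ? strstr(start + open_len, close) : NULL;
        if (end == NULL)
            break;                      /* no further complete match */

        /* Copy the text that precedes the match unchanged. */
        memcpy(dst, src, (size_t)(start - src));
        dst += start - src;

        /* Emit the unescaped command with the original file name. */
        const char *name = start + open_len;
        dst += sprintf(dst, "\\input{%.*s}", (int)(end - name), name);

        src = end + close_len;
    }
    strcpy(dst, src);                   /* copy the remaining tail */
    return out;
}
```

A small driver could read the generated .tex file line by line, pass each line through `unescape_input`, and write the result back out before running latexmk.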

jcoulombe commented 13 years ago

Actually, here is the situation for me:

  1. I cannot restrict myself to using only plain text in captions. Any figure/table that I haven't done personally from scratch needs credits to the original author using a citation, and units that need math mode, for example, are often required;
  2. the \input{} solution might be okay for a few figures/tables, but, quickly, the mmd version of a large document becomes less readable and manageable than plain latex...

If I cannot use mmd inside captions, my preferred solution is to allow plain latex there. So I found a set of pre- and post-substitutions that seem to work okay for that.

The side effect is that plain latex is allowed in the whole document.

I'll see in the long run how that works...

fletcher commented 13 years ago

I'm just afraid that this is going to be too specific of a situation to create an entire workaround for.

Nothing is set in stone, and I'm always open to convincing arguments... ;)

This might also be a good opportunity for a fork if you wanted to add this feature. It would be an easy change to always put raw text into the caption. Basically, you should be able to change:

        g_string_append_printf(out, "\\caption{");
        print_latex_string(out, elt->contents.link->title);
        g_string_append_printf(out, "}\n");

to

        g_string_append_printf(out, "\\caption{%s}\n", elt->contents.link->title);

This will change all captions to use raw text, so you have to be careful about special characters.
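
To illustrate what "careful" could mean in practice: once the title is written out raw, an unescaped `%` starts a LaTeX comment and swallows the closing brace of `\caption{...}`, and a stray `#` or `&` is also an error there. The following is a rough sketch of a check that a fork might run on a title before emitting it. The function name `caption_needs_attention` is hypothetical, and limiting the check to these three characters is an assumption: backslashes, braces, `$`, `_` and `^` are deliberately left alone because raw LaTeX captions need them for commands and math.

```c
#include <stdbool.h>

/* Hypothetical check (not part of peg-multimarkdown): returns true if
 * `title` contains a '%', '#' or '&' that is not escaped by a backslash. */
bool caption_needs_attention(const char *title)
{
    bool escaped = false;   /* true when the previous char was an unpaired '\' */

    for (const char *p = title; *p != '\0'; p++) {
        if (escaped) {
            escaped = false;            /* this character is escaped; skip it */
        } else if (*p == '\\') {
            escaped = true;             /* next character is escaped */
        } else if (*p == '%' || *p == '#' || *p == '&') {
            return true;                /* would break a raw \caption{...} */
        }
    }
    return false;
}
```

With this sketch, a title such as `see \cite{stuff}, $x^2$` passes, while `50% of cases` is flagged and would need to be written as `50\% of cases` in the MMD source.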