Daniel-Diaz / HaTeX

The Haskell LaTeX library.
BSD 3-Clause "New" or "Revised" License

Backticks not parsed correctly #151

Open benjaminselfridge opened 3 years ago

benjaminselfridge commented 3 years ago

See: https://tex.stackexchange.com/questions/404649/what-is-backtick-used-for

Backticks are special characters that say: "replace the following character with its character code." The current parser does not handle this functionality.

leftaroundabout commented 3 years ago

Could you give an example of something that's currently misparsed, and how it should be parsed? And is there any hope of fixing this once and for all? Or is it just whacking one molehill (we need to keep in mind that, strictly speaking, it's impossible to parse general TeX because of the self-modifying syntax)? If so, is the molehill obstructive enough to warrant treating this as a special case?

benjaminselfridge commented 3 years ago

The example in the link I posted matches the one I ran into. Trying to parse:

\begin{Verbatim}[commandchars=\\\{\}, codes={\catcode`$=3\catcode`^=7\catcode`_=8}]

Right now, I get a parse error on $, I think because the parser interprets this as math mode and expects a closing $. But the backtick is meant to sort of "escape" the $ and replace it with the corresponding ASCII code.

As for molehill-whacking, I don't know enough to classify this one. But it's currently blocking me from using this library to parse this, so I thought I'd post an issue. Perhaps you could suggest a workaround that would enable me to circumvent this error without altering the source text?

leftaroundabout commented 3 years ago

So, as I see it, adding a new special constructor for backticks to LaTeX would be reasonable:

data LaTeX =
  ...
  | TeXCharQuote Char
  ...

Not sure about the name, maybe simply TeXBacktick Char? Of course using TeXRaw "`$" is always also an option, but I think not a good one.
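To make the proposal concrete, here is a minimal sketch of how such a constructor could render back to source and expose the character code. The `Tex` type, `render`, and `charCode` are hypothetical stand-ins for illustration only; they are not HaTeX's actual `LaTeX` type or API.

```haskell
-- Toy stand-in for a LaTeX syntax tree; NOT HaTeX's real definition.
data Tex
  = TexRaw String      -- uninterpreted source text
  | TexCharQuote Char  -- proposed: a backtick followed by one character
  | TexSeq [Tex]       -- sequence of fragments
  deriving (Eq, Show)

-- Render a fragment back to source text.
render :: Tex -> String
render (TexRaw s)       = s
render (TexCharQuote c) = ['`', c]
render (TexSeq xs)      = concatMap render xs

-- The number TeX would read for a quoted character, if the fragment is one.
charCode :: Tex -> Maybe Int
charCode (TexCharQuote c) = Just (fromEnum c)
charCode _                = Nothing
```

With this shape, `\catcode`$=3` could be represented as `TexSeq [TexRaw "\\catcode", TexCharQuote '$', TexRaw "=3"]`, so the `$` never reaches the math-mode parser.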

One thing I can think of that should be special-cased to not be parsed this way: \verb`code snippet` and \lstinline`code snippet`.

Daniel-Diaz commented 3 years ago

I wasn't aware of this LaTeX feature. There's no other reason for this constructor to be missing. So I agree it should be added. I personally like TeXCharQuote.

I think the parser should handle it after verb, so if verb succeeds, \verb`code snippet` will be parsed as an inline verbatim.
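The ordering could be sketched roughly like this, using `ReadP` from `base` rather than the parser HaTeX actually uses. All names here (`Piece`, `verbP`, `charQuoteP`, `parsePieces`) are made up for the example, and `\verb` is simplified: no starred form and no restrictions on the delimiter.

```haskell
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, get, many, munch, readP_to_S, string, (<++))

-- Hypothetical token type for this sketch only.
data Piece
  = Verb String     -- body of \verb<d>...<d>
  | CharQuote Char  -- backtick followed by one character
  | Plain Char      -- anything else, one character at a time
  deriving (Eq, Show)

-- \verb<d>...<d>, where <d> may be any character, including a backtick.
verbP :: ReadP Piece
verbP = do
  _    <- string "\\verb"
  d    <- get
  body <- munch (/= d)
  _    <- char d
  pure (Verb body)

-- A backtick that quotes the next character.
charQuoteP :: ReadP Piece
charQuoteP = CharQuote <$> (char '`' *> get)

plainP :: ReadP Piece
plainP = Plain <$> get

-- Left-biased choice: verb is tried first, so its backtick delimiters
-- are never seen by charQuoteP.
pieceP :: ReadP Piece
pieceP = verbP <++ charQuoteP <++ plainP

parsePieces :: String -> Maybe [Piece]
parsePieces s = case readP_to_S (many pieceP <* eof) s of
  ((ps, "") : _) -> Just ps
  _              -> Nothing
```

Because `<++` only falls through when the left parser fails, `\verb`code snippet`` comes out as a single `Verb`, while a bare `` `$ `` elsewhere becomes a `CharQuote`.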

Daniel-Diaz commented 3 years ago

I can add this over the weekend.