Closed Witiko closed 2 years ago
Adding a jgm/pandoc reader that reconstructs a jgm/pandoc AST from a TeX AST would also be beneficial. This reader would be more of a plumbing tool for restoring a document from the intermediate TeX AST files in cases where the original sources are unavailable. However, there does not seem to be any Lua API for readers, so we would either need to abuse the AST and create a Lua filter, or create a Haskell reader and contribute it to jgm/pandoc, as discussed in https://github.com/jgm/pandoc/issues/1541#issuecomment-894786097. Similarly to a Haskell reader, we could also contribute a Haskell writer that would replace the Lua writer in the long run and would be maintained as part of jgm/pandoc.
An exhaustive specification of the elements of Pandoc's AST format is available on Hackage.
The full list of Lua functions reserved for Lua writers is available in jgm/pandoc's src/Text/Pandoc/Writers/Custom.hs.
[...] create a Haskell reader and contribute it to jgm/pandoc, as discussed in jgm/pandoc#1541 (comment). Similarly to a Haskell reader, we could also contribute a Haskell writer that would replace the Lua writer in the long run and would be maintained as a part of jgm/pandoc.
@drehak I have created a development environment for Pandoc using Docker at witiko/pandoc-devenv. We can use it to develop Haskell readers and writers for Pandoc without littering our base OS with Haskell. Those who want to litter their base OS can take inspiration from our Dockerfile.
A preliminary analysis for the implementation has been authored by @drehak and published in the CSTUG Bulletin 2021/1-4 (landing page, PDF).
@drehak We should aim to close this issue before milestone 2.15.0 (due on March 31), since the defense of your student project will likely take place before then and also because we'd like to publicize your proof of concept in a journal article for TUGboat 43:1 (also due on March 31, see #120).
`\pandoc*` commands less dependent on the internal format of the Markdown package and more faithful to the structure of the Pandoc AST. A good guiding principle is that we should be able to write a reader that allows a round trip from AST to plain TeX and back. Practically speaking, this means that we will be moving a lot of the logic out of the Lua reader and into the plain TeX code that rewrites the `\pandoc*` commands into the `\markdown*` commands.
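To make the round-trip principle concrete, here is a minimal sketch in Python (used only for illustration; the real code would live in Lua, Haskell, and plain TeX). It assumes a hypothetical encoding in which every AST node becomes one `\pandoc<Type>{...}` command. The names `\pandocPara`, `\pandocStr`, `\pandocSpace`, `\pandocEmph`, and `\pandocStrong` are assumptions made for this example, not the command set of the Markdown package, and string content is assumed to contain no backslashes or braces.

```python
# Hypothetical container node types; the names after \pandoc are illustrative.
CONTAINERS = {"Para", "Emph", "Strong"}
PREFIX = "\\pandoc"


def to_tex(node):
    """Serialize one Pandoc-style AST node into a (hypothetical) \\pandoc* command."""
    tag = node["t"]
    if tag == "Str":
        # Assumption: the text contains no backslashes or braces.
        return PREFIX + "Str{" + node["c"] + "}"
    if tag == "Space":
        return PREFIX + "Space{}"
    if tag in CONTAINERS:
        return PREFIX + tag + "{" + "".join(to_tex(child) for child in node["c"]) + "}"
    raise NotImplementedError(tag)


def parse_nodes(tex, pos=0):
    """Parse commands from pos until a closing brace or the end of the input.

    Returns the parsed nodes and the position where parsing stopped.
    """
    nodes = []
    while pos < len(tex) and tex[pos] != "}":
        if not tex.startswith(PREFIX, pos):
            raise ValueError(f"expected {PREFIX} at offset {pos}")
        brace = tex.index("{", pos)
        tag = tex[pos + len(PREFIX):brace]
        pos = brace + 1
        if tag == "Str":
            end = tex.index("}", pos)
            nodes.append({"t": "Str", "c": tex[pos:end]})
            pos = end
        elif tag == "Space":
            nodes.append({"t": "Space"})
        elif tag in CONTAINERS:
            children, pos = parse_nodes(tex, pos)
            nodes.append({"t": tag, "c": children})
        else:
            raise ValueError(f"unknown command {PREFIX}{tag}")
        if pos >= len(tex) or tex[pos] != "}":
            raise ValueError("missing closing brace")
        pos += 1  # skip the closing brace of the current command
    return nodes, pos


def from_tex(tex):
    """Rebuild the list of AST nodes from a string of \\pandoc* commands."""
    nodes, pos = parse_nodes(tex)
    if pos != len(tex):
        raise ValueError("unbalanced closing brace")
    return nodes


para = {"t": "Para", "c": [
    {"t": "Str", "c": "Hello"},
    {"t": "Space"},
    {"t": "Emph", "c": [{"t": "Str", "c": "world"}]},
]}
tex = to_tex(para)
restored = from_tex(tex)
```

The round trip holds here only because the commands mirror the AST one to one. Any logic that collapses or reinterprets nodes before serialization would make `from_tex` lossy, which is the argument for moving that logic into the TeX layer.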
Currently, a user of the Markdown package is restricted in their choice of syntax extensions to the ones provided by the Lua parser implemented in `markdown.lua`. To provide experimental ground for implementing new syntax extensions, support for the internal abstract syntax tree (AST) format of the jgm/pandoc converter will be added.
Currently, jgm/pandoc can be used to provide conversion from various input formats to Markdown:
The Markdown package can then be used to convert the Markdown document to the TeX abstract syntax tree format (TeX AST) produced by the Markdown package:
This representation can then be typeset. This is useful, but limited to the Markdown syntax extensions supported by our Lua parser.
The plan is to provide a jgm/pandoc Lua writer (see jgm/pandoc issues 4341 and 1541 for further information) that will directly convert the jgm/pandoc AST to the TeX AST, circumventing the Lua parser altogether:
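As a rough illustration of the idea (not the planned Lua writer itself), the following Python sketch walks a document in Pandoc's JSON AST serialization and emits renderer macros directly. The mapping covers only a handful of element types, the choice of macros is an assumption made for this example, and TeX special characters in strings are not escaped.

```python
import json

# Assumed mapping from Pandoc inline element types to renderer macros.
# The macro names follow the \markdownRenderer... naming convention, but
# this particular selection is illustrative only.
INLINE_MACROS = {
    "Emph": r"\markdownRendererEmphasis",
    "Strong": r"\markdownRendererStrongEmphasis",
}

# Assumed separator emitted between consecutive blocks.
BLOCK_SEPARATOR = "\n" + r"\markdownRendererInterblockSeparator" + "\n"


def render_inlines(inlines):
    """Convert a list of Pandoc inline elements into TeX AST text."""
    out = []
    for node in inlines:
        tag = node["t"]
        if tag == "Str":
            out.append(node["c"])  # note: no escaping of TeX special characters
        elif tag == "Space":
            out.append(" ")
        elif tag in INLINE_MACROS:
            out.append(INLINE_MACROS[tag] + "{" + render_inlines(node["c"]) + "}")
        else:
            raise NotImplementedError(f"inline element {tag}")
    return "".join(out)


def render_blocks(blocks):
    """Convert a list of Pandoc block elements into TeX AST text."""
    parts = []
    for block in blocks:
        if block["t"] == "Para":
            parts.append(render_inlines(block["c"]))
        else:
            raise NotImplementedError(f"block element {block['t']}")
    return BLOCK_SEPARATOR.join(parts)


def render_document(doc):
    """Convert a parsed Pandoc JSON document into TeX AST text."""
    return render_blocks(doc["blocks"])


source = """
{"pandoc-api-version": [1, 22], "meta": {}, "blocks": [
  {"t": "Para", "c": [
    {"t": "Str", "c": "Hello"}, {"t": "Space"},
    {"t": "Emph", "c": [{"t": "Str", "c": "world"}]}
  ]}
]}
"""
result = render_document(json.loads(source))
```

In this sketch, supporting another element type amounts to adding one entry to `INLINE_MACROS` (or one branch to `render_blocks`) and defining the matching macro on the TeX side.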
Adding initial support for a new syntax extension already supported by jgm/pandoc will then be as easy as adding a new procedure to the writer and defining the corresponding `\markdownRenderer…` macros. Full support can be added later by extending our Lua parser.
TODOs: