moonshot-nagayama-pj / public-documents

Public collaboration on documentation by the Moonshot Nagayama team.
https://qitf.org/en/moonshot-nagayama/

PDF, epub, etc. generation for public-documents #1

Open auspicacious opened 2 months ago

auspicacious commented 2 months ago

In https://github.com/moonshot-nagayama-pj/testbed-meta/issues/17 we created a public-documents repository.

However, the documents in this repository are written in Markdown and are currently viewable only through GitHub's built-in rendering (unless you render them locally).

It should be possible to set up GitHub Actions to use Pandoc to generate several different formats from this source. PDF and EPUB would be good targets to start with.
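As a rough sketch of the build step such a workflow might call, the script below runs Pandoc once per target format. The function names, the `dist` output directory, and the `timing-regimes.md` file name are hypothetical and not taken from the repository; it assumes `pandoc` is on the PATH and, for PDF, that a LaTeX engine is installed.

```python
import subprocess
from pathlib import Path

# Output formats named in this issue. Pandoc infers the writer from the
# extension of the output file.
TARGET_EXTENSIONS = ("pdf", "epub")


def pandoc_command(source: Path, extension: str, out_dir: Path) -> list[str]:
    """Build the Pandoc invocation that converts one Markdown file."""
    output = out_dir / f"{source.stem}.{extension}"
    return ["pandoc", str(source), "--standalone", "--output", str(output)]


def build_all(source: Path, out_dir: Path) -> list[Path]:
    """Run Pandoc once per target format and return the generated files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for extension in TARGET_EXTENSIONS:
        command = pandoc_command(source, extension, out_dir)
        # check=True makes a failed conversion fail the CI job.
        subprocess.run(command, check=True)
        outputs.append(Path(command[-1]))
    return outputs


if __name__ == "__main__":
    # Hypothetical input file; substitute a real document from the repository.
    for path in build_all(Path("timing-regimes.md"), Path("dist")):
        print(path)
```

Keeping the command construction separate from the `subprocess` call makes it possible to check the invocation without having Pandoc installed.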

Most likely we can upload the generated files to GitHub Releases.

This is also an opportunity to add linting tools to check for consistent styling in documents.

Finally, the IETF has some tools for generating RFCs and Internet-Drafts from Markdown. It would be worthwhile to see what is available there before continuing.

Acceptance criteria

auspicacious commented 2 months ago

I haven't looked into the IETF tools, but I ran the document through Pandoc today to see what would happen.

Pandoc generates a decent PDF, but it won't show images unless they are declared using Markdown image syntax; it doesn't recognize HTML <img> tags. There are some Pandoc-specific extensions to Markdown that make the images look more like figures in a scientific paper. https://pandoc.org/chunkedhtml-demo/8.17-images.html
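One way to handle the existing <img> tags would be a small preprocessing pass that rewrites them as Markdown images before the file reaches Pandoc. The following is only a sketch: the function name is hypothetical, it handles only double-quoted attributes, and it drops attributes other than `src` and `alt` (such as `width`).

```python
import re

# Matches an <img ...> or <img ... /> tag and captures its attribute text.
IMG_TAG = re.compile(r"<img\b([^>]*?)/?>", re.IGNORECASE)
# Matches name="value" pairs; single-quoted or unquoted values are not handled.
ATTRIBUTE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def img_tags_to_markdown(text: str) -> str:
    """Rewrite HTML <img> tags as Markdown image syntax."""

    def replace(match: re.Match) -> str:
        attributes = {
            name.lower(): value
            for name, value in ATTRIBUTE.findall(match.group(1))
        }
        source = attributes.get("src")
        if source is None:
            # Leave tags without a usable src untouched.
            return match.group(0)
        return f"![{attributes.get('alt', '')}]({source})"

    return IMG_TAG.sub(replace, text)


if __name__ == "__main__":
    print(img_tags_to_markdown('<img src="fig/a.png" alt="Timing diagram">'))
```

With Pandoc's `implicit_figures` extension, an image that stands alone in its paragraph and has non-empty alt text is rendered as a figure with the alt text as its caption, so keeping the `alt` value matters for the paper-style figures mentioned above.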

Pandoc can also generate EPUB files from Markdown. It tries to translate the TeX into MathML, but often fails. The documentation claims that there are a couple of ways to generate images from the TeX, but after installing GladTeX into a virtualenv and running Pandoc inside that, I wasn't able to get it to work right away (and webtex just sort of hung).

https://pandoc.org/epub.html

Quite possibly I wasn't using the GladTeX option correctly: https://pandoc.org/MANUAL.html#option--gladtex
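To compare the math rendering options side by side, something like the sketch below could generate one EPUB per option. The `--mathml`, `--webtex`, and `--gladtex` flags are taken from the Pandoc manual; the function name, file names, and output layout are hypothetical. This only builds the command lines and does not address the failures described above; in particular, the GladTeX route may need a separate `gladtex` run that is not shown here.

```python
from pathlib import Path

# Math rendering options documented in the Pandoc manual.
MATH_OPTIONS = ("--mathml", "--webtex", "--gladtex")


def epub_math_variants(source: Path, out_dir: Path) -> dict[str, list[str]]:
    """Build one Pandoc EPUB command per math rendering option."""
    commands = {}
    for option in MATH_OPTIONS:
        # Name each output after its option, e.g. paper-webtex.epub.
        name = option.lstrip("-")
        output = out_dir / f"{source.stem}-{name}.epub"
        commands[option] = ["pandoc", str(source), option, "--output", str(output)]
    return commands


if __name__ == "__main__":
    # Hypothetical input file; print the commands instead of running them.
    for command in epub_math_variants(Path("timing-regimes.md"), Path("dist")).values():
        print(" ".join(command))
```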

auspicacious commented 1 month ago

IETF asks for Internet-Draft diagrams to be submitted in both ASCII art and SVG formats.

The kramdown toolchain recommended for use when writing I-Ds in Markdown supports math in a similar way to Pandoc, but there's not a lot of support for math in I-Ds themselves, especially because plain-text RFCs are still one of the required output versions. According to the kramdown-rfc syntax reference:

"Since 1.0.30, kramdown-rfc converts display math into a crude ASCII art form (using the tex2mail tool, if available). This is probably useful only for a minority of applications. There is also no way to do embedded math (i.e., within a paragraph)."

The timing regimes document is very math-heavy and has a lot of PNG diagrams, so we would either need to do significant work to convert it to I-D format or generate PDFs using Pandoc instead.