godot-rust / book

Documentation and tutorials for gdext, the Rust bindings for Godot 4
Mozilla Public License 2.0

Find or write mdbook plugin to have code blocks in tables #17

Open Bromeon opened 10 months ago

Bromeon commented 10 months ago

Problem

Neither Markdown dialect seems to support fenced code blocks in tables. Something like this (here HTML):

<table>
<tr><th>GDScript</th><th>Rust</th></tr>
<tr>
<td><pre>class_name MyClass
extends Node</pre></td>
<td><pre>#[derive(GodotClass)]
#[class(base=Node)]
struct MyClass;</pre></td>
</tr>
</table>

In mdbook, it's possible to use raw HTML, but it's also very unreadable.
This is the source for the above. Yes, the line breaks need to be like that to avoid trailing/leading empty lines.
```html
<table>
<tr><th>GDScript</th><th>Rust</th></tr>
<tr>
<td><pre>class_name MyClass
extends Node</pre></td>
<td><pre>#[derive(GodotClass)]
#[class(base=Node)]
struct MyClass;</pre></td>
</tr>
</table>
```

Possible solutions

With mdbook, the tag <code class="language-rust"> can be used for syntax highlighting instead of <pre>.

We should research if there is an mdbook plugin achieving something similar, or write our own.

If we do it ourselves, it doesn't necessarily need to be generic and reusable. We could just start to support the above case and take it from there. Even something simple such as

```codetable left="gdscript" right="rust"
class_name MyClass
extends Node
---
#[derive(GodotClass)]
#[class(base=Node)]
struct MyClass;
```

could be transformed into the verbose HTML above.
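
A minimal sketch of that transformation, using only the Rust standard library. The function names are hypothetical, the `<pre><code class="language-…">` layout is an assumption about what the book's syntax highlighting expects, and the header cells simply show the raw language identifiers (`gdscript`, `rust`) rather than display names.

```rust
/// Escapes the characters that are special inside HTML text nodes.
fn escape_html(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Reads `key="value"` (or `key=value`) from an info string such as
/// `codetable left="gdscript" right="rust"`.
fn attribute<'a>(info: &'a str, key: &str) -> Option<&'a str> {
    info.split_whitespace().find_map(|word| {
        let (name, value) = word.split_once('=')?;
        (name == key).then(|| value.trim_matches('"'))
    })
}

/// Turns the info string and body of one `codetable` block into an HTML table.
/// Returns `None` if an attribute or the `---` separator line is missing.
fn render_codetable(info: &str, body: &str) -> Option<String> {
    let left_lang = attribute(info, "left")?;
    let right_lang = attribute(info, "right")?;

    // Everything before the first `---` line is the left cell, the rest the right cell.
    let lines: Vec<&str> = body.lines().collect();
    let separator = lines.iter().position(|line| line.trim() == "---")?;
    let left_code = lines[..separator].join("\n");
    let right_code = lines[separator + 1..].join("\n");

    Some(format!(
        "<table>\n<tr><th>{ll}</th><th>{rl}</th></tr>\n<tr>\n\
         <td><pre><code class=\"language-{ll}\">{lc}</code></pre></td>\n\
         <td><pre><code class=\"language-{rl}\">{rc}</code></pre></td>\n\
         </tr>\n</table>",
        ll = left_lang,
        rl = right_lang,
        lc = escape_html(&left_code),
        rc = escape_html(&right_code),
    ))
}
```

Calling `render_codetable` with the info string and body of the block above would yield a two-column table with one highlighted cell per language. Joining the lines without a trailing newline keeps the cells free of leading and trailing empty lines.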

QueenOfSquiggles commented 8 months ago

I do a lot of custom templating on my personal blog using Shopify. Maybe I can take a look into this? Obviously mdbook is probably a bit more complex than Shopify + HTML, but I can try lol

Bromeon commented 8 months ago

The most idiomatic would probably be mdBook preprocessors (written in Rust).

There is an official example on GitHub (linked above) and also a list of repos.

It might be quite a bit of effort though; so I totally understand if you don't want to do the whole thing 😀
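
To give a rough idea of the core of such a preprocessor, here is a sketch of the text-rewriting step only, without the `mdbook` crate plumbing. The function name `replace_codetables` and the `render` callback are hypothetical. It assumes unindented three-backtick fences, matches the `codetable` tag by prefix, and does not preserve a trailing newline at the end of the chapter.

```rust
/// Replaces every fenced block whose info string starts with `codetable`
/// by whatever `render` returns for it; all other lines pass through unchanged.
/// `render` receives the rest of the info string and the block body.
fn replace_codetables(chapter: &str, render: impl Fn(&str, &str) -> String) -> String {
    let mut output = Vec::new();
    let mut lines = chapter.lines();

    while let Some(line) = lines.next() {
        let Some(info) = line.strip_prefix("```codetable") else {
            output.push(line.to_string());
            continue;
        };

        // Collect the body up to the closing fence (or the end of the chapter).
        let mut body = Vec::new();
        for inner in lines.by_ref() {
            if inner.trim_end() == "```" {
                break;
            }
            body.push(inner);
        }
        output.push(render(info.trim(), &body.join("\n")));
    }

    output.join("\n")
}
```

In an actual preprocessor, this would presumably run over the Markdown content of each chapter, with `render` producing the HTML table.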

QueenOfSquiggles commented 8 months ago

I got started on something simple. I'm gonna do some testing with it. My fairly low IQ approach was to check for an `@code` annotation before any table and then parse it into metadata. Funnily enough, token streaming for this is really similar to what I just made for my game library

Here's the repo I'll be building to https://github.com/QueenOfSquiggles/mdbook-code-table

QueenOfSquiggles commented 8 months ago

This could be an alternative though...probably wouldn't work for multi-line examples https://github.com/phoenixr-codes/mdbook-inline-highlighting

QueenOfSquiggles commented 8 months ago

> It might be quite a bit of effort though; so I totally understand if you don't want to do the whole thing 😀

I wager that as long as something is started, that's better than nothing. I can't imagine godot-rust is the only Rust project that would enjoy code blocks in tables

PgBiel commented 8 months ago

Just wanted to share here what I shared in the Discord server: Typst[^1] is a (rather new) document format which supports code blocks (and, really, any kind of markup) in tables. (Disclaimer: I am a contributor to Typst.)

For example, you could write

````typ
#table(
   columns: 2,
   [*GDScript*], [*Rust*],
   ```gdscript
   class_name etc
   ```,
   ```rs
   fn main() { ... }
   ```
)
````


The Typst compiler[^2] is written in Rust, which makes it even more compatible with this project.
However, **its compiler does not support HTML export yet** (only PDF, PNG/JPEG and SVG), so we can't use the Typst compiler to take a piece of Typst inserted in the mdbook (for example) and convert it to the equivalent HTML table.

This leads us to pandoc[^3], a tool to convert between document formats. **It supports Typst -> HTML.** However, it only supports a subset of the Typst language. Still, it's enough for our needs, as pandoc produces the following HTML from the Typst code I sent above (with e.g. a command call to `pandoc --from typst --to html5`):

```html
<table>
<tbody>
<tr class="odd">
<td><p><strong>GDScript</strong></p></td>
<td><p><strong>Rust</strong></p></td>
</tr>
<tr class="even">
<td><pre class="gdscript"><code>class_name etc
   </code></pre></td>
<td><pre class="rs"><code>fn main() { ... }
   </code></pre></td>
</tr>
</tbody>
</table>
```

(The only problem is that the first row in the generated HTML is not a header row, but that should be easy to deal with through simple string substitution in the generated HTML.)
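
A sketch of what that substitution could look like in Rust. The function name is hypothetical, and it assumes output shaped like the pandoc HTML above, where the cells of the first row are plain `<td>` tags without attributes. Note that it leaves the row inside `<tbody>` instead of moving it into a `<thead>`.

```rust
/// Rewrites the cells of the first `<tr>…</tr>` from `<td>` to `<th>`.
/// Returns the input unchanged if no complete row is found.
fn promote_first_row(html: &str) -> String {
    let Some(start) = html.find("<tr") else {
        return html.to_string();
    };
    let Some(len) = html[start..].find("</tr>") else {
        return html.to_string();
    };
    let end = start + len;

    // Only the slice covering the first row is touched; later rows keep their <td> cells.
    let header = html[start..end]
        .replace("<td>", "<th>")
        .replace("</td>", "</th>");

    format!("{}{}{}", &html[..start], header, &html[end..])
}
```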

Therefore, one possibility for us is to use pandoc (a tool written in Haskell, so it's better used as a binary / via CLI) to convert Typst codeblocks in the mdbook to HTML, which would let us use complex tables and other interesting features in Typst.

However, of course, that can be a bad idea for now as not many people know Typst, even though its base syntax is fairly close to Markdown and it's probably as close as we could get to a fully flexible format in this regard. Additionally, not all Typst code is handled well by pandoc, even though it has been improving.

Therefore, to show that we can avoid using any kind of Typst syntax at all, I wrote down a POC of Typst code which will automatically convert the sample codetable code block proposed at the start of this issue to a Typst table() call, which pandoc will properly convert to an HTML table - you can see it in action at this Typst Web App link:

Full Typst code (with the sample codeblock at the bottom - syntax for codeblocks in Typst is the same as in Markdown)

````typ
// Maps a lang (e.g. "gdscript") to its name (e.g. "GDScript")
#let langname(lang) = {
  let langnames = (gdscript: "GDScript", rs: "Rust", rust: "Rust")
  if lang in langnames {
    langnames.at(lang)
  } else {
    lang
  }
}

// Creates a code table.
// Left is the leftmost lang (e.g. 'gdscript') to be used in the codeblock.
// Right is the rightmost lang (e.g. 'rust') to be used in the codeblock.
// 'code' contains code for each cell separated by lines of -------.
#let codetable(left, right, code) = {
  assert(type(code) == type([]) and code.func() == raw, message: "Please ensure the last argument is a code block.")
  assert(type(left) == type(""), message: "First argument must be a string indicating the leftmost language.")
  assert(type(right) == type(""), message: "Second argument must be a string indicating the rightmost language.")
  let langs = (left, right)

  // Split
  // "a
  // -----
  // b"
  // into
  // ("a", "b")
  let codeblocks = code.text.split(regex("-+\n?"))

  table(
    columns: 2,
    // Table header: langnames
    [*#langname(left)*], [*#langname(right)*],
    // Map each codeblock (string) to a raw block with the same lang as the original one
    ..codeblocks.enumerate().map(((i, code)) => raw(code, block: true, lang: langs.at(calc.rem(i, langs.len())))),
  )
}

// Replace all 'codetable' codeblocks with the equivalent codetable() call
#show raw.where(block: true, lang: "codetable"): it => {
  set text(font: "Linux Libertine", 1.25em)
  let matches = it.text.match(regex("left=([\S]+)\s+right=([\S]+) *\n((.\n?)*)"))
  assert(matches != none, message: "Codeblock didn't contain left=lang right=lang.")
  let (left, right, rest, ..) = matches.captures
  codetable(left, right, raw(rest, block: true))
}

// Here we go!
// --- PASTE THE CODEBLOCK BELOW THIS COMMENT ---
```codetable left=gdscript right=rs
class_name Something
extends bruh
-----------
fn main() {
}
```
````

Giving the Typst code above to pandoc produces:

<table>
<tbody>
<tr class="odd">
<td><p><strong>GDScript</strong></p></td>
<td><p><strong>Rust</strong></p></td>
</tr>
<tr class="even">
<td><pre class="gdscript"><code>class_name Something
extends bruh
</code></pre></td>
<td><pre class="rs"><code>fn main() {
}
</code></pre></td>
</tr>
</tbody>
</table>

This means that we can make a preprocessor which does the following:

  1. Takes a codetable codeblock like the one proposed at the top of the issue.
  2. Pastes it under the Typst code template I sent above. (The Typst code template can also be in a separate .typ file for organization purposes, but I'm presenting simplified instructions here.)
  3. Runs pandoc to convert to HTML.
  4. Replaces the first row (<tr><td>...</td></tr>) with a header (<tr><th>...</th></tr>) in the generated HTML. (Note: the <strong>...</strong> is something I added intentionally and can be easily removed.)
  5. Replaces the entire codetable codeblock with the output of step 4.
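
As a rough sketch of steps 2 and 3, the pandoc call could be driven from Rust via `std::process::Command`. All function names here are hypothetical. `run_pandoc` assumes a `pandoc` binary with Typst input support on the `PATH` and uses the flags quoted earlier. The conversion step is passed in as a closure so the flow can be exercised without pandoc installed. Steps 4 and 5 are left out.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Step 2: appends one `codetable` block to the Typst template.
fn build_typst_input(template: &str, codetable_block: &str) -> String {
    format!("{}\n{}\n", template.trim_end(), codetable_block.trim())
}

/// Step 3: pipes Typst source through `pandoc --from typst --to html5`
/// and returns the HTML that pandoc writes to stdout.
fn run_pandoc(typst_source: &str) -> io::Result<String> {
    let mut child = Command::new("pandoc")
        .args(["--from", "typst", "--to", "html5"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    // Write the input; the temporary handle is dropped right after, so pandoc sees end-of-file.
    child
        .stdin
        .take()
        .expect("stdin was piped")
        .write_all(typst_source.as_bytes())?;

    let output = child.wait_with_output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "pandoc exited with an error"));
    }
    String::from_utf8(output.stdout).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// Steps 2 and 3 for one block; `convert` would be `run_pandoc` in practice.
fn codetable_to_html(
    template: &str,
    block: &str,
    convert: impl Fn(&str) -> io::Result<String>,
) -> io::Result<String> {
    convert(&build_typst_input(template, block))
}
```

A call such as `codetable_to_html(template, block, run_pandoc)` would then return the HTML for one block, ready for the header fix-up and for substitution back into the chapter.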

(Step 4 will eventually be unnecessary: at some point Typst will have header rows in tables as well, courtesy of yours truly, but that isn't the case ATM.)

(And, of course, step 3 will eventually be replaced by a call to the Typst compiler, in Rust, once it supports HTML export; however, that will still take at least a few months.)

We can take similar steps whenever we hit further markdown limitations. How does this sound?

[^1]: More info about Typst: https://typst.app

[^2]: The Typst compiler's source code: https://github.com/typst/typst