wooorm / markdown-rs

CommonMark compliant markdown parser in Rust with ASTs and extensions
https://docs.rs/markdown/1.0.0-alpha.21/markdown/
MIT License

Markdoc for Markdown-RS #133

Closed rorychatterton closed 1 month ago

rorychatterton commented 2 months ago

Is there interest in adding Markdoc-like features to markdown-rs via an extension?

Context: I'm working on a headless CMS project that will consume CommonMark content. We want to be able to offer additional tags/attributes beyond the core CommonMark spec, but do not want to go down the MDX path, as we don’t want our customers to be able to write abstract JSX inline.

Why:

Why Not:

Does it make sense to add a Markdoc extension to Markdown-rs?

I'm also open to hearing alternative options to solve the problem in a different way.

Writing Transformers and ASTs isn't my forte, however this is something I am willing to look into contributing (it'll take me a while to turn around), or alternatively, sponsoring development work at a later stage once it's on our critical path & we have the funds.

wooorm commented 2 months ago

Hi Rory!

We want to […] offer additional tags/attributes […] but do not want […] JSX […]

Why? Tags/attributes is exactly what JSX also is? Whether <custom-element key=value> or <Jsx key="value"> or :directive[key=value] or {% markdoc key=value /%}.

Markdoc-like

As I understand it markdoc is several things. A framework too, which is out of scope for this project. Then, it seems to be very close to a template language. Why not use a template language?

Allows for extending commonmark with additional tags and attributes while disallowing abstract code logic inline with Markdown, allowing for the documentation to remain 'declarative'

As for “abstract code”, I doubt it? I read that there are functions and variables in markdoc too?

It has a good community around it.

I wonder about this, Stripe is very popular so when they share something, lots of people get excited. But I always wonder with how long these things stay maintained. People leave. Companies stop funding things…

Commonmark Directives might make this redundant (not sure if there’s an ETA, and I don’t think it sorts looping and conditionals)

They’re not supported here yet. Just need someone to make them: https://github.com/wooorm/markdown-rs/issues/57. But they’re in micromark and markdown-it and GH’s syntax highlighter (which I maintain). I think that getting them supported in tools is enough: similar to how GFM is ubiquitous.

As for looping/conditionals: they can be implemented with anything: markdoc/directives/MDX/custom-elements in HTML.

While Parsing & Transforming are separate steps in the existing Markdoc Javascript parser, it is incompatible with MDast, […]

Why? If it has a grammar, like the simple schema you shared, then it could work just like directives, and be nodes in the AST?
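To make the "nodes in the AST" idea concrete, here is a minimal sketch in plain Rust. Everything in it is hypothetical: `TagNode` and `parse_tag` are illustration names, not part of markdown-rs or Markdoc, and the grammar is deliberately reduced to a single self-closing tag with unquoted or space-free quoted values.

```rust
/// Hypothetical AST node for a Markdoc-style tag; not a markdown-rs type.
#[derive(Debug, PartialEq)]
pub struct TagNode {
    pub name: String,
    pub attributes: Vec<(String, String)>,
}

/// Parses one self-closing tag such as `{% callout type=note /%}`.
/// Returns `None` when the input does not match this simplified grammar.
/// Assumption: attribute values contain no whitespace.
pub fn parse_tag(input: &str) -> Option<TagNode> {
    // Strip the opening `{%` and the self-closing `/%}` delimiters.
    let inner = input.trim().strip_prefix("{%")?.strip_suffix("/%}")?;
    let mut parts = inner.split_whitespace();

    // The first word is the tag name; only ASCII alphanumerics and `-` are accepted.
    let name = parts.next()?.to_string();
    if !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '-') {
        return None;
    }

    // Every remaining word must be a `key=value` pair.
    let mut attributes = Vec::new();
    for part in parts {
        let (key, value) = part.split_once('=')?;
        attributes.push((key.to_string(), value.trim_matches('"').to_string()));
    }
    Some(TagNode { name, attributes })
}
```

A real extension would tokenize inside the markdown parser rather than on an isolated string, but the resulting node could look much like a directive node: a name plus a list of attributes.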

The existing Markdoc AST code is pretty confusing, because it effectively parses twice - once with markdown-it, then again with its own pegjs definitions.

Interesting—and sad—to hear. Do you know why that is? I can understand it if markdoc was a template language. But they say they aren’t?


One more point for “why not”: markdoc (and mdxjs-rs, and the @mdx-js org) are very focussed on JavaScript. They embed JavaScript. You can pass functions / components in JavaScript. Here we are in Rust. We want people to pass “components” in Rust. So maybe what we embed, can also be more like Rust, instead of JavaScript?

Luckily, the tools for MDX are actually made for that. Someone can hook in, to parse Rust. Or anything really. In both markdown-rs or in micromark.

I'm also open to hearing alternative options to solve the problem in a different way.

Answering this, has more to do with: why do you want this? What content do you already have? What will you have? Who are the authors? What do you use now? I’d typically push towards directives. I imagine them being supported by more things for longer. Markdoc to me seems too much tied to one product made by Stripe.

Writing Transformers and ASTs isn't my forte, however this is something I am willing to look into contributing (it'll take me a while to turn around), or alternatively, sponsoring development work at a later stage once it's on our critical path & we have the funds.

ASTs are amazing! It’s powerful knowledge that carries through to so many other things. So, would recommend getting into that when you ever get the chance. And, who is “our”? When would something be on a critical path? I’m definitely up for freelancing to do some of the open issues here on markdown-rs!

rorychatterton commented 2 months ago

Hey Wooorm,

I really appreciate your response!

Apologies about the intermingling of 'MDX', and 'the MDX implementation for Javascript'. It's hard to separate the two in my mind, and understand that it could be different in Rust.

As I understand it markdoc is several things. A framework too, which is out of scope for this project.

It says it's a framework, but it's really just a templating language with marketecture spin. It's effectively just what is in the spec.

Why? Tags/attributes is exactly what JSX also is?

It's not the syntactic sugar for tags (they all have them), but rather the added context of intermingling JSX. Both can define custom tags/attributes, loop etc., but Markdoc is more of a restrictive DSL, while JSX is more like embedding JavaScript inside the codebase.

In essence, it's something like:

| Feature | Markdoc | MDX |
| --- | --- | --- |
| Registering Components | Passed to the renderer before execution | Imported in the MDX file |
| Restricting Components | Registered at build time | Optionally registered at build time |
| JavaScript Execution | No direct JavaScript execution in the document | Allows embedding and execution of arbitrary JavaScript (e.g. const foo = naughty_function()). I suspect, depending on the rendering engine and its sanitisation, it might be possible to circumvent the component import restrictions using JavaScript dynamic imports, but that's just a hunch and unvalidated |
| Separation of Concerns | Strict separation between content and logic | Blends content and logic more closely |

In the case I'm thinking through, I basically want to be able to add additional tags to render thingies, for the customers writing the markdown content, without allowing them to define their own.
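As a rough sketch of that constraint (the host registers the tags, authors can only use them), something like the following would do. All names here (`TagRegistry`, `RenderFn`, `render_callout`) are hypothetical and not taken from markdown-rs, Markdoc, or MDX; it only assumes that tags have already been parsed into a name and a list of attributes.

```rust
use std::collections::HashMap;

/// Hypothetical renderer signature: attributes in, HTML out.
pub type RenderFn = fn(&[(String, String)]) -> String;

/// Registry of the tags the platform allows. It is filled by the host
/// application, so content authors have no way to add entries.
#[derive(Default)]
pub struct TagRegistry {
    renderers: HashMap<String, RenderFn>,
}

impl TagRegistry {
    /// Called by the host at startup or build time.
    pub fn register(&mut self, name: &str, render: RenderFn) {
        self.renderers.insert(name.to_string(), render);
    }

    /// Renders a tag found in customer content, rejecting unregistered names.
    pub fn render(&self, name: &str, attributes: &[(String, String)]) -> Result<String, String> {
        match self.renderers.get(name) {
            Some(render) => Ok(render(attributes)),
            None => Err(format!("unknown tag: {name}")),
        }
    }
}

/// Example renderer. The `type` attribute is mapped onto a fixed set of
/// values, so author input is never copied into the HTML verbatim.
pub fn render_callout(attributes: &[(String, String)]) -> String {
    let requested = attributes
        .iter()
        .find(|(key, _)| key.as_str() == "type")
        .map(|(_, value)| value.as_str());
    let kind = match requested {
        Some("warning") => "warning",
        _ => "note",
    };
    format!("<aside class=\"callout callout-{kind}\"></aside>")
}
```

With `render_callout` registered under `callout`, a document can use that tag, while any other tag name comes back as an error instead of being executed or passed through.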

I read that there are functions and variables in markdoc too?

They're a bit more limited. You write the functions externally and pass them in. You can't write your own.

The variables are basically just for simple loops, if statements, and repeated content. E.g. you might pass in some metadata such as publish date, and use it elsewhere.

You can't, for example, curl some content and then execute it.

Community...

Fair call! I think we've all been burned by it in the past.

Taking a step back, I'm not wedded to Markdoc being the answer. Really, my concern has a lot more to do with keeping 'code' separate from the documentation itself.

Directives & Looping

Looks like I'll need to take a closer look. My bad!

mDast Compatibility

I thought mDast had to stay in line with the commonmark spec, and accepted extensions. I didn't think you were 'allowed to' extend it with other formats, so that it could be cleanly accepted between implementations.

If that isn't a constraint, cool.

looped rendering: Interesting—and sad—to hear. Do you know why that is?

Honestly no idea, I can only speculate. I'm guessing

shrugs

Answering this, has more to do with: why do you want this? What content do you already have? What will you have? Who are the authors? What do you use now?

I've mostly answered this above, but to summarise:

What content do you already have? What will you have? Who are the authors? What do you use now?

Without going too far into what we're building: I'm taking a space that is predominantly loosely managed Word documents, PowerPoints, Confluence sites, and Markdown files, and building a structured content platform that is tailored towards a specific, technical domain.

Calling it a CMS is a bit misleading (but was the closest example I could give) - it manages and renders content, but it does a whole bunch of other stuff to source, aggregate, and present content for the customers.

ASTs are amazing!

I hear you, it's a gap I've wanted to close for a while.

And, who is “our”? When would something be on a critical path?

Small Australian company. Only a couple of us, self-bootstrapped. We're spending 50% of our time consulting, 50% on a moonshot product.

When would something be on a critical path?

Maybe 6 months from now? Our books are pretty full consulting for now and would need to make some initial cash.

I’m definitely up for freelancing to do some of the open issues here on markdown-rs!

Yeah 100%. Totally get these things aren't free.

You've given me a heap to think about.

For now, I'm going to park the markdoc idea and investigate MDX with Rust & Restrictions, or adding Directives.

I don't have the money for it now, but I'll loop back in a few months. Maybe an opportunity to:

I grabbed your email from your website and will drop you a line a bit later.

I'll leave this thread open for a couple more days in case anybody else wants to contribute or feels strongly about it, otherwise I will close it.