wooorm / markdown-rs

CommonMark compliant markdown parser in Rust with ASTs and extensions
https://docs.rs/markdown/1.0.0-alpha.21/markdown/
MIT License

`markdown::to_hast` support #27

Open pd4d10 opened 2 years ago

pd4d10 commented 2 years ago

Is support for this planned for a future release?

wooorm commented 2 years ago

Feel free to take the code for that now from here: https://github.com/wooorm/mdxjs-rs/blob/main/src/mdast_util_to_hast.rs.

I don’t really know how to best expose those parts. And you will likely need more parts (hast-util-to-html). I don’t think they make a lot of sense here. Depends on how people want to use these things!

kwonoj commented 1 year ago

Just FYI, I also need similar API exposure from mdxjs-rs, though I'm not yet sure of the final shape of the surface (still experimenting).

@wooorm Would an upstream PR to mdxjs-rs that exposes a few interfaces as public be acceptable? Or is the recommendation to copy code as needed?

wooorm commented 1 year ago

Can you tell me more about what you are working on?

My previous comment still stands: you will likely need more things. I’d probably want to expose several unist, mdast, and hast utilities, each in a different monorepo for their AST?

Keeping it hidden for now means that some of the AST can still be improved. I’d particularly first like to:

After those two, I think it should be a stable API that can be exposed either from the current crates or later from separate packages.

kwonoj commented 1 year ago

Sure, the work I'm currently doing is integrating MDX support into turbopack (https://github.com/vercel/turbo).

I'm still in the process of figuring out which pieces are necessary, so I may be wrong, but so far I came to conclude it requires 2-stage process, with parse(input, option) -> AST and transform(AST, option) as separate steps. This requires some internal access to the existing mdxjs-rs to split things up and create separate API surfaces.

Again, please note this is not the final form, which is why I mentioned that I'm not yet sure of the final shape of the API surface (still experimenting). I would like to get more clarity before making any actual upstream effort.
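The two-stage split described above could look roughly like the following. This is only a minimal sketch: `ParseOptions`, `TransformOptions`, `Ast`, and both function bodies are hypothetical stand-ins invented for illustration, and are not APIs of mdxjs-rs or turbopack.

```rust
// Hypothetical types for illustration only; none of these are
// mdxjs-rs or turbopack APIs.

/// Options consumed by the parse stage (assumed shape).
#[derive(Default)]
pub struct ParseOptions {
    /// Whether to trim surrounding whitespace from each paragraph.
    pub trim: bool,
}

/// Options consumed by the transform stage (assumed shape).
#[derive(Default)]
pub struct TransformOptions {
    /// Tag name used to wrap each paragraph.
    pub paragraph_tag: String,
}

/// A toy syntax tree: a document is a list of paragraphs.
#[derive(Debug, PartialEq)]
pub struct Ast {
    pub paragraphs: Vec<String>,
}

/// Stage 1: input text -> AST. Paragraphs are separated by blank lines.
pub fn parse(input: &str, options: &ParseOptions) -> Ast {
    let paragraphs = input
        .split("\n\n")
        .map(|p| if options.trim { p.trim() } else { p })
        .filter(|p| !p.is_empty())
        .map(str::to_string)
        .collect();
    Ast { paragraphs }
}

/// Stage 2: AST -> output. Takes the tree by reference, so the caller
/// can keep (or cache) the parsed result and transform it later.
pub fn transform(ast: &Ast, options: &TransformOptions) -> String {
    ast.paragraphs
        .iter()
        .map(|p| format!("<{0}>{1}</{0}>", options.paragraph_tag, p))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because `transform` only borrows the `Ast`, a bundler could call `parse` once, hold on to the result, and run `transform` at a later point.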

wooorm commented 1 year ago

Cool, interesting!

> so far I came to conclude it requires 2-stage

You mean that turbopack enforces this, right? So it has several hooks for plugins (likely also serialize/stringify?).

kwonoj commented 1 year ago

> Do you have a link to some docs on these, or some type definitions perhaps?

AFAIK there are no clear docs around this, unfortunately. I'm reading the existing implementations that support other assets (JS, CSS). For example, there is a `fn parse` definition for ECMAScript support at https://github.com/vercel/turbo/blob/main/crates/turbopack-ecmascript/src/parse.rs#L129 that returns the parsed result, which is then transformed lazily elsewhere.

> the big question for these signatures is: what does transform want to do?

This is a great question and I do not have a concrete answer yet. I'd like to play with my WIP first to see what it looks like.

wooorm commented 1 year ago

For a first MVP, I’d go with just a parse hook:

For a second MVP, I’d use a fork that specifies a parse hook with: https://github.com/wooorm/mdxjs-rs/blob/d0a15b116d6a09e98149188eb190537a2faeac9e/src/lib.rs#L113-L114. A transform hook with: https://github.com/wooorm/mdxjs-rs/blob/d0a15b116d6a09e98149188eb190537a2faeac9e/src/lib.rs#L115-L122. And a serialize hook with: https://github.com/wooorm/mdxjs-rs/blob/d0a15b116d6a09e98149188eb190537a2faeac9e/src/lib.rs#L124.

Assuming we have MDX plugins soon, you could hopefully access those from options in the transform hook, and apply some of them on the markdown AST, some on the HTML AST, and some at the end in the JS AST.

I’d be open to splitting these steps into 3 functions, and exposing those from mdxjs-rs later if your MVP 2 works?
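To make the three-function idea concrete, here is a rough sketch of parse, transform, and serialize hooks, with plugins read from options and applied at the markdown, HTML, and JS stages. Everything in it is an assumption made for illustration: the `MdAst`, `HtmlAst`, `JsAst`, and `Options` types and all three function bodies are toy stand-ins, not code from mdxjs-rs.

```rust
// Hypothetical sketch: none of these types or functions exist in mdxjs-rs.
// Each "AST" is a stand-in holding a list of strings.

pub struct MdAst(pub Vec<String>); // markdown AST stand-in
pub struct HtmlAst(pub Vec<String>); // HTML AST stand-in
pub struct JsAst(pub Vec<String>); // JS AST stand-in

/// Plugins grouped by the tree they operate on (assumed shape).
#[derive(Default)]
pub struct Options {
    pub md_plugins: Vec<fn(&mut MdAst)>,
    pub html_plugins: Vec<fn(&mut HtmlAst)>,
    pub js_plugins: Vec<fn(&mut JsAst)>,
}

/// Parse hook: source text -> markdown AST (one node per non-empty line).
pub fn parse(input: &str) -> MdAst {
    MdAst(
        input
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(str::to_string)
            .collect(),
    )
}

/// Transform hook: markdown AST -> HTML AST -> JS AST,
/// running the matching plugins at each stage.
pub fn transform(mut md: MdAst, options: &Options) -> JsAst {
    for plugin in &options.md_plugins {
        plugin(&mut md);
    }
    // Toy conversion: wrap every node in a paragraph element.
    let mut html = HtmlAst(md.0.iter().map(|t| format!("<p>{}</p>", t)).collect());
    for plugin in &options.html_plugins {
        plugin(&mut html);
    }
    // Toy conversion: each HTML node becomes a quoted string literal.
    let mut js = JsAst(html.0.iter().map(|h| format!("{:?}", h)).collect());
    for plugin in &options.js_plugins {
        plugin(&mut js);
    }
    js
}

/// Serialize hook: JS AST -> JavaScript source text.
pub fn serialize(js: &JsAst) -> String {
    format!("export default [{}];", js.0.join(", "))
}
```

A caller would chain the hooks as `serialize(&transform(parse(source), &options))`, while a bundler could instead invoke each hook at a different point in its own pipeline.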