yuin / goldmark

:trophy: A markdown parser written in Go. Easy to extend, standard(CommonMark) compliant, well structured.
MIT License

Question: How to modify and write back AST to original markdown #150

Closed (g-pavlov closed this 4 years ago)

g-pavlov commented 4 years ago

I'm looking for a hint here, not submitting an issue or feature request yet. I know that goldmark's original goal is to produce HTML, but I am considering it for modifying the contents of markdown files. For example, I'd like to modify link destinations in a document. I can easily use a Walker function as the callback for AST traversal and pick out links and raw HTML anchors. What I'm struggling with is how to:

  1. make a change (e.g. update a link node's Destination)
  2. write the AST back out as markdown, i.e. do something like a render for markdown rather than for HTML.

Ideally, I want to end up with minimal modifications to the original document, e.g. just the link strings in my case, preserving everything else as-is. But I could live with a slightly modified variant if necessary too.

The closest I could get is this gist. With this test input:

"This is `inline element`\n\n*empahsized text*\n## Heading 2\n[GitHub](\"https://github.com\") ![ImgTitle](\"https://somehwere.org/someurl\")\n ## Another heading 2\n <p><a href=\"https://github.com\">alabala</a> <img src=\"../images/logo.png\"></p>"

I get this output:

This is `inline element`*empahsized text*Heading 2[GitHub]("https://github.com") ![ImgTitle]("https://somehwere.org/someurl")Another heading 2 <p><a href="https://github.com">alabala</a> <img src="../images/logo.png"></p>

However, changes to Destination are not reflected and newlines are lost.

Do you have suggestions on how best to approach this, if it's possible at all?

yuin commented 4 years ago

It is hard to write back the original markdown.

An AST is an abstract syntax tree, so it does not keep all the concrete information about the source text.
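
A toy illustration of that point, not goldmark's actual node types: the hypothetical `emphasis` node below records the emphasized text but not which delimiter the author typed, so two different sources parse to the same tree and render back identically.

```go
package main

import (
	"fmt"
	"strings"
)

// emphasis is a toy AST node. It stores the emphasized text but not the
// delimiter that surrounded it in the source.
type emphasis struct{ text string }

// parseEmphasis accepts "*text*" or "_text_" and returns the same node shape
// for both spellings.
func parseEmphasis(src string) (emphasis, bool) {
	for _, d := range []string{"*", "_"} {
		if len(src) >= 2 && strings.HasPrefix(src, d) && strings.HasSuffix(src, d) {
			return emphasis{text: src[1 : len(src)-1]}, true
		}
	}
	return emphasis{}, false
}

// render writes the node back with one fixed delimiter, because the node
// carries no record of the original one.
func render(e emphasis) string { return "*" + e.text + "*" }

func main() {
	for _, src := range []string{"*hello*", "_hello_"} {
		node, ok := parseEmphasis(src)
		if !ok {
			continue
		}
		fmt.Printf("%q -> %q\n", src, render(node))
	}
	// "*hello*" -> "*hello*"
	// "_hello_" -> "*hello*"
}
```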

But you can write the AST back out as formatted markdown: https://github.com/Kunde21/markdownfmt

g-pavlov commented 4 years ago

Thanks! It seems that using markdownfmt will do for now, as long as a precise match with the original markdown is not a hard requirement. Maybe some (abstract) reference to the original format could help pick the right writer for a node and get closer to the original markdown. I hope I can propose something more specific in the future to discuss. And thumbs up for the good work on goldmark!