mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License

Fix anchors containing block elements #419

Open Ndpnt opened 2 years ago

Ndpnt commented 2 years ago

Fixes https://github.com/mixmark-io/turndown/issues/409

aaronmfparr commented 1 year ago

This is a very simple fix and solves my problems, so I am all for a merge here.

What's the hold up?

domchristie commented 1 year ago

Unfortunately this change doesn't result in the correct markdown. For example, the following HTML:

<a href="http://example.com/heading"><h1>heading</h1></a>

gets converted to:

[heading
=======](http://example.com/heading)

which, when converted back to HTML, becomes:

<p><a href="http://example.com/heading">heading
=======</a></p>

This is quite a complex issue, and it becomes trickier when multiple block elements are wrapped in an <a> (see https://github.com/mixmark-io/turndown/issues/313#issuecomment-605149985). Ideally the HTML should be changed so that the <a> only wraps text. So for now your best bet is to mutate the HTML to fix these issues before passing it into Turndown.
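
As a rough illustration of the "mutate the HTML first" suggestion, here is a minimal sketch that handles only the simplest case: an <a> wrapping exactly one heading, paragraph, or div. The helper name `hoistSingleBlockOutOfAnchor` and the list of block tags are assumptions made for this example and are not part of Turndown. It uses a regular expression on the HTML string, which is brittle, so a real fix would more likely walk a parsed DOM. Anchors wrapping several block elements are deliberately left untouched.

```javascript
// Hypothetical pre-processing helper (not part of Turndown).
// Rewrites <a ...><h1>text</h1></a> as <h1><a ...>text</a></h1>
// so that the link wraps only inline content.
// Assumption: only these tags are treated as block elements here.
const BLOCK = '(?:h[1-6]|p|div)';

const ANCHOR_AROUND_ONE_BLOCK = new RegExp(
  '<a(\\s[^>]*)?>\\s*' +                 // opening <a>, attributes in group 1
  `<(${BLOCK})(\\s[^>]*)?>` +            // opening block tag (group 2) and its attributes (group 3)
  // inner content (group 4): must not contain </a> or another block opening tag
  `((?:(?!</a\\b|<${BLOCK}\\b)[\\s\\S])*?)` +
  '</\\2>\\s*</a>',                      // matching block close, then </a>
  'gi'
);

function hoistSingleBlockOutOfAnchor(html) {
  return html.replace(
    ANCHOR_AROUND_ONE_BLOCK,
    (match, anchorAttrs = '', tag, blockAttrs = '', inner) =>
      `<${tag}${blockAttrs}><a${anchorAttrs}>${inner}</a></${tag}>`
  );
}

// The example from this thread:
const input = '<a href="http://example.com/heading"><h1>heading</h1></a>';
console.log(hoistSingleBlockOutOfAnchor(input));
// prints: <h1><a href="http://example.com/heading">heading</a></h1>

// An anchor wrapping multiple blocks is left as-is by this sketch:
const multi = '<a href="#"><h1>a</h1><p>b</p></a>';
console.log(hoistSingleBlockOutOfAnchor(multi) === multi);
// prints: true
```

The rewritten string can then be passed to `new TurndownService().turndown(...)` in place of the original HTML, so that Turndown receives a heading containing a link rather than a link containing a heading.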

aaronmfparr commented 1 year ago

Thanks for responding to this. Hopefully we'll see some movement.