mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License

heading inside anchor renders to invalid markdown #409

Open da1z opened 2 years ago

da1z commented 2 years ago
    <a href="some url">
        <h3><strong>My heading</strong></h3>
    </a>

This is converted to Markdown as:

[

### **My heading**

](some url)

which is not really valid. Is there a workaround to make it convert to something like this?

### [**My heading**](some url)

uuf6429 commented 2 years ago

I have the same problem, but with divs instead of headings.

uuf6429 commented 2 years ago

My solution to this (since I'm preprocessing the HTML anyway because of #415):

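  // Unwrap block-level wrappers (div, h1-h3) that sit between a link and its image.
  document.querySelectorAll('a > div > img, a > h1 > img, a > h2 > img, a > h3 > img').forEach(img => {
    const blockElem = img.parentNode;
    // Replace the wrapper with its own children so the <a> holds only inline content.
    blockElem.replaceWith(...Array.from(blockElem.childNodes));
  });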

This assumes you are using something like jsdom or domino to preprocess the HTML before passing it to Turndown.
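
If preprocessing the DOM isn't an option, another possibility is to fix things up on the Markdown side. The following is only a minimal sketch, not something Turndown ships: `linkWithHoistedHeading` is a hypothetical helper that takes the already-converted content of an anchor plus its `href`, and moves a leading heading marker outside the link brackets. It assumes the anchor wraps a single ATX-style heading or plain text; anything more complex is simply collapsed onto one line.

```javascript
// Hypothetical helper (not part of Turndown): builds link Markdown from the
// already-converted content of an <a>, hoisting a heading marker such as
// "### " outside the square brackets.
function linkWithHoistedHeading(content, href) {
  const text = content.trim();

  // Case 1: the whole content is a single ATX heading, e.g. "### **Title**".
  const match = /^(#{1,6})[ \t]+([^\n]+)$/.exec(text);
  if (match) {
    const [, hashes, title] = match;
    // Keep the heading a block of its own by surrounding it with blank lines.
    return `\n\n${hashes} [${title.trim()}](${href})\n\n`;
  }

  // Case 2: anything else (paragraph, div, ...). Collapse line breaks so the
  // link text stays on a single line.
  return `[${text.replace(/\s*\n\s*/g, " ")}](${href})`;
}

// Content as it would look for <a><h3><strong>My heading</strong></h3></a>.
console.log(JSON.stringify(linkWithHoistedHeading("\n\n### **My heading**\n\n", "some url")));

// Content as it would look for a paragraph or div inside an anchor.
console.log(linkWithHoistedHeading("\n\nSome paragraph\n\n", "some url"));
```

A function like this could presumably be called from the `replacement` callback of a custom rule registered with `turndownService.addRule`, using a `filter` that matches `a` elements. That wiring is untested here, so treat it as a starting point.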

Ndpnt commented 2 years ago

Same problem, but with paragraphs inside anchors.