mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License
8.52k stars 864 forks

Keep tags on link as HTML #454

Closed: alexkaysersnow closed this issue 4 months ago

alexkaysersnow commented 6 months ago

I'm trying to solve a case where we want to add an HTML tag inside the label of a Markdown link and keep it as an HTML tag, but Turndown always translates the < to &lt;, for example.

How to keep the html tag in the label of the link?

For this text:

<a href='https://www.google.com' class='LinkOut'>Go to Google</a>

The expected output is:

[Go to Google <Icon icon="fa-external-link" size="sm" />](https://www.google.com)

I've tried adding the tag to the keep instruction, but it doesn't accept custom tags:

TurndownService.keep(['Icon']) --> Type '"Icon"' is not assignable to type 'keyof HTMLElementTagNameMap'.ts(2322)

I've found a closed issue here pointing to create a rule to keep, but it was unsuccessful as well.

Here is the rule:

TurndownService.addRule('external-link', {
        filter: function (node) {
            let aclass = (node as HTMLElement).getAttribute("class")
            return node.nodeName === 'A' && !!aclass?.includes('LinkOut')
        },
        replacement: function (content, node) {
            let url = (node as HTMLElement).getAttribute("href")
            if (!url?.startsWith('http'))
                return content
            return `[${content} <Icon icon="fa-external-link" size="sm" />](${url})`

            // also didn't work
            // return `[${content} &lt;Icon icon="fa-external-link" size="sm" /&gt;](${url})`
        },
    });

martincizek commented 4 months ago

I've found a closed issue here pointing to create a rule to keep, but it was unsuccessful as well.

That's unrelated. You just want to create a custom rule for creating output, not keep some tags in the input.

I've tried it in Node.js and it just works. Please reopen this if I am missing something.

const TurndownService = require('turndown');
const turndownService = new TurndownService();

turndownService.addRule('external-link', {
    filter: function (node) {
        const aclass = node.getAttribute("class")
        return node.nodeName === 'A' && aclass === 'LinkOut'
    },
    replacement: function (content, node) {
        let url = node.getAttribute("href")
        if (!url?.startsWith('http'))
            return content
        return `[${content} <Icon icon="fa-external-link" size="sm" />](${url})`
    },
});

const str = `<a href='https://www.google.com' class='LinkOut'>Go to Google</a>`;
console.log(turndownService.turndown(str)); // [Go to Google <Icon icon="fa-external-link" size="sm" />](https://www.google.com)
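
The two rules in this thread differ in how they match the class: the original uses a substring check (includes), the reply uses strict equality. As a rough sketch, the rule can also be written as a plain object and exercised without Turndown at all, which makes the filter and replacement logic easy to check in isolation. The names ICON, hasClass, externalLinkRule and fakeAnchor are hypothetical helpers for illustration, not part of Turndown, and the whole-token class match is an assumption about the intended behavior rather than something the thread specifies.

```javascript
// Hypothetical helpers for illustration; none of these names come from Turndown.
const ICON = '<Icon icon="fa-external-link" size="sm" />';

// True when the class attribute contains `name` as a whole token
// (assumed intent: match "LinkOut btn" but not "LinkOutline").
function hasClass(node, name) {
  const value = node.getAttribute('class') || '';
  return value.split(/\s+/).includes(name);
}

// Same shape as the rule object passed to addRule in the thread.
const externalLinkRule = {
  filter: (node) => node.nodeName === 'A' && hasClass(node, 'LinkOut'),
  replacement: (content, node) => {
    const url = node.getAttribute('href');
    // As in the thread, links that do not start with "http" fall back to their text.
    if (!url || !url.startsWith('http')) return content;
    return `[${content} ${ICON}](${url})`;
  },
};

// Minimal stand-in for a DOM element, only for exercising the rule.
function fakeAnchor(attrs) {
  return {
    nodeName: 'A',
    getAttribute: (key) => (key in attrs ? attrs[key] : null),
  };
}

const node = fakeAnchor({ href: 'https://www.google.com', class: 'LinkOut btn' });
console.log(externalLinkRule.filter(node));
console.log(externalLinkRule.replacement('Go to Google', node));
```

If this behaves as intended, the same object should be usable as turndownService.addRule('external-link', externalLinkRule), since it has the filter and replacement properties used in the snippets above.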