mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License

`keep` is not working for tag names that start with capital letters #396

Closed Ki6an closed 2 years ago

Ki6an commented 3 years ago

var TurndownService = require('turndown')

var turndownService = new TurndownService()

turndownService.keep(['Square'])
const markdown = turndownService.turndown('<p>Hello <Square>world</Square> </p>')

console.log(markdown)

The result is:


Hello world

not


Hello <Square>world</Square>

However, it works fine with the lowercase `square`.

The following also does not work:


turndownService.keep(['square'])
const markdown = turndownService.turndown('<p>Hello</p> <square/>')
>>>  Hello

martincizek commented 2 years ago

There can be unexpected consequences of changing the behavior, so I've updated the docs for now.
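
Until the behavior changes, one possible workaround is to lowercase the tag names before passing them to `keep`. This is a minimal sketch, assuming that string filters are compared against the lowercased node name (which is consistent with the results reported above). The helper name `toKeepFilter` is hypothetical and not part of turndown.

```javascript
// Hypothetical helper (not part of turndown): lowercases tag names so that
// they can be passed to turndownService.keep().
// Assumption: string filters are compared with node.nodeName.toLowerCase(),
// so a filter such as 'Square' can never match.
function toKeepFilter (tagNames) {
  return tagNames.map(function (name) {
    return String(name).toLowerCase()
  })
}

// Intended usage, assuming `turndownService` is a TurndownService instance:
//   turndownService.keep(toKeepFilter(['Square']))
```

The input markup itself does not need to change, since only the filter names are normalized.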

Regarding your second example, the issue is that square is not defined as "meaningful when blank". This list is unfortunately hardcoded, but you can work around it by redefining the blankReplacement if it is really important.
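
The blankReplacement workaround could look roughly like the sketch below. The names `keepBlankElements` and `KEPT_WHEN_BLANK` are assumptions made for illustration. `blankReplacement` is the option turndown documents for customizing the blank rule, and the fallback branch mirrors what appears to be the library default (a blank line for block elements, an empty string otherwise).

```javascript
// Illustrative list of tag names that should survive even when they are blank.
// Names are uppercase because node.nodeName is uppercase for HTML elements.
var KEPT_WHEN_BLANK = ['SQUARE']

// Candidate value for the `blankReplacement` option.
function keepBlankElements (content, node) {
  if (KEPT_WHEN_BLANK.indexOf(node.nodeName) !== -1) {
    // Emit the element's own HTML instead of dropping it
    return node.outerHTML
  }
  // Assumed default: a blank line for block elements, nothing otherwise
  return node.isBlock ? '\n\n' : ''
}

// Intended usage:
//   var turndownService = new TurndownService({ blankReplacement: keepBlankElements })
```

Passing the function at construction time avoids touching `turndownService.rules` directly, at the cost of re-stating the default behavior in the fallback branch.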

kalyan02 commented 2 years ago

Thanks for the tip about blankReplacement.

Thought I'd leave this code snippet here for anyone looking in the future.

CKEditor 5 + markdown strips out oembed tags, and I wanted to preserve them. However, the oembed tags are actually empty, so I modified the blank rule to get it to work.

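// Turndown plugin: keep blank <oembed> and <figure> elements as raw HTML.
function oembedRule(turndownService) {
    var originalBlankRule = turndownService.rules.blankRule.replacement;
    turndownService.rules.blankRule.replacement = function (content, node) {
        // nodeName is uppercase for HTML elements
        if (node.nodeName === "OEMBED" || node.nodeName === "FIGURE") {
            return node.outerHTML;
        }
        // Fall back to the original handling for every other blank node
        return originalBlankRule(content, node);
    };
}
turndownService.use(oembedRule);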

Cheers