Offerel closed 3 years ago
Can't reproduce it. Can you double-check the actual contents of your pastedHTML
before it is passed to turndown?
With the current Turndown version from npm and Node.js:
const TurndownService = require('turndown');
const options = {
  headingStyle: 'atx',
  hr: '-',
  bulletListMarker: '-',
  codeBlockStyle: 'fenced',
  fence: '```',
  emDelimiter: '*',
  strongDelimiter: '**',
  linkStyle: 'inlined',
  linkReferenceStyle: 'full',
  collapseMultipleWhitespaces: true,
  preformattedCode: true,
};
const turndownService = new TurndownService(options);
// Keep <kbd> and <ins> as raw HTML in the Markdown output.
turndownService.keep(['kbd', 'ins']);
const html = '<kbd>STRG</kbd> <code>test</code>';
// Same input, but with a stray closing </p> tag.
const htmlUnbalancedPTag = '<kbd>STRG</kbd> <code>test</code></p>';
console.log(turndownService.turndown(html));
// <kbd>STRG</kbd> `test`
console.log(turndownService.turndown(htmlUnbalancedPTag));
// <kbd>STRG</kbd> `test`
Strange, I have retested this without changing the code, at least on the Turndown side, and this time it works as expected: <kbd> is kept.
One additional question: it seems logical in the first place that when I keep <kbd>, its inline CSS styling information is kept as well. Is there some way to automatically clean the <kbd> tag?
Let me first explain what I am trying to do. I select some text on a web page that I don't control, so I don't know what style information is there. After selecting the text, I copy it to the clipboard via STRG+C. Now I go to my textarea or editor and paste the clipboard. I have bound Turndown to this textarea with an "onPaste" event, and I pass it clipboardData.getData('text/html'). I want to keep the kbd tag but clean up all style information. Is there any way to do this with Turndown, or must I sanitize the tag myself?
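One way to do the sanitizing yourself is to strip the attributes from the pasted HTML before it reaches Turndown. The following is only a minimal sketch, assuming a browser environment where DOMParser is available. The helper names stripAttributes and sanitizePastedHtml are made up for this illustration and are not part of Turndown.

```javascript
// Hypothetical helper: remove every attribute (style, class, ...) from
// all elements under `root` that match `selector`.
function stripAttributes(root, selector) {
  for (const el of root.querySelectorAll(selector)) {
    // Copy the attribute list first, because removing attributes
    // mutates the live collection while we iterate.
    for (const attr of Array.from(el.attributes)) {
      el.removeAttribute(attr.name);
    }
  }
  return root;
}

// Hypothetical helper: parse the pasted HTML, clean all <kbd> elements,
// and return the cleaned markup as a string. Browser only (DOMParser).
function sanitizePastedHtml(pastedHTML) {
  const doc = new DOMParser().parseFromString(pastedHTML, 'text/html');
  stripAttributes(doc.body, 'kbd');
  return doc.body.innerHTML;
}
```

The returned string could then be passed to turndownService.turndown() with keep(['kbd']) still in place. Since keep emits the kept element's HTML as it is, the tag should then come out without style information.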
After trying the lib a little bit, I wonder if my approach to cleaning the kbd tag is the right way:
var options = {
  headingStyle: 'atx',
  hr: '-',
  bulletListMarker: '-',
  codeBlockStyle: 'fenced',
  fence: '```',
  emDelimiter: '*',
  strongDelimiter: '**',
  linkStyle: 'inlined',
  linkReferenceStyle: 'full',
  collapseMultipleWhitespaces: true,
  preformattedCode: true,
};
var turndownService = new window.TurndownService(options);
turndownService.keep(['kbd', 'ins']);
// Re-emit <kbd> with its converted content but without any attributes.
turndownService.addRule('kbd', {
  filter: ['kbd'],
  replacement: function (content) {
    return '<kbd>' + content + '</kbd>';
  }
});
// pastedHTML holds the result of clipboardData.getData('text/html').
document.getElementById('mytextarea').value = turndownService.turndown(pastedHTML);
BTW, it seems I have found the main reason why the kbd was not included in the output. If I copy from a web page in Firefox (v85.x), the clipboardData has no kbd tag. If I do the same in Chromium (v88.x), the kbd tag is available. I have no idea why there is a difference, but it seems it has nothing to do with your library. Maybe it is another of those obscure Firefox issues of the last month. I am starting to hate Firefox a little bit.
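To compare what each browser actually puts on the clipboard, it can help to log the raw HTML in the paste handler before any conversion happens. This is a small sketch only: the helper name readPastedHtml is hypothetical, and falling back to 'text/plain' is an assumption about what is useful here.

```javascript
// Hypothetical helper: return the HTML flavour of the pasted data,
// or the plain-text flavour when no HTML is on the clipboard.
function readPastedHtml(event) {
  const data = event.clipboardData;
  const html = data.getData('text/html');
  // getData returns an empty string when the requested format is absent.
  return html !== '' ? html : data.getData('text/plain');
}

// Usage in a browser (assumes an element with id "mytextarea"):
// document.getElementById('mytextarea').addEventListener('paste', function (event) {
//   console.log(readPastedHtml(event)); // inspect this in Firefox and Chromium
// });
```

Pasting the same selection in both browsers and comparing the logged strings should show whether the kbd tag is already missing before Turndown sees the input.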
> After trying the lib a little bit, I wonder if my approach to cleaning the kbd tag is the right way:
Yes, that's correct. And you probably don't want to keep() the kbd tag when you have a rule for it.
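As a sketch of that suggestion, the setup could keep only ins and leave kbd entirely to the custom rule. The names kbdRule and createService are hypothetical. The TurndownService constructor is passed in as a parameter so the snippet makes no assumption about how the library was loaded (require or window.TurndownService).

```javascript
// Rule object in the shape accepted by turndownService.addRule(key, rule).
// It re-emits <kbd> around the converted content and drops all attributes.
const kbdRule = {
  filter: ['kbd'],
  replacement: function (content) {
    return '<kbd>' + content + '</kbd>';
  }
};

// Hypothetical factory: builds a service where only <ins> is kept as is
// and <kbd> is handled by the rule above.
function createService(TurndownService, options) {
  const service = new TurndownService(options);
  service.keep(['ins']); // kbd is deliberately not kept here
  service.addRule('kbd', kbdRule);
  return service;
}
```

With this setup, only the rule decides how kbd elements are written to the output.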
The "feature" of the current keep is that it keeps the whole subtree, whereas a "shallow" keep might better match some use cases. That is definitely worth investigating as a Turndown feature, but it is tricky to design it to be meaningful and universal at the same time. For example, when I want a "shallow keep" of a table that cannot be converted to MD, I want to shallow-copy not only the table tag but also the nested table-related tags, but not automatically the tags of another nested table, should there be one. :)
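For simple inline tags, a "shallow keep" can be approximated today with an ordinary rule. The sketch below is an illustration and not an existing Turndown feature: shallowKeepRule is a hypothetical helper, and it does not attempt to solve the nested table case described above. It relies only on the replacement function receiving the converted content and the DOM node.

```javascript
// Hypothetical helper: build a rule that re-emits the matched tag itself
// (without attributes) while its children are still converted to Markdown.
function shallowKeepRule(tagNames) {
  return {
    filter: tagNames,
    replacement: function (content, node) {
      // nodeName is upper case for HTML elements, e.g. "KBD".
      const tag = node.nodeName.toLowerCase();
      return '<' + tag + '>' + content + '</' + tag + '>';
    }
  };
}

// Usage (assumes an existing turndownService instance):
// turndownService.addRule('shallowKeep', shallowKeepRule(['kbd', 'ins']));
```

Because the rule wraps the already converted content, markup nested inside the tag is still turned into Markdown, while the wrapping tag loses its attributes.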
I'm using the following code:
But from:
I got:
Is there something I'm doing wrong?