curbengh / hexo-nofollow

Adds nofollow attribute to all external links in your hexo blog posts automatically.
MIT License

cheerio.load() with {decodeEntities: false} causes html escaped text to be unescaped. #3

Closed momocow closed 4 years ago

momocow commented 5 years ago

You can try this within your Node REPL with cheerio@1.0.0-rc.2.

let $ = require('cheerio')
$.load('&lt;test&gt;', {decodeEntities: false}).html()
// '<html><head></head><body><test></body></html>'

Hexo depends on cheerio@0.22.0, where the same script behaves differently.

let $ = require('cheerio')
$.load('&lt;test&gt;', {decodeEntities: false}).html()
// '&lt;test&gt;'

The main problem is in your after_render:html filter: it loads the content of each rendered HTML page into cheerio and returns the new content from cheerio.html(), which turns HTML-escaped text into unescaped elements.

I noticed this problem because I put HTML code in a code block (I have checked that the markdown-it renderer I'm using escapes it properly), and it was not rendered as plain text as expected.
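
One possible way around the round trip, sketched here as an assumption rather than as the plugin's actual code, is to rewrite only the opening <a> tags with string replacement, so the rest of the page (including escaped text such as &lt;test&gt;) is never parsed or re-serialized. The helper name addNofollow and its siteHost parameter are hypothetical. The regular expressions are deliberately simple: they do not handle unquoted href values or attribute values that contain ">", and they would also touch <a> tags that appear inside scripts or comments.

```javascript
// Hypothetical helper (not part of hexo-nofollow): adds rel="nofollow" to
// external links without parsing and re-serializing the whole document.
function addNofollow(html, siteHost) {
  // Visit each opening <a ...> tag; everything else is left byte-for-byte intact.
  return html.replace(/<a\s[^>]*>/gi, (tag) => {
    // Only quoted href values are recognized in this sketch.
    const href = /\shref\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(tag);
    if (!href) return tag;
    const target = href[1] !== undefined ? href[1] : href[2];

    let url;
    try {
      // Relative links resolve against the site's own host and so count as internal.
      url = new URL(target, `http://${siteHost}`);
    } catch (err) {
      return tag; // unparsable href: leave the tag alone
    }

    if (!/^https?:$/.test(url.protocol)) return tag; // skip mailto:, tel:, etc.
    if (url.hostname === siteHost) return tag;       // internal link
    if (/\srel\s*=/i.test(tag)) return tag;          // keep an existing rel as-is

    // Insert the attribute just before the closing ">" of the opening tag.
    return tag.replace(/\s*\/?>$/, ' rel="nofollow">');
  });
}
```

Assuming a site hosted at blog.example.com, addNofollow(html, 'blog.example.com') could be returned from an after_render:html filter in place of the cheerio output. Escaped entities and non-ASCII characters pass through unchanged because they are never decoded.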

momocow commented 5 years ago

I found the corresponding issue in cheerio. Using { decodeEntities: true } preserves the code block; however, it introduces another issue with Chinese (or other languages') characters. 😂

cheeriojs/cheerio#1198