hexojs / hexo-util

Utilities for Hexo.
MIT License

toc helper should not parse the text of children element(s) #174

Closed noraj closed 4 years ago

noraj commented 4 years ago

Check List

Please check the following before submitting a new feature request.

Feature Request

Fix the toc helper to select only text content and not other sub-elements like links.

In depth explanation here: https://github.com/hexojs/hexo-renderer-markdown-it/issues/102

SukkaW commented 4 years ago

@noraj

Fix the toc helper to select only text content and not other sub-elements like links.

Does this issue show up in Hexo 4.1.1? I am wondering whether the problem comes from the different behavior of htmlparser2 and cheerio.

noraj commented 4 years ago

@SukkaW I updated https://github.com/hexojs/hexo-renderer-markdown-it/issues/102

curbengh commented 4 years ago

Can confirm the issue in hexo-util 1.8.1:

const { tocObj } = require('hexo-util')

const text = '<h3 id="create-a-new-post"><a class="header-anchor" href="#create-a-new-post">#</a>Create a new post</h3>'

console.log(tocObj(text))
/*
[
  {
    text: '#Create a new post',
    id: 'create-a-new-post',
    level: 3
  }
]
*/
SukkaW commented 4 years ago

@curbengh

<h3 id="title-3"><span>Title</span> 3</h3>

In this case, Title 3 is what we need for text.

curbengh commented 4 years ago

@SukkaW

Since the output is '\n#\nCreate a new post\n', what if tocObj trimmed the newlines and anything before and after \n? It may be more appropriate to trim in the toc helper instead, since a tocObj user might want to retain the original text (rare, but possible).

Edit: no newline by default

SukkaW commented 4 years ago

@curbengh

Currently,

<h3 id="create-a-new-post">
<a class="header-anchor" href="#create-a-new-post">#</a>Create a new post
</h3>

will become \n#Create a new post\n. After trimming it will become #Create a new post. Still not appropriate.
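
The behavior described above can be reproduced without any parser. The following is a minimal sketch, assuming a hand-built node tree shaped like htmlparser2 output and a hypothetical allText helper that concatenates every descendant text node; it is not the actual tocObj implementation.

```javascript
// Hand-built nodes shaped like htmlparser2 output (an assumption for illustration).
const heading = {
  type: 'tag',
  name: 'h3',
  attribs: { id: 'create-a-new-post' },
  children: [
    { type: 'text', data: '\n' },
    {
      type: 'tag',
      name: 'a',
      attribs: { class: 'header-anchor', href: '#create-a-new-post' },
      children: [{ type: 'text', data: '#' }]
    },
    { type: 'text', data: 'Create a new post\n' }
  ]
};

// Hypothetical helper: concatenate the text of a node and all of its descendants.
const allText = node =>
  node.type === 'text' ? node.data : (node.children || []).map(allText).join('');

const raw = allText(heading);
console.log(JSON.stringify(raw));        // "\n#Create a new post\n"
console.log(JSON.stringify(raw.trim())); // "#Create a new post"
```

Trimming only removes the surrounding newlines; the anchor's # is part of the collected text, so it survives.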


Update

cheerio has the same behavior.

https://runkit.com/sukkaw/5e186c9d286975001a1051c2

curbengh commented 4 years ago

Looks like a common question in cheerio/jquery.

It seems possible to fix if htmlparser has an equivalent of the native DOM's document.childNodes or cheerio's $.first().
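
One way to picture the child-node idea is to keep only the text nodes that sit directly under the heading. This is a rough sketch under stated assumptions: the node objects are hand-built to resemble htmlparser2 output, and text, tag, and directText are hypothetical helpers, not part of hexo-util.

```javascript
// Hand-built nodes shaped like htmlparser2 output (an assumption for illustration).
const text = data => ({ type: 'text', data });
const tag = (name, attribs, children) => ({ type: 'tag', name, attribs, children });

// Hypothetical helper: keep only text nodes that are direct children of the heading.
const directText = node =>
  node.children
    .filter(child => child.type === 'text')
    .map(child => child.data)
    .join('')
    .trim();

const anchored = tag('h3', { id: 'create-a-new-post' }, [
  tag('a', { class: 'header-anchor', href: '#create-a-new-post' }, [text('#')]),
  text('Create a new post')
]);
const wrapped = tag('h3', { id: 'title-3' }, [
  tag('span', {}, [text('Title')]),
  text(' 3')
]);

console.log(directText(anchored)); // Create a new post
console.log(directText(wrapped));  // 3
```

The first heading comes out as wanted, but the second loses Title because it lives inside a span. Direct-child filtering alone therefore does not cover the Title 3 case raised earlier in the thread.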

Microbai commented 4 years ago

When the content looks like this:

<h2><span id="title_1">title 1</span></h2>

tocObj won't parse the correct id, so I edited the getId function, and now it works.

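const getId = ele => {
  // Text nodes have no attribs or children, so default both to avoid a TypeError
  const { id } = ele.attribs || {};
  const { children = [] } = ele;
  // Fall back to the first child's id when the element itself has none
  return id || (!children[0] ? null : getId(children[0]));
};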
SukkaW commented 4 years ago

@Microbai tocObj() is not designed to get the id from a child element, only from the parent element.

Microbai commented 4 years ago

@SukkaW I know. But when I use Hexo 4.2, the page can't render the toc correctly, so I edited tocObj for now, and it works.