hexojs / hexo

A fast, simple & powerful blog framework, powered by Node.js.
https://hexo.io
MIT License
38.96k stars 4.78k forks

{% raw %} tags shown when mixed triple backtick and raw blocks are used with hljs #3543

Closed phanan closed 4 years ago

phanan commented 5 years ago

On vuejs.org, we're encountering a weird issue where {% raw %} and {% endraw %} tags are visible with a very specific set of settings.

What does it do?

To reproduce the bug with minimal settings.

How to test

git clone -b raw-bug-poc https://github.com/phanan-forks/hexo.git
cd hexo
npm install
npm test

You'll notice that with the provided stub (content) AND hexo.config.highlight.hljs set to true, hexo will leave the {% raw %} and {% endraw %} tags untouched, and the (only) test case will fail with this assertion error:

+ expected - actual

<figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><code class="hljs js">alert(<span class="hljs-string">"Foo"</span>)<br></code></pre></td></tr></table></figure>
-{% raw %}
+
<p>Foo</p>
-{% endraw %}
+
<figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><code class="hljs html"><span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span><br></code></pre></td></tr></table></figure>

Screenshots

None.


phanan commented 5 years ago

Original (failing) PR: #3459

seaoak commented 4 years ago

I found that lib/extend/tag.js causes this issue.

https://github.com/hexojs/hexo/blob/master/lib/extend/tag.js#L118

  str = str.replace(/<pre><code.*>[\s\S]*?<\/code><\/pre>/gm, escapeContent);

This code eats all of the raw block between two triple backtick code blocks.
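The over-matching can be reproduced outside Hexo. Because `.` does not match newlines and each highlighted block sits on a single line, the greedy `.*>` runs to the last `>` on that line, past the first `</code></pre>`, and the lazy `[\s\S]*?` then continues to the closing tags of the next block. A minimal sketch, assuming simplified stand-in HTML (the `html` string is illustrative, not Hexo's exact output):

```javascript
// Simplified stand-in for the rendered post: two highlighted blocks
// (each on a single line) with a raw block between them.
const html = [
  '<figure><pre><code class="hljs js">alert("Foo")</code></pre></figure>',
  '{% raw %}',
  '<p>Foo</p>',
  '{% endraw %}',
  '<figure><pre><code class="hljs html">&lt;div&gt;</code></pre></figure>'
].join('\n');

// The pattern quoted above from lib/extend/tag.js.
const rCodeBlock = /<pre><code.*>[\s\S]*?<\/code><\/pre>/gm;

const matches = html.match(rCodeBlock);

// Only one match is found, and it spans both code blocks,
// so the raw block in between is swallowed along with them.
console.log(matches.length);                      // 1
console.log(matches[0].includes('{% raw %}'));    // true
console.log(matches[0].includes('{% endraw %}')); // true
```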

Why does the option highlight.hljs affect this issue?

When highlight.hljs=true, the output of highlight() includes the string <pre><code class=".... When highlight.hljs=false, the output includes a <span> element instead of a <code> element (<pre><span class="...), so the RegExp above never matches. This is why the issue depends on the highlight.hljs option.
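The dependence on highlight.hljs can be checked against the two output shapes described above. This is only a sketch: the two strings are shortened stand-ins for what highlight() returns in each mode, and `matchesCodeBlock` is a hypothetical helper, not a Hexo function.

```javascript
// Same pattern as in lib/extend/tag.js, without the g flag since
// only a yes/no answer is needed here.
const rCodeBlockOnce = /<pre><code.*>[\s\S]*?<\/code><\/pre>/m;

// Hypothetical helper: does the escaping pattern touch this HTML?
function matchesCodeBlock(str) {
  return str.search(rCodeBlockOnce) !== -1;
}

// Stand-in for highlight.hljs = true (<pre><code class="...).
const withHljs = '<pre><code class="hljs js">alert("Foo")</code></pre>';

// Stand-in for highlight.hljs = false (<pre><span class="...).
const withoutHljs = '<pre><span class="line">alert("Foo")</span></pre>';

console.log(matchesCodeBlock(withHljs));    // true
console.log(matchesCodeBlock(withoutHljs)); // false
```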

I will investigate how to correct this code (maybe just change RegExp to "non-greedy").

seaoak commented 4 years ago

(maybe just change RegExp to "non-greedy")

The above patch is not enough.

For example, the following source text is not rendered correctly:

{% raw %}
<pre><code class="dummy">
{% endraw %}

{% raw %}
<span>Foo</span>
{% endraw %}

{% raw %}
</code></pre>
{% endraw %}

The part from the first endraw tag through the third raw tag remains in the result (the same symptom as this issue).
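The same effect can be seen by running a non-greedy pattern over the source text above. The `rNonGreedy` pattern below is an assumed form of the "non-greedy" change (only `.*` turned into `.*?`); it is not taken from an actual Hexo patch.

```javascript
// The source text from the example above.
const source = [
  '{% raw %}',
  '<pre><code class="dummy">',
  '{% endraw %}',
  '',
  '{% raw %}',
  '<span>Foo</span>',
  '{% endraw %}',
  '',
  '{% raw %}',
  '</code></pre>',
  '{% endraw %}'
].join('\n');

// Assumed non-greedy variant of the pattern in lib/extend/tag.js.
const rNonGreedy = /<pre><code.*?>[\s\S]*?<\/code><\/pre>/gm;

const found = source.match(rNonGreedy);

// Still a single match, running from the opening <pre><code ...> in the
// first raw block to the closing </code></pre> in the third one, so the
// raw/endraw tags in between are captured as code block content.
console.log(found.length);
console.log(found[0].split('\n'));
```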

I will continue to investigate...

seaoak commented 4 years ago

I found that this code was introduced by PR #1652 (4 years ago).

Unfortunately, I cannot read Chinese. Also, all test cases pass even if I remove this code (effectively reverting PR #1652).

@hexojs/core Can anyone explain why PR #1652 is necessary?

SukkaW commented 4 years ago

@seaoak

I can't understand what that PR is doing even though I can read Chinese. The only thing I know is that the context of #1652 is #1624.

jiangtj commented 4 years ago

@SukkaW

It is there to avoid errors caused by code block content that conflicts with Nunjucks (njk) syntax, like:

```html
<span>{{ aaa || bbb }}</span>
```

Same in highlightjs:
https://github.com/hexojs/hexo/blob/663f1eb2e91610a1777a77cfed9c9e58cfbfd9bd/lib/hexo/post.js#L26
https://github.com/hexojs/hexo/blob/663f1eb2e91610a1777a77cfed9c9e58cfbfd9bd/lib/plugins/filter/before_post_render/backtick_code_block.js#L66

But I think this part of the code is bad; it is better to use {% raw %}.

seaoak commented 4 years ago

@jiangtj Thank you for showing an example.

But the example cannot be handled by the current Hexo (regardless of whether the code in tag.js exists or not).

I think the code for highlightjs works just as intended.

I still wonder why the code in tag.js is necessary.

SukkaW commented 4 years ago

@jiangtj

We should not use {% raw %}. Hexo provides disableNunjucks for some renderers, which means that when disableNunjucks is enabled, {% raw %} will no longer do its job.