keithmifsud / jekyll-target-blank

Automatically opens external links in a new browser for Jekyll Pages, Posts and Docs.
MIT License

Causes HTML named character entities to be rendered incorrectly #70

Open qitianshi opened 12 months ago

qitianshi commented 12 months ago

Hi everyone. When this plugin is used, some HTML named character entities are left unparsed, i.e. they're rendered as the raw entity string instead of the correct character. For example, &mldr; is literally displayed as "&mldr;" instead of "…". They render correctly when this plugin is removed.

It appears to break only for named character entities introduced in HTML 5.0 (reference list); pre-5.0 entities still work. Additionally, the equivalent hexadecimal and decimal entities (e.g. &#x2026; and &#8230;) work even for the 5.0 characters.
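Since the numeric forms render correctly, one possible stopgap is to rewrite the affected named entities to numeric references before the content is rendered. The following is only a minimal sketch under that assumption: the helper name named_to_numeric and the one-entry table are hypothetical and not part of the plugin, Jekyll, or Nokogiri.

```ruby
# Illustrative table only (an assumption, not a complete HTML5 list):
# extend it with whichever HTML5 entity names a site actually uses.
HTML5_NAMED_TO_CODEPOINT = {
  "mldr" => 0x2026 # horizontal ellipsis, U+2026 (decimal 8230)
}.freeze

# Replace named entities found in the table with hexadecimal numeric
# references; every other entity is returned unchanged.
def named_to_numeric(text, table = HTML5_NAMED_TO_CODEPOINT)
  text.gsub(/&([A-Za-z][A-Za-z0-9]*);/) do |entity|
    codepoint = table[Regexp.last_match(1)]
    codepoint ? format("&#x%X;", codepoint) : entity
  end
end

puts named_to_numeric('<a href="test">&mldr;</a> &hellip;')
# prints: <a href="test">&#x2026;</a> &hellip;
```

Entities that are not in the table, such as the pre-5.0 &hellip;, pass through untouched, so the rewrite only affects names that are known to break.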

As HTML 5.0 introduces new characters not present in previous standards, and it is good practice to use the named entity for ease of code maintenance, I believe this issue should be resolved.

Technical details

jekyll-target-blank: v2.0.2 (standard config, no additional options in _config.yml)
jekyll: v4.3.2
Other plugins: jekyll-autoprefixer, jekyll-minifier
Browser: Chrome 119, Safari 17
OS: macOS 14 Sonoma

keithmifsud commented 12 months ago

Thank you for reporting this @qitianshi. We may be able to support more entities if we upgrade the Nokogiri dependency.

I'd appreciate it if someone can help. If so, please try to update the dependencies, increment this plugin's minor version, and submit a PR.

prplecake commented 12 months ago

It won't be that easy. A new test project I created uses the latest version of Nokogiri available (1.15.5) but the issue persists.

And we're not the first to see it. https://github.com/sparklemotion/nokogiri/issues/1127

I don't know why disabling the plugin makes a difference...

https://github.com/prplecake/jekyll-target-blank/blob/800cb75714f5fed9d002c9948a9f28eef717d593/lib/jekyll-target-blank.rb#L30-L31

I added a puts before returning out of process(), which should show the content that jekyll passed to the plugin before the plugin has processed anything.

Given the following markdown:

[a link](https://example.com)
»

&mldr;

…

<a href="test">&mldr;</a>

the following HTML is given to the plugin by jekyll:

<p><a href="https://example.com">a link</a>
»</p>

<p>&amp;mldr;</p>

<p>…</p>

<p><a href="test">&amp;mldr;</a></p>

Why does enabling the plugin change the HTML jekyll produces before the plugin has processed anything?
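To make the symptom easier to check for, here is a small sketch that scans HTML for named entities whose ampersand has itself been escaped (the &amp;mldr; seen above). The helper name double_escaped_entities is hypothetical, and the sample string is simply the output quoted in this comment; nothing here calls the plugin or Jekyll.

```ruby
# Matches a named entity whose leading "&" was escaped to "&amp;",
# e.g. "&amp;mldr;", which a browser displays as the literal text "&mldr;".
DOUBLE_ESCAPED_ENTITY = /&amp;([A-Za-z][A-Za-z0-9]*);/

# Return the distinct entity names that appear double-escaped in the HTML.
def double_escaped_entities(html)
  html.scan(DOUBLE_ESCAPED_ENTITY).flatten.uniq
end

html = <<~HTML
  <p><a href="https://example.com">a link</a>
  »</p>

  <p>&amp;mldr;</p>

  <p>…</p>

  <p><a href="test">&amp;mldr;</a></p>
HTML

p double_escaped_entities(html)
# prints: ["mldr"]
```

A page that deliberately shows entity source text (for example, a tutorial about HTML entities) would also match, so this is a heuristic for debugging rather than a strict validity check.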

keithmifsud commented 11 months ago

> Why does enabling the plugin change the HTML jekyll produces before the plugin has processed anything?

Not sure how you came to this conclusion.

I suggest adding a test; there's quite a lot you can use as an example. Then, test with spaces and Unicode parameters, or add another condition in the main class for these two chars. I'm happy to approve a PR if you add tests.