thephpleague / commonmark

Highly-extensible PHP Markdown parser which fully supports the CommonMark and GFM specs.
https://commonmark.thephpleague.com
BSD 3-Clause "New" or "Revised" License

HeadingPermalinkExtension does not percent-encode CJK names, but commonmark does percent-encode CJK URLs #1021

Open ptmkenny opened 7 months ago


Version(s) affected

2.4.2

Description

This is an edge case: when a relative (same-page) link contains CJK text and HeadingPermalinkExtension is enabled, the link's href gets percent-encoded (because CommonMark percent-encodes link destinations), but the name that the link points to keeps the original CJK text.

This is a minor issue because the link still works. However, it can cause automated tests like pa11y to fail, because the relative link points to a percent-encoded name while the actual name is not percent-encoded.

How to reproduce

Here is a link: [Go to summary](#まとめ)

## まとめ

Summary

Produces:

<p>Here is a link: <a href="#%E3%81%BE%E3%81%A8%E3%82%81">Go to summary</a></p>
<h2 name="まとめ">まとめ</h2>
<p>Summary</p>
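
To make the mismatch in the output above concrete, here is a small Python sketch (not PHP and not part of the library) that compares the two strings from the rendered HTML. It assumes a checker that compares the href fragment and the heading name as literal strings. Whether pa11y does exactly this is an assumption on my part. The variable names are made up for illustration.

```python
from urllib.parse import unquote

# Values copied from the rendered HTML above.
href_fragment = "%E3%81%BE%E3%81%A8%E3%82%81"  # from <a href="#...">
heading_name = "まとめ"                          # from <h2 name="...">

# A literal comparison (what a naive link checker might do) does not match.
literal_match = href_fragment == heading_name

# Decoding the fragment first recovers the original text, which would
# explain why the link still works when followed.
decoded_match = unquote(href_fragment) == heading_name

print("literal match:", literal_match)
print("match after decoding:", decoded_match)
```

So the two values refer to the same target, but only after the fragment is percent-decoded.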

Possible solution

I know this is a minor issue, but for the sake of consistency I think it would be great to percent-encode the CJK in the names processed by HeadingPermalinkExtension in the same manner as links.
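
As a rough illustration of the idea (again in Python, only to show the encoding step, not how the extension would actually implement it), the name could be run through the same kind of UTF-8 percent-encoding that is applied to the link. `encode_anchor_name` is a hypothetical helper, and CommonMark's own link encoding may treat some ASCII characters differently from `urllib.parse.quote`.

```python
from urllib.parse import quote


def encode_anchor_name(name: str) -> str:
    """Hypothetical helper: percent-encode a heading name as UTF-8 bytes.

    safe="" means nothing beyond the unreserved ASCII characters
    (letters, digits, "-", "_", ".", "~") is left unencoded.
    """
    return quote(name, safe="")


# The href fragment produced for [Go to summary](#まとめ) in the report.
href_fragment = "%E3%81%BE%E3%81%A8%E3%82%81"

encoded_name = encode_anchor_name("まとめ")

# With the name encoded the same way, a literal comparison now matches.
print(encoded_name == href_fragment)
```

Plain ASCII names such as `summary` pass through unchanged, so existing ASCII anchors would not be affected by this kind of encoding.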
