larseggert closed this issue 2 years ago
Thank you for the bug report! I'd appreciate a pull request from anyone who wants to tackle this. I don't think I'm going to get to it.
I tried to wrap a `urllib.parse.quote()` around the `match.group(0)` bit in https://github.com/mozilla/bleach/blob/481b146b074ed004eab39abf8f9b964fcd61c408/bleach/linkifier.py#L304 but that seems to have no effect.
I have noticed a similar problem with the `clean()` function. Maybe it has the same root cause.
Example:
In [1]: import bleach
In [2]: bleach.clean("<a href='https://example.org?a=1&b=2'>example</a>")
Out[2]: '<a href="https://example.org?a=1&amp;b=2">example</a>'
Notice that `&` is changed to `&amp;`.
@jozo that's not the same thing. The `&` should be escaped to `&amp;`.
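To illustrate why that escaping is harmless, here is a minimal sketch using only the standard library's `html` module (not bleach itself). It shows that escaping `&` in an attribute value round-trips: an HTML parser decodes `&amp;` back to `&`, so the link target is unchanged.

```python
import html

url = "https://example.org?a=1&b=2"

# Escape the URL for use inside an HTML attribute value.
escaped = html.escape(url, quote=True)
print(escaped)

# Decoding the attribute value, as an HTML parser would, restores the URL.
print(html.unescape(escaped) == url)
```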
Describe the bug
bug: `linkify` with `parse_email=True` doesn't handle "%" and "?", which may occur in RFC822 addr-specs (see https://datatracker.ietf.org/doc/html/rfc2368#section-6).
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expected RFC822 special characters to be percent-encoded according to RFC2368:
Additional context
Same issue exists with "?"; I didn't test other RFC822 special characters but suspect they are similarly left unquoted.
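For reference, the expected encoding can be sketched with `urllib.parse.quote`. This is only an illustration of the RFC2368 behavior, not bleach's code: `mailto_href` is a hypothetical helper name, and the two addresses are the ones used in section 6 of the RFC.

```python
from urllib.parse import quote


def mailto_href(addr: str) -> str:
    """Hypothetical helper (not part of bleach): build a mailto: URL
    with the local part of the address percent-encoded."""
    # Split on the last "@" so only the local part gets quoted.
    local, sep, domain = addr.rpartition("@")
    # safe="" makes quote() encode every reserved character,
    # including "%" and "?".
    return "mailto:" + quote(local, safe="") + sep + domain


print(mailto_href("gorby%kremvax@example.com"))
print(mailto_href("unlikely?address@example.com"))
```

With this approach "%" becomes "%25" and "?" becomes "%3F", which matches the examples given in the RFC.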