Consider the following case:
var html = "<a href=\"http://www.somesite.com?a=1&b=2\">some text</a>";
var sanitizer = new HtmlSanitizer();
var sanitizedHtml = sanitizer.Sanitize(html);
After running, sanitizedHtml will be:
<a href="http://www.somesite.com?a=1&amp;b=2">some text</a>
Notice that & was changed to &amp;. While it's of course true that the HTML encoding of & is &amp;, I feel that in this case it should not be encoded, because the & occurs in the href URL as a query parameter separator. I'm concerned that the URL may no longer function properly when the rendered link is clicked and the site visited.
I did read issue https://github.com/mganss/HtmlSanitizer/issues/116, but the request there is a bit different. Unlike that issue, I'm not suggesting that all & chars be un-encoded, just the ones in the href of an anchor tag that are query param separators.
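As a quick sanity check on that concern, here is a minimal sketch of what the link target might look like once the attribute value is decoded. As far as I understand, an HTML parser decodes character references such as &amp; inside attribute values before the URL is used, so the encoded form may turn out to be harmless. The sketch only approximates that decoding step with System.Net.WebUtility.HtmlDecode; the HrefCheck class and DecodeAttributeValue method are hypothetical names for illustration and are not part of HtmlSanitizer.

```csharp
using System;
using System.Net;

// Hypothetical helper for illustration only; not part of HtmlSanitizer.
public static class HrefCheck
{
    // Approximates the character-reference decoding that an HTML parser
    // applies to attribute values (assumption: WebUtility.HtmlDecode is a
    // close enough stand-in for this simple case).
    public static string DecodeAttributeValue(string rawAttributeValue)
    {
        return WebUtility.HtmlDecode(rawAttributeValue);
    }

    public static void Main()
    {
        // The href value exactly as it appears in the sanitized markup.
        string sanitizedHref = "http://www.somesite.com?a=1&amp;b=2";

        string decoded = DecodeAttributeValue(sanitizedHref);

        Console.WriteLine(decoded);
        // Compare against the URL from the original input.
        Console.WriteLine(decoded == "http://www.somesite.com?a=1&b=2"); // True
    }
}
```

If the decoded value matches the original URL, as it does here, the query string the server receives should still contain a=1 and b=2 as separate parameters. That says nothing about consumers that read the sanitized string without parsing it as HTML, which is the case I'm less sure about.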
Thoughts?