statiqdev / Statiq.Framework

A flexible and extensible static content generation framework for .NET.
https://statiq.dev/framework
MIT License

Link in HTML encoded #245

Open pascalberger opened 2 years ago

pascalberger commented 2 years ago

I've a link in a markdown file containing some URL parameters:

* [Test link](https://example.com?foo=bar&bar=foo)

When the HTML document is generated, the `&` in the link is HTML-encoded twice, resulting in an invalid link:

<li><a href="https://example.com/?foo=bar&amp;amp;bar=foo">Test link</a></li>

pascalberger commented 2 years ago

Might be a duplicate of #170 or part of what was discussed in #170, but not sure if this is supposed to be fixed with https://github.com/statiqdev/Statiq.Framework/commit/97273728c8f3f4085f1a275e2077b198ff8d31b7.

daveaglick commented 2 years ago

Yeah, I could swear I worked on this exact thing not too long ago (though it's been a crazy summer for me, so it's also just as likely I'm thinking of something totally different).

I'm also wondering if the Markdown parser is getting in the middle and encoding it before I see it (or something related to that process on my end).

I'll take a look - I'm assuming you're on the latest version?

pascalberger commented 2 years ago

> I'll take a look - I'm assuming you're on the latest version?

Thanks! Yes, this is with a site built with Statiq.Web 1.0.0-beta.49

pascalberger commented 1 year ago

> I'm also wondering if the Markdown parser is getting in the middle and encoding it before I see it (or something related to that process on my end).

@daveaglick I can confirm that it is the Markdown parser which is responsible for the behavior. I created a test case in #262

pascalberger commented 1 year ago

Related Markdig issue: https://github.com/xoofx/markdig/issues/514

pascalberger commented 1 year ago

While CommonMark expects & in links to be rendered as &amp; and browsers would handle it, the problem here is that Statiq.Razor converts it to &amp;amp;.

Markdown:

[A link](https://example.com/?foo=bar&baz=123)

Expected HTML output:

<p>
  <a href="https://example.com/?foo=bar&amp;baz=123">
    A link
  </a>
</p>

Actual HTML output:

<p>
  <a href="https://example.com/?foo=bar&amp;amp;baz=123">
    A link
  </a>
</p>

I updated https://github.com/statiqdev/Statiq.Framework/pull/262 to contain a test case with a link as generated by Statiq.Markdown, which shows that Statiq.Razor makes the link invalid.

pascalberger commented 1 year ago

Issue is caused by the use of TagBuilder, which will encode the link a second time: https://github.com/statiqdev/Statiq.Framework/blob/e990cb1b4d0884dc1e3f9371aa1d55551c5ecf18/src/extensions/Statiq.Razor/IHtmlHelperExtensions.cs#L289
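
The effect can be reproduced outside Statiq. The following is only a minimal sketch in Python, not the actual C# code path: `html.escape` stands in for both the Markdown renderer and TagBuilder, and the two function names are hypothetical.

```python
import html

def render_markdown_href(url: str) -> str:
    """Hypothetical stand-in for the Markdown renderer: HTML-encodes the URL once."""
    return html.escape(url, quote=True)

def build_tag_href(value: str) -> str:
    """Hypothetical stand-in for TagBuilder: HTML-encodes whatever value it receives."""
    return html.escape(value, quote=True)

url = "https://example.com/?foo=bar&baz=123"
once = render_markdown_href(url)   # first encoding pass (Markdown step)
twice = build_tag_href(once)       # second encoding pass (Razor step)

print(once)
print(twice)
```

The second pass cannot tell that the `&` in `&amp;` is already part of an entity, so it encodes it again.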

pascalberger commented 1 year ago

@daveaglick Not sure what the best way to proceed is here. Currently both Statiq.Markdown (Markdig) and Statiq.Razor (TagBuilder) encode the link, and we need to get rid of one of those encodings. Neither Markdig nor TagBuilder has an option to skip encoding, so either approach would need additional code.

IMHO the best solution would be to not encode in Statiq.Markdown in cases where Statiq.Razor runs afterwards in the pipeline, but I'm not sure how this can be detected.
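
For illustration, one shape the "additional code" could take is to decode any existing entities before the value is encoded again, so the result is the same whether or not an earlier step already encoded it. This is a different approach from the detection idea above and has not been proposed for Statiq; it is a rough sketch in Python, and `encode_attribute_once` is a hypothetical helper.

```python
import html

def encode_attribute_once(value: str) -> str:
    """Hypothetical helper: decode existing HTML entities, then encode exactly once."""
    decoded = html.unescape(value)          # "&amp;" becomes "&" again
    return html.escape(decoded, quote=True)  # single, final encoding pass

raw = "https://example.com/?foo=bar&baz=123"
already_encoded = "https://example.com/?foo=bar&amp;baz=123"

print(encode_attribute_once(raw))
print(encode_attribute_once(already_encoded))
```

Both calls yield the same singly encoded value. The decode step is not safe in general: a query parameter whose name happens to match a legacy entity name (for example `&copy=1`) would be altered by it, so a real fix would need more care than this.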