Invalid html generated when parsing link

rchl commented 2 years ago

Python 3.3>>> mdpopups.show_popup(view, 'and [`i32`](`i32`) is')
Parse Error: ">i32</span>"><code class="highlight"><span style="color: #d8dee9;">i32</span></code></a> is</p></div> code: Unexpected character

The generated HTML is:

and <a href="<span style="color: #d8dee9;">i32</span>"><code class="highlight"><span style="color: #d8dee9;">i32</span></code></a> is

Which is invalid since the content of the href attribute is not escaped.

The escaping would make the code valid but I believe that what should happen is nothing in the link should be transformed into HTML and instead it should be taken literally, and then also escaped if needed.

This is how Github renders it:

and i32 is

and <a href="%60i32%60"><code class="notranslate">i32</code></a> is

facelessuser commented 2 years ago

Yeah, I'm actually aware of this, and I'll have a fix for the next version. There was some complaint at some point about the fact that when Sublime was processing callbacks in HTML links that escaped quotes weren't handled properly by Sublime. A change was made to not escape them, which is simply a bad idea that was rarely encountered. We just need to let them be escaped.

Anyways, this is an issue in mdpopups, not any of the underlying libraries.

rchl commented 2 years ago

And the fixed behavior will also not convert markdown to html in links?

facelessuser commented 2 years ago

And the fixed behavior will also not covert markdown to html in links?

I'm not sure I understand the question. We are talking about how proper escaping in HTML attributes right?

What do you mean by your question, can you clarify with an example?

rchl commented 2 years ago

The fixed logic could result in either:

<a href="%60i32%60"> or
<a href="<span style="color: #d8dee9;">i32</span>">

I believe that the former one is correct since markdown should not be processed in URIs.

facelessuser commented 2 years ago

Can you give me a reproducible Markdown source? I see the expected outputs, but I'm still not sure what you are providing as source.

facelessuser commented 2 years ago

Wait, are you trying to provide Markdown code syntax as a link? If so, you aren't getting that to convert properly. You need to provide a link or properly escaped content for reference.

facelessuser commented 2 years ago

You'll see all sorts of different outputs with different Markdown parsers. Some might handle it more sane than others, but you should not be doing this: https://johnmacfarlane.net/babelmark2/?text=%5B%60i32%60%5D(%60i32%60)

facelessuser commented 2 years ago

I honestly thought you were talking about something else.

rchl commented 2 years ago

It's the same example I gave in the initial comment:

'and [`i32`](`i32`) is'

It contains backticks in the link's URI and it comes from the LSP server.

I've talked about it in the initial comment and also shown how Github's parser handles it.

rchl commented 2 years ago

Just escaping the link will fix the main issue that causes parsing error so it might be enough to do that.

Since a link like the one provided here as an example will likely not do anything useful, it might as well contain escaped HTML but it feels like the ultimate fix would be to not process markdown in links, besides escaping.

facelessuser commented 2 years ago

It's the same example I gave in the initial comment:

Yes, I was on my phone, but I see that now. Basically, code is handled before URLs, which is why this happens. That is just the sequencing for how Python Markdown does things.

I've talked about it in the initial comment and also shown how Github's parser handles it.

Yep, how GitHub handles this is irrelevant. Currently, we are using the Python Markdown parser.

I've mentioned at one time that I'm potentially willing to add support for a CommonMark parser as soon as Package Control supports Py38 dependencies, though it's starting to feel like they may never happen...

Anyways, if you format the content in an escaped manner, it'll probably go through fine. I don't plan on patching the underlying library to workaround odd cases that it was not designed to handle. One could argue this is a bug, but it would be an upstream bug with Python Markdown, but as I'm familiar with its architecture, I'm doubtful a fix will be coming.

facelessuser commented 2 years ago

@rchl Hmm, there may be an issue with how we do things using the Sublime highlighter. Turning off the Sublime highlighter (which causes Pygments to be used instead) doesn't seem to exhibit this issue.

I wanted to do my due diligence before closing this issue, so there may actually be something for us to fix here. I'll try to dig deeper.

facelessuser commented 2 years ago

Hmm, it just appears that when Pygments handles a plain text code, it simply returns the text and styles it with CSS, but with the Sublime highlighter, we actually add inline styles, which is why we get HTML content. Really, Markdown should not be used as a URL.

I'm not sure if there is anything we can reasonably do, but I'll take a look if there is something simple we can do without trying to monkey patch Markdown itself.

facelessuser commented 2 years ago

Yeah, we can break pygments with [`123`](`#!py3 123`) which will then perform highlighting which causes spans to be written:

<p><a href="<span class="mi">123</span>
"><code class="highlight">123
</code></a></p>

facelessuser / sublime-markdown-popups

Invalid html generated when parsing link #130