showdownjs / showdown

A bidirectional Markdown to HTML to Markdown converter written in Javascript
http://www.showdownjs.com/
MIT License

helper.URLUtils providing inaccurate values in testing #939

Closed gdog2u closed 2 years ago

gdog2u commented 2 years ago

I am working on a PR that modifies the way links are parsed. Part of this requires checking the link's destination domain against the current domain. To do this, I am trying to use showdown.helper.URLUtils to parse the url value in Links.js > writeAnchorTag. After much time troubleshooting, I am seeing that host and hostname keep coming back empty.

Relevant code:

    const urlMeta = new showdown.helper.URLUtils(url);
    if (options.xfnRelAutoMe && (urlMeta.host === '' || urlMeta.host === showdown.helper.window.location.host)) {
      attributes.rel = urlMeta.pathname;
    }

As a means of debugging, I set the script to print pathname, and found that the whole link ends up there. Below is the result of attributes.rel = urlMeta.pathname:

     + expected - actual

      -<p><a href="/" rel="/">Home</a>
      -<a href="index.html" rel="index.html">Home</a>
      -<a href="/index.html" rel="/index.html">Home</a>
      -<a href="../index.html" rel="../index.html">Home</a>
      -<a href="/../index.html" rel="/../index.html">Home</a>
      -<a href="https://example.com/" rel="https://example.com/">Home</a>  // Should be rel="/"
      -<a href="https://google.com" rel="https://google.com">Google</a>     // Should be rel=""
      -<a href="//google.com" rel="">Google</a></p>

Of note, when I copy/paste the showdown.helper.URLUtils function into the browser console and run it there (e.g. showdown.helper.URLUtils('https://google.com')), I get the expected results.

Edit: Troubleshooting has shown that this is most likely because the parser has already escaped some characters by this point: https:// becomes https¨E58E//. This prevents URLUtils from parsing the url properly.
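
To illustrate why the escaped value no longer parses as an absolute URL, here is a minimal sketch. It uses the standard WHATWG URL class rather than showdown.helper.URLUtils, so the exact field values may differ from what showdown returns. The escapeColons function is a hypothetical stand-in for showdown's escaping step, and the base URL is an assumption.

```javascript
// Hypothetical stand-in for showdown's escaping: each ':' becomes
// '¨E' + char code + 'E' (58 is the char code of ':').
function escapeColons(url) {
  return url.replace(/:/g, (ch) => '¨E' + ch.charCodeAt(0) + 'E');
}

const base = 'https://example.com/'; // assumed "current page"
const cleanUrl = 'https://google.com';
const escapedUrl = escapeColons(cleanUrl); // 'https¨E58E//google.com'

// The clean string has a scheme, so it parses as an absolute URL.
const fromClean = new URL(cleanUrl, base);

// The escaped string has no ':' after the scheme, so it is treated as a
// relative path and resolved against the base instead.
const fromEscaped = new URL(escapedUrl, base);

console.log(fromClean.hostname);   // hostname of the link itself
console.log(fromEscaped.hostname); // hostname of the base, not the link
console.log(fromEscaped.pathname); // the whole escaped link ends up in the path
```

With the scheme separator gone, the parser has no way to tell the link is absolute, which matches the symptom of the full link showing up in pathname.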

gdog2u commented 2 years ago

The following line in writeAnchorTag() is responsible for changing url before the point where I was accessing it.

    url = url.replace(showdown.helper.regexes.asteriskDashTildeAndColon, showdown.helper.escapeCharactersCallback);

I added a variable to store a clean copy of the url.
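
A rough sketch of that approach, not the actual change in the PR: keep the unescaped value, escape as before for output, and parse only the clean copy. The names escapeColons and describeLink are hypothetical, and the standard URL class stands in for showdown.helper.URLUtils.

```javascript
// Hypothetical stand-in for the escaping step in writeAnchorTag().
function escapeColons(url) {
  return url.replace(/:/g, (ch) => '¨E' + ch.charCodeAt(0) + 'E');
}

// Hypothetical helper: reports whether a link points at the current site.
function describeLink(url, currentOrigin) {
  const cleanUrl = url;              // keep an unescaped copy first
  const escaped = escapeColons(url); // value that would be written to href
  const meta = new URL(cleanUrl, currentOrigin); // parse the clean copy only
  return {
    href: escaped,
    sameHost: meta.host === new URL(currentOrigin).host,
    pathname: meta.pathname,
  };
}

const internal = describeLink('https://example.com/', 'https://example.com');
const external = describeLink('https://google.com', 'https://example.com');

console.log(internal.sameHost, internal.pathname);
console.log(external.sameHost, external.href);
```

Because parsing reads from the clean copy, the order of the escaping step no longer affects the host comparison.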