zloirock / core-js

Standard Library
MIT License

URL punycode differs from nodejs / chrome behaviour #1223

Open amardeep opened 1 year ago

amardeep commented 1 year ago

Consider the following URL, which contains non-ASCII characters: https://𝚍𝚒𝚜𝚌𝚘𝚛𝚍.gg

When parsing this URL for its hostname, both Node.js and Chrome return the ASCII string discord.gg, but core-js returns xn--ci2hbbs5ase.gg

Here is the code:

import configurator from 'core-js-pure/configurator.js';

configurator({
    // By default polyfills are not used if they are available natively.
    usePolyfill: ['URL'], // Override that behaviour for URL.
});

import URL from 'core-js-pure/web/url.js'; // For URL

const url = new URL('https://𝚍𝚒𝚜𝚌𝚘𝚛𝚍.gg');
console.log(url.hostname);

zloirock commented 1 year ago

Yes, I can confirm it. The punycode logic in the core-js URL polyfill is not perfect, and I'm not sure that a complete, acceptable fix is possible.

I won't be able to work on this issue for a few days, so if anyone wants to pick it up before then, feel free.
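
For context, the URL Standard converts hosts to ASCII through UTS #46, which maps characters before punycode encoding is applied. The sketch below approximates that mapping step with NFKC normalization plus lowercasing, which is enough to fold the mathematical monospace letters from the report into plain ASCII. It is not the full UTS #46 mapping table and not how core-js, Node.js, or Chrome implement it; `mapLabel` is a hypothetical helper name.

```javascript
// Rough approximation of the UTS #46 mapping step (assumption: NFKC plus
// lowercasing stands in for the real mapping table, which is larger).
// `mapLabel` is a hypothetical helper, not a core-js API.
function mapLabel(label) {
  return label.normalize('NFKC').toLowerCase();
}

// The label from the report, spelled with MATHEMATICAL MONOSPACE SMALL letters.
const fancy = '\u{1D68D}\u{1D692}\u{1D69C}\u{1D68C}\u{1D698}\u{1D69B}\u{1D68D}';

const mapped = mapLabel(fancy);
console.log(mapped); // 'discord'

// After mapping, the label is pure ASCII, so no punycode encoding is needed.
console.log(/^[\x00-\x7F]*$/.test(mapped)); // true
```

A polyfill that skips this step and encodes the raw code points directly would produce an xn-- label instead, which matches the behaviour reported above.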

tasawar-hussain commented 1 year ago

@zloirock This looks interesting. I can start looking into it if you haven't already.

zloirock commented 1 year ago

@tasawar-hussain 👍

ehoogeveen-medweb commented 1 year ago

I don't know if it would be useful (as it is written in C++), but Node.js recently switched to ada for URL parsing, which in turn uses idna for converting between Unicode and ASCII.

Maybe some inspiration could be taken from their utf32_to_punycode implementation, which seems relatively short and free of dependencies (though obviously JS doesn't start from UTF-32).
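
To illustrate the UTF-32 point: a JS string is UTF-16, but iterating it yields whole code points, so a punycode encoder can work on a code point array much as a UTF-32 implementation would. The following is a minimal sketch of the RFC 3492 encoding procedure, not a port of ada's code and not the core-js implementation. It omits overflow checks and does no UTS #46 mapping; `punycodeEncode`, `adapt`, and `digitToChar` are names chosen for this example.

```javascript
// RFC 3492 bootstring parameters for punycode.
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Digit values 0..25 map to 'a'..'z', 26..35 map to '0'..'9'.
const digitToChar = (d) => String.fromCharCode(d < 26 ? 97 + d : 22 + d);

// Bias adaptation function from RFC 3492, section 6.1.
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : delta >> 1;
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - TMIN) * TMAX) >> 1) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Encodes one label (without the 'xn--' prefix). No overflow checks: sketch only.
function punycodeEncode(input) {
  // Array.from iterates by code point, so surrogate pairs become single entries.
  const codePoints = Array.from(input, (ch) => ch.codePointAt(0));
  const basic = codePoints.filter((cp) => cp < 0x80);
  let output = String.fromCharCode(...basic);
  let handled = basic.length;
  if (handled > 0) output += '-';
  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (handled < codePoints.length) {
    // Smallest code point not yet handled.
    const m = Math.min(...codePoints.filter((cp) => cp >= n));
    delta += (m - n) * (handled + 1);
    n = m;
    for (const cp of codePoints) {
      if (cp < n) delta++;
      if (cp === n) {
        // Emit delta as a generalized variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, handled + 1, handled === basic.length);
        delta = 0;
        handled++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

console.log(punycodeEncode('bücher')); // 'bcher-kva'
// The unmapped monospace label from this issue:
console.log(punycodeEncode('\u{1D68D}\u{1D692}\u{1D69C}\u{1D68C}\u{1D698}\u{1D69B}\u{1D68D}')); // 'ci2hbbs5ase'
```

The second call reproduces the label core-js returns, which suggests the encoding itself is consistent with RFC 3492 and that the difference from Node.js and Chrome lies in what happens to the input before encoding.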

iTsingchen commented 1 month ago

I'm trying to use pdfjs on an older version of Chrome. pdfjs has a piece of code that checks whether the worker src has the same origin as the page. When a blob URL is used as the worker src, the check returns false. Here is an example.

https://github.com/mozilla/pdf.js/blob/63371eaed8326f1ba4d4cdf6a1360a9333bd0bcf/src/display/api.js#L2029-L2041

      this._isSameOrigin = (baseUrl, otherUrl) => {
        let base;
        try {
          base = new URL(baseUrl);
          if (!base.origin || base.origin === "null") {
            return false; // non-HTTP url
          }
        } catch {
          return false;
        }
        const other = new URL(otherUrl, base);
        return base.origin === other.origin;
      };
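
As a point of comparison, here is a small sketch of how the same check behaves with a spec-compliant native `URL`, where the origin of a `blob:` URL is taken from the URL embedded in its path. The page and worker URLs are made up for illustration, and `isSameOrigin` is a standalone copy of the logic above rather than the pdf.js method itself.

```javascript
// Hypothetical URLs, for illustration only.
const pageUrl = 'https://example.com/viewer/index.html';
const workerUrl = 'blob:https://example.com/0f1e2d3c-4b5a-6978-8a9b-0c1d2e3f4a5b';

// Standalone copy of the same-origin check shown above.
const isSameOrigin = (baseUrl, otherUrl) => {
  let base;
  try {
    base = new URL(baseUrl);
    if (!base.origin || base.origin === 'null') {
      return false; // opaque origin, e.g. a data: URL
    }
  } catch {
    return false;
  }
  const other = new URL(otherUrl, base);
  return base.origin === other.origin;
};

// With a native URL implementation, the blob URL inherits the embedded origin.
console.log(new URL(workerUrl).origin); // 'https://example.com'
console.log(isSameOrigin(pageUrl, workerUrl)); // true
```

If a URL implementation reports a different origin for the blob URL, the final comparison fails even though the worker was created by the same page.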

https://stackblitz.com/edit/vitejs-vite-kekzvf?embed=1&file=url.js

zloirock commented 1 month ago

@iTsingchen could you create a separate issue?