unshiftio / url-parse

Small footprint URL parser that works seamlessly across Node.js and browser environments.
http://unshift.io
MIT License

Incorrect handling of protocol-relative URLs in 1.5.x #219

Closed krassowski closed 2 years ago

krassowski commented 2 years ago
const URL = require('url-parse'); // assumed import: URL here is url-parse, not the WHATWG global
let url = new URL('//github.com');
url.toString();

results in ////github.com but it should result in //github.com.

offirgolan commented 2 years ago

+1 to this:

[Screenshot: Screen Shot 2021-11-29 at 3 55 09 PM]

FYI: I'm only seeing this issue in Node; it's working correctly in the browser.

lpinca commented 2 years ago

Should be fixed on master. See https://github.com/unshiftio/url-parse/commit/d9e332b3cee790b6852152b707a4e39c00945f8c. Unfortunately I can't publish a new version.

offirgolan commented 2 years ago

@lpinca does it make sense for the pathname to be //github.com/foo/bar? Shouldn't it be correctly parsed?

lpinca commented 2 years ago

@offirgolan that URL is invalid without a base URL. The WHATWG URL parser throws an error for it. The result is consistent with the legacy Node.js URL parser.
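For illustration, here is a minimal sketch of the WHATWG behavior mentioned above. It uses the global `URL` class built into Node.js and browsers, not the url-parse constructor from this issue. The `tryParse` helper and the `https://example.com/some/page` base are hypothetical names chosen for the example.

```javascript
// WHATWG URL global (built into Node.js and browsers), NOT url-parse.

// Hypothetical helper: returns the parsed href, or null when parsing throws.
function tryParse(input, base) {
  try {
    return new URL(input, base).href;
  } catch (err) {
    return null;
  }
}

// No base URL: a protocol-relative input cannot be resolved, so the parser throws.
const withoutBase = tryParse('//github.com');

// With a base URL (an assumed example value), the scheme is inherited from the base.
const withBase = tryParse('//github.com', 'https://example.com/some/page');

console.log(withoutBase); // null
console.log(withBase);    // https://github.com/
```

A protocol-relative URL only gets a host once there is a base to take the scheme from, which is why parsing it on its own is treated as invalid.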

offirgolan commented 2 years ago

Got it, thanks for the clarification. Do you happen to know how we can get a new release to be published?

lpinca commented 2 years ago

Try to ping @3rd-Eden on Twitter (https://twitter.com/3rdEden).

3rd-Eden commented 2 years ago

> Should be fixed on master. See d9e332b. Unfortunately I can't publish a new version.

I need to figure out a way to get stuff autopublished.

3rd-Eden commented 2 years ago

Closing this as I managed to publish this from my Windows machine.

Note to self: Using Linux commands in publish steps is bad practice.

krassowski commented 2 years ago

Thank you. It still looks a bit odd, because new URL('//github.com') is missing hostname and host, and has a somewhat strange pathname compared to new URL('https://github.com'):

> new URL('//github.com')
{
  slashes: true,
  protocol: '',
  hash: '',
  query: '',
  pathname: '//github.com',
  auth: '',
  host: '',
  port: '',
  hostname: '',
  password: '',
  username: '',
  origin: 'null',
  href: '//github.com'
}

compare to:

> new URL('https://github.com')
{
  slashes: true,
  protocol: 'https:',
  hash: '',
  query: '',
  pathname: '/',
  auth: '',
  host: 'github.com',
  port: '',
  hostname: 'github.com',
  password: '',
  username: '',
  origin: 'https://github.com',
  href: 'https://github.com/'
}

I would have expected that the only differences would be:

but instead there is a difference in pathname, missing hostname, null origin and lack of trailing slash for href.

Is this intended?

lpinca commented 2 years ago

@krassowski yes, this is expected; see my reply to @offirgolan above. //github.com is an invalid URL without a base URL, and the WHATWG URL parser throws an error in that case. This behavior is consistent with the legacy Node.js URL parser.
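As a rough sketch of the legacy behavior referred to here, the snippet below calls Node's built-in `url.parse()` (deprecated, and it may print a deprecation warning on recent Node versions). This is Node's own legacy parser, not url-parse; the variable names are just for illustration. The third argument is the documented `slashesDenoteHost` flag.

```javascript
// Legacy Node.js parser from the built-in 'url' module (deprecated API).
const { parse } = require('url');

// Default: the leading "//" is not treated as a host marker,
// so the whole input ends up in pathname and host stays null.
const plain = parse('//github.com');

// slashesDenoteHost = true (third argument): the token after "//" becomes the host.
const hostAware = parse('//github.com', false, true);

console.log(plain.host, plain.pathname); // null //github.com
console.log(hostAware.host);             // github.com
```

With the default settings the output matches the empty host and the `//github.com` pathname shown earlier in the thread.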