medialize / URI.js

Javascript URL mutation library
http://medialize.github.io/URI.js/
MIT License
6.26k stars 474 forks source link

Hostname detection fails for URLs starting with https:/// #365

Open konstantinblaesi opened 6 years ago

konstantinblaesi commented 6 years ago
const URI = require('urijs');

URI.preventInvalidHostname = false;
const url = URI('https:///cdn.shopify.com/s/files/1/0333/9621/files/forbiiden_banner_small.jpg?2662936180569856865');

url._parts looks like this:

{
  potocol: 'https',
  username: null,
  password: null,
  hostname: null,
  urn: null,
  port: null,
  path: '/cdn.shopify.com/s/files/1/0333/9621/files/forbiiden_banner_small.jpg',
  query: '2662936180569856865',
  fragment: null,
  preventInvalidHostname: false,
  duplicateQueryParameters: false,
  escapeQuerySpace: true
}

When setting URI.preventInvalidHostname = true;

Uncaught TypeError: Hostname cannot be empty, if protocol is https

is thrown. This was initially reported for simplecrawler/simplecrawler#415. Chrome handles the URL without issues. Do you think URI.js should be more liberal and fix the URL before parsing it?

rodneyrehm commented 6 years ago

URL, Firefox, Chrome and curl agree, they rewrite https:/// to https://. URL, Chrome and Firefox even rewrite https:////, but curl reports curl: (3) Bad URL.

The question is for which schemes/protocols this should apply, an obvious exception is file://.

Do you want to give this a try?