defunctzombie / node-url

node.js core url module as a module
MIT License

[Bug] Sub querystring #45

Closed Roriz closed 5 years ago

Roriz commented 5 years ago

Version

0.11.0

Link to minimal reproduction

https://runkit.com/roriz/url-bug-subquerysing

Steps to reproduce

  1. Parse a URL whose first query-string value itself contains an unencoded query string, for example:
    url.parse('http://example.com?a=http://example.com?b=1&c=1', true)

What is expected?

The outer query string is parsed, and the nested query string is kept whole as the value of a:

Url {
  protocol: 'http:',
  slashes: true,
  auth: null,
  host: 'example.com',
  port: null,
  hostname: 'example.com',
  hash: null,
  search: '?a=http://example.com?b=1&c=1',
  query: { a: 'http://example.com?b=1&c=1'},
  pathname: '/',
  path: '/?a=http://example.com?b=1&c=1',
  href: 'http://example.com/?a=http://example.com?b=1&c=1' }

What is actually happening?

The parser lifts a parameter from the inner query string (c) into the outer one:

Url {
  protocol: 'http:',
  slashes: true,
  auth: null,
  host: 'example.com',
  port: null,
  hostname: 'example.com',
  hash: null,
  search: '?a=http://example.com?b=1&c=1',
  query: { a: 'http://example.com?b=1', c: '1' },
  pathname: '/',
  path: '/?a=http://example.com?b=1&c=1',
  href: 'http://example.com/?a=http://example.com?b=1&c=1' }

Dru89 commented 5 years ago

Pretty sure this module is behaving correctly here.

Your query values should be encoded with encodeURIComponent.

Example:

const url = require('url')

// Use a separate name so the string does not shadow the url module.
const encoded = `https://example.com?a=${encodeURIComponent("http://example.com?b=1&c=1")}`
// encoded === "https://example.com?a=http%3A%2F%2Fexample.com%3Fb%3D1%26c%3D1"

url.parse(encoded, true)
/*
Url {
  protocol: 'https:',
  slashes: true,
  auth: null,
  host: 'example.com',
  port: null,
  hostname: 'example.com',
  hash: null,
  search: '?a=http%3A%2F%2Fexample.com%3Fb%3D1%26c%3D1',
  query: [Object: null prototype] { a: 'http://example.com?b=1&c=1' },
  pathname: '/',
  path: '/?a=http%3A%2F%2Fexample.com%3Fb%3D1%26c%3D1',
  href:
   'https://example.com/?a=http%3A%2F%2Fexample.com%3Fb%3D1%26c%3D1' }
*/

FWIW, this is how Node's URL module works too, because "subqueries" like this aren't supported by RFC 3986. Otherwise, there'd be no way to tell where the "subquery" ended and the "main query" began again.
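
To make the ambiguity concrete, here is a minimal sketch that compares the unencoded and encoded forms. It assumes Node's built-in WHATWG URL class rather than the url package from this issue, and the variable names (raw, nested, encoded) are only illustrative.

```javascript
// Assumption: Node's global WHATWG URL API, not the `url` package discussed above.

// Unencoded: the first literal '&' ends the value of `a`,
// so `c` is read as a top-level parameter.
const raw = new URL('http://example.com?a=http://example.com?b=1&c=1')
const rawQuery = Object.fromEntries(raw.searchParams)

// Encoded: '?', '=' and '&' inside the nested URL become %3F, %3D and %26,
// so the whole nested URL survives as the single value of `a`.
const nested = 'http://example.com?b=1&c=1'
const encoded = new URL(`http://example.com?a=${encodeURIComponent(nested)}`)
const encodedQuery = Object.fromEntries(encoded.searchParams)

console.log(rawQuery)     // two keys: a and c
console.log(encodedQuery) // one key: a
```

In the unencoded case nothing in the string marks where the nested query stops, so a parser can only split on every '&' it sees. Percent-encoding the value is what removes the ambiguity.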