ljharb / qs

A querystring parser with nesting support
BSD 3-Clause "New" or "Revised" License
8.47k stars 731 forks source link

Incorrect parsing of nested params with closing square bracket ] in property name #493

Open gronke opened 6 months ago

gronke commented 6 months ago

Search parameters with closing square bracket in nested property names are parsed incorrectly.

Here demonstrated with /?search[withbracket[]]=foobar:

> var qs = require("qs");
undefined
> input = { search: { "withbracket[]": "foobar" }}
{ search: { 'withbracket[]': 'foobar' } }
> output = qs.stringify(input, { encode: false })
'search[withbracket[]]=foobar'
> result = qs.parse(output)
{ 'search[withbracket': [ 'foobar' ] }
> assert.deepEqual(input, result)
Uncaught AssertionError [ERR_ASSERTION]: Expected values to be loosely deep-equal:

{
  search: {
    'withbracket[]': 'foobar'
  }
}

should loosely deep-equal

{
  'search[withbracket': [
    'foobar'
  ]
}
    at REPL280:1:8
    at Script.runInThisContext (node:vm:129:12)
    at REPLServer.defaultEval (node:repl:572:29)
    at bound (node:domain:433:15)
    at REPLServer.runBound [as eval] (node:domain:444:12)
    at REPLServer.onLine (node:repl:902:10)
    at REPLServer.emit (node:events:525:35)
    at REPLServer.emit (node:domain:489:12)
    at [_onLine] [as _onLine] (node:internal/readline/interface:425:12)
    at [_line] [as _line] (node:internal/readline/interface:886:18) {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: [Object],
  expected: [Object],
  operator: 'deepEqual'
}

For comparison the result with the native URL object:

> url = new URL("http://example.com/?search[withbracket[]]=foobar")
URL {
  href: 'http://example.com/?search[withbracket[]]=foobar',
  origin: 'http://example.com',
  protocol: 'http:',
  username: '',
  password: '',
  host: 'example.com',
  hostname: 'example.com',
  port: '',
  pathname: '/',
  search: '?search[withbracket[]]=foobar',
  searchParams: URLSearchParams { 'search[withbracket[]]' => 'foobar' },
  hash: ''
}
> [...url.searchParams.keys()]
[ 'search[withbracket[]]' ]

When parsing a query without braces, the result looks like:

> qs.parse("search[okay]=baz")
{ search: { okay: 'baz' } }

So it should be a nested object.

Current Result

{ "search[withbracket": ["foobar"] }

Expected Result

{ "search": { "withbracket[]": ["foobar"] }}
techouse commented 3 months ago

I recently came across this weird query string combination.

Since I also (attempt to) maintain a Python and a Dart port of this awesome library I attempted to fix it in Python and Dart respectively.

The Python fix https://github.com/techouse/qs_codec/pull/1 involved addressing the regular expression

brackets: re.Pattern[str] = re.compile(r"(\[[^[\]]*])")

which matches a left bracket [, followed by any number of characters that are not brackets, and then a right bracket ]. This is why it matches the innermost brackets and produces the same wrong result as stated above.

To match the outermost brackets, including nested brackets, we must use a recursive regular expression. However, Python's re module doesn't support recursive regular expressions. This is where I had to resort to the regex module instead to achieve this

from regex import regex

brackets: regex.Pattern[str] = regex.compile(r"\[(?:[^\[\]]|(?R))*\]")

This regular expression matches a left bracket [, followed by any number of characters that are not brackets or another matching pair of brackets, and then a right bracket ]. The (?R) is a recursive subpattern that matches the entire pattern.

In the Dart fix https://github.com/techouse/qs/pull/13 I used recursive_regex to achieve the same and get from

final RegExp brackets = RegExp(r'(\[[^[\]]*])');

to

import 'package:recursive_regex/recursive_regex.dart';

final RegExp brackets = RecursiveRegex(startDelimiter: '[', endDelimiter: ']');

I'm not sure how best to implement this fix here, since JavaScript does not provide the PCRE recursive parameter (?R), however, this StackOverflow answer suggests one could use XRegExp which supports recursive matching via a plugin.

Don't we all just love regular expressions 🙈

To be honest, this issue is quite fringe and a heavy dependency could be a dealbreaker for more people than are willing to accept the lack of this fix.

In my opinion, it's ultimately down to @ljharb to give the go-ahead on this or not.

Thoughts?

CC / @gronke

ljharb commented 3 months ago

It should definitely be fixed, but I’m not excited about relying on regular expressions for a non-regular DSL :-)