tj / node-querystring

querystring parser for node and the browser - supporting nesting (used by Express, Connect, etc)
MIT License
455 stars 66 forks source link

Handle partially parsed objects (w/ querystrings as keys) #27

Open kainosnoema opened 12 years ago

kainosnoema commented 12 years ago

tl;dr

For multipart forms, particularly from node-formidable. This particular use case is important:

We can use mimetypes to partially parse data from node-formidable parts, before using qs to reassemble


// Raw POST body:
//
// --xxxxxxxx
// Content-Disposition: form-data; name="foo"
// Content-Type: application/json
// 
// {"bar": "baz"}
// --xxxxxxxx
// Content-Disposition: form-data; name="foo[qux]"
// Content-Type: image/jpeg
// 
// (... image data ...)
// --xxxxxxxx--

// Simplified example:

> console.log(qs.parse({
>   foo: JSON.parse("{\"bar\": \"baz\"}")
> , 'foo[qux]': [object Object] // JPEG data in Formidable.File object
> }));
>
{
  foo: {
    bar: "baz",
    qux: [object Object] // JPEG data in Formidable.File object
  }
}

Does we really need this?

Yes. The RestKit iOS framework (among others) can send multipart forms containing both application/json and attachment (ie. image/jpeg) data. Properly merging the parts back into a single params object is crucial. The ExpressJS body-parser got us most of the way there—although we did have to override handlePart() so we could parse the JSON based on the mimetype.

The final piece of the puzzle is getting qs to reassemble the parts from the querystring paths as expected. As I mentioned in the commit message, qs was doing this properly in some cases, but not all (depending on part order from node-formidable, which is apparently non-deterministic). This fix should handle most cases as expected.

kainosnoema commented 12 years ago

Ok, so I've found a weird edge case that I'd like to get your feedback on before you do anything with this:

Current behavior:

> console.log(qs.parse({ 'foo[items]': [], foo: { items: ['bar'] } }))
{ "foo": { "items": [ [ "bar" ] ] } }

> console.log(qs.parse({ foo: { items: ['bar'] }, 'foo[items]': [] }))
{ "foo": { "items": [ "bar", [] ] } }

Oddly enough, we actually ran into this situation with RestKit (starting to hate it). We were expecting a consistent output, regardless of order, but unfortunately there isn't one when you use the "rules" of querystrings: that repeated keys are pushed together into arrays. The output above actually makes sense given the rules—its just not what we expected or wanted.

One possible solution would be to use concat() instead of push() under the set() function, since with non-Array objects, concat() behaves similarly to push() (so we keep existing behavior for normal querystrings—all tests still pass). At least the result is consistent regardless of order, but again, this isn't standard querystring behavior:

Using concat() instead of push():

> console.log(qs.parse({ 'foo[items]': [], foo: { items: ['bar'] } }))
{ "foo": { "items": [ "bar" ] } }

> console.log(qs.parse({ foo: { items: ['bar'] }, 'foo[items]': [] }))
{ "foo": { "items": [ "bar" ] } }

To be honest, I don't really like either option—this whole thing seems really weird.

Thoughts?

tj commented 12 years ago

my vote is on the second I think