tj / node-querystring

querystring parser for node and the browser - supporting nesting (used by Express, Connect, etc)
MIT License
455 stars 66 forks source link

Implement nested objects under an array #6

Closed alimanfoo closed 13 years ago

alimanfoo commented 13 years ago

Hi, I'm new to gitub so apologies if this isn't the proper way to vote for a feature, but I noticed that the example:

var obj = qs.parse('users[][name][first]=tj&users[][name][last]=holowaychuk');
require('inspect')(obj)

...doesn't work as I'd hoped, and that the tests:

// 'test complex': function(){
//   qs.parse('users[][name][first]=tj&users[foo]=bar')
//     .should.eql({
//       users: [ { name: 'tj' }, { name: 'tobi' }, { foo: 'bar' }]
//     });
// 
//   qs.parse('users[][name][first]=tj&users[][name][first]=tobi')
//     .should.eql({
//       users: [ { name: 'tj' }, { name: 'tobi' }]
//     });
// }

...are commented out.

If you could support this kind of nesting of arrays and objects that would be great, we have to work with fairly complex forms with lots of repeating and nested sections, and full support for this would be really useful.

Cheers.

tj commented 13 years ago

it'll happen eventually, not a priority at the moment though I'm sorry to say

eleith commented 13 years ago

all the old tests pass as well as added new tests to explore the functionality of nested objects under arrays as well as nested arrays under objects

eleith commented 13 years ago

couldn't find a way in the github web GUI to attach a pull request to an issue, so used the json api (beta). instructions can be found here: http://develop.github.com/p/pulls.html

eleith commented 13 years ago

nope, didn't work. committed the wrong repo! doh. trying again...

eleith commented 13 years ago

nice. pull requested was maintained after nuking repo/fork and trying again.

eleith commented 13 years ago

understood.

i did this quickly for something i'm working on, now that it works, i am happy to clean it up/maintain your style if you can provide some suggestions.

eleith commented 13 years ago

better?

tj commented 13 years ago

yup :) looks better thanks man

dunkfordyce commented 13 years ago

this patch/fork doesnt handle following case it seems:

 'a[b][0][c]=1&a[b][]=2&a[b][]=3' ==> { a: { b: [{ c: 1 }, 2, 3 ] } }
eleith commented 13 years ago

dunkfordyce,

my patch does not take into account manually indicating numerical indices.

to get your desired about ouput, just use empty brackets:

'a[b][][c]=1&a[b][]=2&a[b][]=3' ==> { a: { b: [ { c: 1}, 2, 3,] }

whereas if you specify the index, this breaks things as you have noted:

'a[b][0][c]=1&a[b][]=2&a[b][]=3' ==> { a: { b: [ 2, 3,] } }

i didn't handle this case, because i was considering the case of submitting html forms.

when i want arrayed values to be submitted, i don't specify the name of form fields with indices but just use the empty brackets instead.

this does suggest that if you want things in a certain order, they need to appear in that particular order in the querystring itself.

dunkfordyce commented 13 years ago

oh! strange! its jquery that adds this '0':

decodeURIComponent($.param({ a: { b: [ { c: 1}, 2, 3,] }})) "a[b][0][c]=1&a[b][]=2&a[b][]=3"

eleith commented 13 years ago

dunkfordyce,

interesting, thanks for bringing that up. with how querystring works, it seems ambiguous when to use arrays or objects if you specify an index. i believe this makes for a good case to only use [] when you want an array and [x] when you want an object.

actually, if you always included an index, you get something that is a bit workable. for example:

"a[b][0][c]=1&a[b][1]=2&a[b][2]=3" ==> "a: {b: { 0: { c: 1 }, 1: 1, 2: 3 } }"

but if you specify a numerical index one time and then an empty index, the parser gets confused, because it has already begun working on adding to an object, so it throws away the object and begins working on the array. i'll keep my patch as is for now, as it seems to make the most sense when working with submitting html forms.

however, you did help me find another bug, and i have since updated it.

'a[b][][c]=1&a[b][]=2&a[b][]=3&a[b][][d]=4' ==> { a: { b: [ { c: 1}, 2, 3, { d: 4 }] }

this now works with my latest patch. so thank you for trying out new test cases!

tj commented 13 years ago

cool guys, appreciate the work on this. ideally IMO [] would always be present for an array but unfortunately it's out there so we gotta support it without.

eleith commented 13 years ago

i think the change to support [0] would be quite trivial if it was agreed this is the right thing to do...

tj commented 13 years ago

that's the problem i guess, tons of apps already just do what they do so we are kinda forced to support them. IMO items[]=foo&items[]=bar is an array but items=foo&items=bar is not for example but everyone does both so yeah

eleith commented 13 years ago

here are two cases two consider

items[name]=foo&items[0]=bar and items[0]=foo&items[]=foo. in one scenario, it would be expected that 0 is an index in a hash while the other scenario, the 0 is an index in an array.

what will get very confusing is when you have this scenario items[0]=foo&items[name]=bar, because as the parser sees this, '0' will indicate to start building up an array, but then when the parse sees 'name' it will throw away the array and replace it with an object.

jquery param will likely never do this however. so one path forward is to say that whatever is found first, indicates what the rest of the object will be. if you start out with [] or [0], you are working on an array, if you start out with [x] you are working on an object.

the caveat is that if you start by indicating you are using an array and then switch to an object, you'll lose the array (and visa versa).

thoughts? if the above statement is ok, i can add another tiny commit that will support this.

tj commented 13 years ago

we could just have properties on the array, which thankfully for us is perfectly valid, i do this with the express req.params, it's an array that just happens to have properties and sometimes indexed values when splats are given or a regexp with capture groups

dunkfordyce commented 13 years ago

does anyone think its worth submitting a bug report to jquery?

eleith commented 13 years ago

dunkfordyce, you're example should with the above patch.

basically, if the first time the parser sees the array as [0], it will assume it is building an array

if the parser sees any other named parameter, it will create an object.

however, once the parser sees a [], the current object it is gathering will be turned into an array.

this has the nice benefit of never throwing away key/value data when parsing.

i added new test cases to better visualize these rules.

dunkfordyce commented 13 years ago

eleith, thanks! i shall test it in "real" code later.

edit: actually i just looked at your commit instead of just reading my emails....

are you actually testing for '0]' because that will clearly break with the following for example:

>>> decodeURIComponent($.param({ a: { b: [ 2, { c: 1}, 3,] }}))
"a[b][]=2&a[b][1][c]=1&a[b][]=3"
eleith commented 13 years ago

dunkfordynce, the brief answer is, yes, but your example still will work.

the long answer is:

i do test for '0]' but not explicitly. the problem is 0 is originally seen as a named parameter, however, if the object has yet to be created, the parser will start out with an array rather than an object (which is why i not only test for '0]' but also that the current object being appended is empty)

if the object already exist, then 0 will become a named parameter (unless a '[]' is seen down the line).

take a look at the test cases i added to get a better idea of how the rules work for when an array is created, vs an object.

in the end, if you want an array, you want to use '[]' (which tells the parser either to append or create an array if it does not exist yet). [0] will allow for the creation of an array if it is the first time it is seen.

makes sense?

eleith commented 13 years ago

dunkfordyce,

i forgot to explain that i do look for numeric indices, such that if you have an array already, they will continue to push onto the array. this is the reason why your example will work.

sorry for the confusion, i thought you were asking something else at first.

alimanfoo commented 13 years ago

eleith,

I'm just trying to understand the following test...

qs.parse('users[][name][first]=tobi&users[][name][first]=tj&users[][name][last]=holowaychuk')
      .should.eql({
        users: [ { name: { first: 'tobi' } }, { name: { first: 'tj', last: 'holowaychuk' } } ]
      });

...am I right in understanding that the reason why a second object is added to the outermost array is because the 'first' property is repeated?

So, i.e., I could write a form like...

<input name="users[][name][first]" type="text"/>
<input name="users[][name][last]" type="text"/>
<input name="users[][name][first]" type="text"/>
<input name="users[][name][last]" type="text"/>

...and I should end up with a data structure like...

{ users: [ { name: { first: '...', last: '...' } }, { name: { first: '...', last: '...' } } ] }

...?

eleith commented 13 years ago

@alimanfoo, yes your understanding is correct.

once you repeat a property using [], that means a new object will be added to the array. if you did not want this, you should use numerical indices like [1] to specifically refer to where you want to append the object to.

is there a reason you closed the pull request? i was hoping it would get pulled into visionmedia's main branch...

tj commented 13 years ago

re-opened :)

alimanfoo commented 13 years ago

@eleith very clever, this is exactly what I was looking for, +1 to pull this into the main branch.

Sorry, I think I closed it by accident.

tj commented 13 years ago

sounds like it's about time to merge over and have a look

eleith commented 13 years ago

one thing i didn't test for is performance, so that might be worth looking into just to make sure things didn't get 10x slower =]

tj commented 13 years ago

might be a little slower, but still similar to the old node one that supported nesting:

    old simple: 474ms
    old nested: 1412ms

    new simple: 340ms
    new nested: 1381ms
tj commented 13 years ago

JSON.stringify(obj) == '{}' :p that's bad, I'll patch it up a bit

eleith commented 13 years ago

agh. you spoiled my party =[

tj commented 13 years ago

not sure i get this chunk ('0]' == part && JSON.stringify(obj) == '{}') looks pretty opinionated, and also if you remove the stringify part the tests still pass

tj commented 13 years ago

added the extended branch that we can work in, and reverted master to 0.1.0

eleith commented 13 years ago

i was testing exclusively for an empty hash, didn't want to continue on otherwise because i thought there was an edge case there...

eleith commented 13 years ago

let me clarify, the ('0]' == part && JSON.stringify(obj) == '{}') was to check if nothing had been created yet, then to assume we are building an array.

otherwise '0' could be just an index into a hash.

tj commented 13 years ago

ah, gotcha

eleith commented 13 years ago

jquery has an 'isEmptyObject' function:

function (a){for(var b in a)return!1;return!0}

tj commented 13 years ago

or Object.keys(obj).length, probably faster

eleith commented 13 years ago

beautiful