jmespath / jmespath.py

JMESPath is a query language for JSON.
http://jmespath.org
MIT License

Can't flatten sub-sub-lists? #92

Closed joelthompson closed 9 years ago

joelthompson commented 9 years ago

Let's say I have this list: [[1, 2, 3, [4]], [5, 6, 7, [8, 9]]]

I want to extract this output: [[1, 2, 3, 4], [5, 6, 7, 8, 9]]

It seems like this query should work: [*][], i.e., "project the input into a list, and flatten each element of the outer list", but it doesn't. I get [1, 2, 3, [4], 5, 6, 7, [8, 9]], which is the same as if I had just passed in []. Oddly, [*][0] does return what I would expect: [1, 5], which is the first element of each element of the outer list. Why is it that in the [*][0] expression the [0] operates on each element of the top-level list, while in [*][] the [] seems to operate on the list as a whole? I would expect that behavior out of [*] | []. Similarly, [0][] returns [1, 2, 3, 4] and [1][] returns [5, 6, 7, 8, 9].

I see the same behavior both on the released 0.7.1 and the current develop branch.

Thanks!

--Joel

joelthompson commented 9 years ago

Per the spec (emphasis added):

"A wildcard expression is a expression of either * or [*]. A wildcard expression can return multiple elements, and the remaining expressions are evaluated against each returned element from a wildcard expression. The [*] syntax applies to a list type and the * syntax applies to a hash type."

jamesls commented 9 years ago

I agree that it should be possible to do what you want, but there currently isn't a way, unless you specifically call out each element in the list, i.e. [[0][], [1][]]

What's going on here is that the precedence of the flatten operator [] is really low. Normally in a wildcard expression you'd have left-hand-side[*]right-hand-side, e.g. foo[*].bar. However, it's possible for either the LHS or the RHS to be empty: [*].bar and foo[*], which internally parse as Wildcard(empty, Field('bar')) and Wildcard(Field('foo'), empty) respectively. It's also possible for both the LHS and the RHS to be empty: [*] -> Wildcard(empty, empty).

With the flatten operator, what's happening is that both the RHS and LHS of the projection are being parsed as empty because the flatten operator binds with such low precedence, and the whole result is being flattened, i.e. Flatten(Wildcard(empty, empty)). In order to get what you want, we'd need to be able to parse to Wildcard(empty, Flatten()), which isn't possible right now. Perhaps there's some syntax we can introduce to make this possible.
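To make the difference between the two parse trees concrete, here is a minimal plain-Python sketch. It does not use the jmespath library or its real AST classes: flatten() and wildcard() are hypothetical stand-ins that only model how the [] and [*] nodes described above behave on lists.

```python
def flatten(value):
    """Model of the [] operator: merge one level of nested lists."""
    if not isinstance(value, list):
        return None
    result = []
    for item in value:
        if isinstance(item, list):
            result.extend(item)
        else:
            result.append(item)
    return result


def wildcard(value, rhs=None):
    """Model of a [*] projection: apply rhs to each element, dropping None results."""
    if not isinstance(value, list):
        return None
    if rhs is None:  # empty right-hand side: elements pass through unchanged
        return list(value)
    projected = (rhs(item) for item in value)
    return [item for item in projected if item is not None]


data = [[1, 2, 3, [4]], [5, 6, 7, [8, 9]]]

# How [*][] parses today: Flatten(Wildcard(empty, empty))
whole_result_flattened = flatten(wildcard(data))

# The parse that was wanted: Wildcard(empty, Flatten())
each_element_flattened = wildcard(data, rhs=flatten)

print(whole_result_flattened)  # [1, 2, 3, [4], 5, 6, 7, [8, 9]]
print(each_element_flattened)  # [[1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

The only difference is where the flatten step sits: outside the projection it sees the outer list once, inside it sees each inner list separately.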

btw, if you're curious, you can see the AST these expressions parse to using the jp.py --ast option, which is included when you pip install jmespath.

Marking as a feature request.

cc @mtdowling

jamesls commented 9 years ago

Proposed in https://github.com/jmespath/jmespath.site/issues/15, so if it's accepted, you could accomplish what you want via: map(&[], @).

jamesls commented 9 years ago

Now available via map()
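For anyone landing here later, a short sketch of the map() approach from Python. It assumes an installed jmespath release that includes map(); flatten_each() and flatten_once() are hypothetical helper names, and the plain-Python fallback exists only so the snippet still runs when the library is not installed.

```python
try:
    import jmespath  # third-party: pip install jmespath
except ImportError:
    jmespath = None  # fall back to plain Python so the sketch still runs


def flatten_once(value):
    """Plain-Python stand-in for applying [] to a single value."""
    if not isinstance(value, list):
        return None
    merged = []
    for item in value:
        if isinstance(item, list):
            merged.extend(item)
        else:
            merged.append(item)
    return merged


def flatten_each(data):
    """Flatten every element of the outer list (hypothetical helper)."""
    if jmespath is not None:
        # map(expr, list) evaluates the expression against each element of the list.
        return jmespath.search("map(&[], @)", data)
    return [flatten_once(element) for element in data]


data = [[1, 2, 3, [4]], [5, 6, 7, [8, 9]]]
print(flatten_each(data))  # [[1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

Here &[] passes the flatten expression as an argument, and @ is the current node (the whole input list), so the flatten is applied to each inner list rather than to the outer one.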

joelthompson commented 9 years ago

Great, thanks!