mahmoud / glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
https://glom.readthedocs.io

[s1, s2, s3, ...] as spec should be equivalent to [(s1, s2, s3, ...)] #220

Open mathrick opened 3 years ago

mathrick commented 3 years ago

As far as I can tell, the behaviour of a [...] spec (in Auto mode) with more than one element in the list is not documented anywhere, and it certainly isn't friendly. I keep bumping into this, and then spend a lot of time debugging before I realise what the issue is.

The current behaviour seems to be that for a spec of [s1, s2, s3, ...], anything beyond s1 is silently ignored and has no effect. That is both unexpected and very unfriendly to the user, because it makes the debugging experience mystifying, especially since Inspect() also gets ignored unless it happens to be the first element, i.e.:

from glom import glom, T, Inspect

target = [{'val': str(i**2)} for i in range(10)]

# For each element, get 'val', parse it as int, and get half of the value
glom(target, ['val', int, T / 2])
>>> ['0', '1', '4', '9', '16', '25', '36', '49', '64', '81']

# OK, let's try debugging it
glom(target, ['val', Inspect(), int, T / 2])
>>> ['0', '1', '4', '9', '16', '25', '36', '49', '64', '81'] # Why didn't Inspect() kick in?

glom(target, [Inspect(), 'val', int, T / 2])
    ---
    path:   [0, Path()]
    target: {'val': '0'}
    output: {'val': '0'}
    ---
...
>>> [{'val': '0'},
     {'val': '1'},
     {'val': '4'},
     {'val': '9'},
     {'val': '16'},
     {'val': '25'},
     {'val': '36'},
     {'val': '49'},
     {'val': '64'},
     {'val': '81'}]
# WTF, now the return value changed?

Is there a reason it isn't equivalent to [(s1, s2, s3, ...)], and could it be changed to be equivalent? If not, at the very least it should throw an error, but I really think it should be allowed and meaningful.
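In the meantime, the equivalence asked for above can be applied on the caller's side. The following is only a sketch: `chain_list_specs` is a hypothetical helper, not part of glom, and it assumes the spec is built from plain lists, tuples and dicts (it only walks dict values, and leaves any other spec object untouched).

```python
def chain_list_specs(spec):
    """Rewrite every [s1, s2, ...] inside a spec as [(s1, s2, ...)].

    Hypothetical helper, not part of glom. Only plain lists, tuples and
    dict values are walked; anything else is returned unchanged.
    """
    if isinstance(spec, list):
        items = [chain_list_specs(sub) for sub in spec]
        # A list with zero or one subspec already means what it says.
        if len(items) <= 1:
            return items
        # Wrap the steps in a tuple so they are chained per element.
        return [tuple(items)]
    if isinstance(spec, tuple):
        return tuple(chain_list_specs(sub) for sub in spec)
    if isinstance(spec, dict):
        return {key: chain_list_specs(sub) for key, sub in spec.items()}
    return spec
```

With this, `glom(target, chain_list_specs(['val', int, T / 2]))` is the same call as `glom(target, [('val', int, T / 2)])`.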

kurtbrose commented 2 years ago

great question :-) early on we couldn't decide what the correct behavior should be

agree, we should throw an error rather than silently ignoring the later items
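Until something like that exists in glom itself, the check can be approximated before a spec is handed to glom. This is a sketch under assumptions: `check_list_specs` is a hypothetical name, the error type and message are made up for illustration, and only plain lists, tuples and dict values are inspected.

```python
def check_list_specs(spec, path="spec"):
    """Raise ValueError if any list in the spec has more than one subspec.

    Hypothetical pre-flight check, not part of glom. Only plain lists,
    tuples and dict values are inspected.
    """
    if isinstance(spec, list):
        if len(spec) > 1:
            raise ValueError(
                f"{path}: list spec has {len(spec)} subspecs; "
                "wrap them in a tuple, e.g. [(s1, s2, ...)]"
            )
        for index, sub in enumerate(spec):
            check_list_specs(sub, f"{path}[{index}]")
    elif isinstance(spec, tuple):
        for index, sub in enumerate(spec):
            check_list_specs(sub, f"{path}[{index}]")
    elif isinstance(spec, dict):
        for key, sub in spec.items():
            check_list_specs(sub, f"{path}[{key!r}]")
```

Calling `check_list_specs(['val', int])` raises, while `check_list_specs([('val', int)])` returns quietly.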

one reason we don't want the behavior to be EXACTLY the same as [(s1, s2)]: if s2 evaluates to SKIP, the list iterator wouldn't see the SKIP

if we can talk through it more and convince ourselves that there isn't a better purpose for list-of-specs than tuple-like behavior, we can give it that behavior