mahmoud / glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
https://glom.readthedocs.io

Match with callables is undocumented and inconsistent with the rest of glom #217

Open mathrick opened 3 years ago

mathrick commented 3 years ago

In Auto mode:

>>> glom(42, lambda x: 13)
13
>>> glom(42, Spec(lambda x: 13))
13
>>> glom(42, Invoke(lambda x: 13).specs(T))
13

In Match mode:

>>> glom(42, Match(lambda x: 13))
42
>>> glom(42, Match(Spec(lambda x: 13)))
42
>>> glom(42, Match(Invoke(lambda x: 13).specs(T)))
13

This is not currently documented. The Match docs say nothing about callables, although in practice they are treated specially: if the callable returns a truthy value, the target is accepted and returned as-is, and the return value is ignored. This makes it awkward and surprising to use callables to transform the value being matched, à la schema. It also represents a counter-intuitive departure from the way specs normally operate, and means that lambda x: ... and Invoke/Call operate very differently. It took me a while to figure out why the Match example with Val() would return a different value than the target while lambda x: ... stubbornly passed it through unchanged, and I only understood it after reading the source.

kurtbrose commented 3 years ago

We went back and forth a TON on the fine details of this behavior in Match.

Regarding schema -- this is actually the same behavior for callables. In schema you need to explicitly wrap your lambda with Use.

From the schema PyPI page:

>>> from schema import Schema, And, Use, Optional
>>> schema = Schema([{'name': And(str, len),
...                   'age':  And(Use(int), lambda n: 18 <= n <= 99),
...                   Optional('gender'): And(str, Use(str.lower),
...                                           lambda s: s in ('squid', 'kid'))}])

kurtbrose commented 3 years ago

To address Invoke vs lambda -- in general the idea is that explicit specs (e.g. Invoke) always have the same behavior regardless of mode. In some sense, Match as a mode is providing an alternative set of implicit behaviors (on dict, list, str, etc).

If we were to say that Invoke == lambda always, then that means modes would be unable to provide alternative definitions.

In this particular case, we've used schema before, liked it, and wanted to embed schema capabilities in glom, so wherever possible we provided the same API affordances. (In fact, I went through and replaced a bunch of schema with glom.Match at my day job; they are so close that it went very smoothly.)

kurtbrose commented 3 years ago

I don't want to be discouraging -- it is really cool to see someone engaging so deeply. Also, it is EXTREMELY helpful to hear where the docs fell over and led you astray. I think at a minimum this issue gives us a few places where we need to improve the docs.

Totally open to having a long discussion about Invoke vs lambda. I just bring up that we spent months and months talking this stuff to death, so there's unlikely to be a change in behavior. But maybe you've got a fresh perspective and are seeing something we're missing.

mathrick commented 3 years ago

That's true, it does match schema, but schema documents it, and also schema doesn't need to match the rest of glom :). I do see the value of being able to change the implicit behaviour, I'm just a bit concerned that it deviates without a clear need here. If I were to write a match mode from scratch, I'd probably have something like Assert(callable) for this, and leave plain lambda alone. But if it's at least documented and actually intended to be an almost drop-in replacement for schema, that does provide some context. The docs also don't make that clear; they just mention it should feel "familiar", not "basically identical" :)

BTW, I really like the Match mode, and I just ported my validation code from schema to Match. Originally I started with schema for validation and glom for manipulation, and then I discovered Match was added since the last time I used glom for work. It was remarkably easy to adapt and after I got all the definitions ported over, it worked immediately, which is great. I love the fact it's actually possible to have custom exceptions without much fuss; I had to create a fork of schema to be able to say "fail the validation immediately and also communicate to the caller that the whole object should be rejected". That glom also wraps it in a branching traceback without changing how except interacts with it is pure magic!

Also, since this is the second time I'm writing a validation-and-manipulation layer on top of glom for work, perhaps I should use this opportunity to kickstart a discussion on how to make it easy in glom. The basic idea I want to express is a sort of combined language to be able to say "ensure that this part of the input has that shape, and if it passes, transform it as follows and distribute the result into those parts of the output". Last time I wrote a custom pattern matcher and used Check, since that was before Match was a thing. The pattern matcher essentially flipped the way glom specs work around, much as Match does, but also interpreted paths as a description of where in the output the value should be placed (or SKIP if it wasn't interesting). Now Match makes the matching-validation part fairly easy, but it's still basically disjointed from transforming the input, so you still have to write a separate spec to distribute it into the shape you want the output to have. Does this sound like an interesting discussion to you? I realise it's far easier to describe the kind of "do what I mean" functionality than it is to specify how it's going to work in practice here, but that's exactly why I'd like to discuss it :)