rmorshea opened this issue 6 years ago
Hmm, this is interesting. How would it work for disparate types, such as a list and a dict, or an int and a string?
@skorokithakis I think the composition of schema types could be handled in three cases:

1. `compose`
2. `.function(*args)`
3. `And` (by default)

@skorokithakis are you planning to support Python 2.7 in future releases?
That's up to @keleshev, although my preference would be to stop supporting it sooner rather than later. About this PR, I'm worried that the result would be a bit too hard for people to understand. For example, there's nothing symmetric about chaining schemas of iterables together and merging keys in dictionaries (not to mention that dictionaries are iterables too)...
@skorokithakis for clarity, when I referred to "iterables" I meant it in the way that `_priority` defines it. To be more precise I'll call them collections now. Likewise I'll refer to dictionaries as mappings. When I talk about these classifications I mean them to be exclusive (i.e. an object is a mapping or a collection, but not both).
back to business...
I would agree that the handling of collections is up for debate. I'm not sure whether they should be merged into one collection, or whether they should be passed into a `join_collections` function which would merge them in whatever way the user decides (where the default behavior would be the former).
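That default (merge everything into one collection) could be sketched in a few lines. Note that `join_collections` is a name from this discussion, not an existing `schema` library function, and plain lists stand in for collection schemas here:

```python
from functools import reduce
from operator import add

def join_collections(collections):
    # Default behavior: concatenate all collections into a single list.
    return reduce(add, collections, [])

assert join_collections([[int], [str, float]]) == [int, str, float]
```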
With that said, I think the following behaviors are relatively intuitive:

1. mappings have their keys merged
2. everything else is passed to a `join(*schemas)` function which defaults to `And`.
Hmm, yes, it's certainly better, but I'm worried about the lack of consistency between 1 and 2 (they basically do completely different things). Also, 2 is rather more convoluted than what we have now, where you can just `And` two schemas anyway, so the main benefit of this is a function that merges the keys in N collections.
That does seem useful, but I'm worried that it's possibly not useful enough to have as a core piece of the library... What are your thoughts on this, @rmorshea? I'm not entirely certain myself.
@skorokithakis I’m pretty confident that the ability to recursively merge keys is important for building more complex systems of validation.
For example, consider my present use case...
I must validate JSON responses from a server. All of the possible responses have nested data. Furthermore, all the responses share a common set of nested fields. Currently there is no way to create a base schema which can be extended to cover all the possible response cases.
My particular use case seems like it would be pretty common.
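To make the use case concrete, here is a hedged sketch with invented field names, using plain dicts as stand-ins for `Schema({...})` objects: without a merge facility, every response type has to restate the shared fields by hand.

```python
# Invented field names for illustration; plain dicts stand in for
# Schema({...}) objects from the schema library.
common = {"status": str, "meta": {"request_id": int}}

# Without a way to extend a base schema, each response schema must
# repeat the common fields manually:
user_response = {**common, "data": {"name": str}}
item_response = {**common, "data": {"price": float}}

assert user_response["status"] is str
assert item_response["meta"] == {"request_id": int}
```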
Hmm, true. Maybe a better approach would be a way to "include" a Schema collection's keys into another Schema?
I don’t think that would work in my use case:
I have a common field “data” which I know is a string. However in my extension I would like to be able to specify that this field is a string of a particular form. To accomplish this I would want to merge the common type specification and my custom validator under an And operator.
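A minimal sketch of that merge, with hypothetical validators and a small `and_` helper mimicking what wrapping both rules in the library's `And(...)` would achieve (the names `common`, `custom`, and `and_` are invented for illustration):

```python
import re

# Base rule: the "data" field is a string.
common = lambda v: isinstance(v, str)
# Extension: the string must have a particular form (here, lowercase).
custom = lambda v: isinstance(v, str) and re.fullmatch(r"[a-z ]+", v) is not None

def and_(*validators):
    # A value is valid only if every merged validator accepts it.
    return lambda v: all(check(v) for check in validators)

data_field = and_(common, custom)
assert data_field("hello world")
assert not data_field("Hello World")
```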
I’ll see if I can come up with some specific example when I get home.
Ah, I see what you mean. Yes, what you are describing is a specific extension of the schema, which I agree is valuable, but I don't think the `compose` method is the best way to do it... What you are describing isn't just a straightforward way to compose two schemas; it also contains a rather opinionated method for doing that, out of all the alternatives. I wonder if there's a lower-level way we could achieve the same result with more flexibility...
So long as users have the ability to customize the logic, I don't think there's much harm in having opinionated default behavior, provided that default is intuitive.
I'm also not really sure what you mean by "lower-level". Could you give an example, or describe this further?
In the end, I need to be able to use something like `compose`, otherwise I'll have to use a library like `marshmallow`, because it enables this kind of extension/composition via inheritance. I would prefer to use `schema` though, and I think that `compose` would simplify much of what I would otherwise have to do with `marshmallow`.
By "lower-level" do you mean that users ought to have "finer" control over the merging behavior?
Or are you imagining that this logic could be embedded more deeply, such that you could simply add schemas together?
```python
s1 = Schema({"a": {"b": int}})
s2 = Schema({"a": {"c": int}})
s1_and_s2 = s1 + s2
```
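A `+` like that would need a recursive key merge underneath. Here is a minimal sketch over plain dicts, an assumption about how it might work rather than the schema library's actual internals:

```python
def merge_schemas(a, b):
    """Recursively merge mapping b into a copy of mapping a; when both
    sides hold a dict under the same key, recurse instead of replacing."""
    out = dict(a)
    for key, val in b.items():
        if key in out and isinstance(out[key], dict) and isinstance(val, dict):
            out[key] = merge_schemas(out[key], val)
        else:
            out[key] = val
    return out

assert merge_schemas({"a": {"b": int}}, {"a": {"c": int}}) == {"a": {"b": int, "c": int}}
```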
@skorokithakis I think the following solution is pretty clever.
What if schema composition were handled in two cases:

1. mappings are merged by `compose`, with conflicting keys resolved based on order passed to `compose`.
2. any remaining conflicts are resolved with an optional `reduce=<function or validator>` keyword: a plain function is called as `reduce(schemas)`, where `schemas` is a list of the conflicting schemas, while a validator is applied via its `validate(schema)` attribute.

The default behavior of 2 is not opinionated, and the optional reducer is infinitely extensible. It also makes it possible for the `schema` library to develop builtin reducers as people discover useful patterns and suggest that they be added.
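The two cases could be sketched in plain Python, with dicts standing in for `Schema` mappings; `compose` and `first_wins` are hypothetical names for the proposal, not existing library functions:

```python
from collections.abc import Mapping

def first_wins(schemas):
    # Default resolution "based on order passed to compose": the
    # earliest schema takes precedence.
    return schemas[0]

def compose(*schemas, reduce=first_wins):
    """Sketch of the proposal: recursively merge mapping schemas (case 1);
    hand any non-mapping conflict to the reducer (case 2)."""
    if all(isinstance(s, Mapping) for s in schemas):
        grouped = {}
        for s in schemas:
            for key, val in s.items():
                grouped.setdefault(key, []).append(val)
        return {key: vals[0] if len(vals) == 1 else compose(*vals, reduce=reduce)
                for key, vals in grouped.items()}
    return reduce(schemas)

assert compose({"b": float}, {"a": int, "b": int}) == {"a": int, "b": float}
```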
```python
s1 = Schema({"a": int, "b": int})
s2 = Schema({"b": float})
s3 = compose(s2, s1)  # note choice of order
assert s3.is_valid({"a": 1, "b": 2.0})
```
```python
s1 = Schema(str)
s2 = Schema(lambda s: s.lower() == s)
and_reducer = Use(lambda schemas: And(*schemas))
s3 = compose(s1, s2, reduce=and_reducer)
assert s3.validate("hello world!")
assert not s3.is_valid("Hello World!")
```
```python
import functools, operator

s1 = Schema({"a": [int], "b": str})
s2 = Schema({"a": [float], "b": lambda s: s.lower() == s})
join_lists = lambda schemas: functools.reduce(operator.add, schemas, [])
list_reducer = And([list], Use(join_lists))
reducer = Or(list_reducer, and_reducer)
s3 = compose(s1, s2, reduce=reducer)
assert s3.is_valid({"a": [1, 2.0], "b": "hello world"})
```
There's probably a way to make developing reducers easier, but this seems really powerful!
Hey guys, are we doing this? Would the implementation allow for nested schemas as well (that's the feature I need)?
Sorry, I just now noticed that I haven't replied to this. I will address this shortly.
> By "lower-level" do you mean that users ought to have "finer" control over the merging behavior?
@rmorshea Yes, basically I am worried that composing right now gives no control to the user; it is a set of pre-written rules for how things will be composed, and that's it. If they don't fit the user's use case, there isn't much they can do about it.
> There's probably a way to make developing reducers easier, but this seems really powerful!
This does seem powerful, I like it! I think it's nearly there; my only worry is that we need to break down the rules a bit further. For example, what happens if we're composing conflicting schemas, e.g. two mappings with overlapping keys (do we `b.update(a)`)?
We can possibly just throw errors for most of these cases, or pick the first, or let the user specify a composer; I'm just looking to better understand how this would work.
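The candidate policies can be spelled out over plain dicts (stand-ins for `Schema` mappings; the function names here are hypothetical):

```python
a = {"x": int, "y": str}
b = {"x": float}

def compose_strict(a, b):
    # Policy 1: refuse to guess and raise on any shared key.
    clash = a.keys() & b.keys()
    if clash:
        raise ValueError(f"conflicting keys: {clash}")
    return {**a, **b}

def compose_first_wins(a, b):
    # Policy 2: the first schema takes precedence, i.e. b.update(a).
    merged = dict(b)
    merged.update(a)
    return merged

assert compose_first_wins(a, b) == {"x": int, "y": str}
```

A user-supplied composer, as discussed above, would simply replace either policy with an arbitrary callable.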
It would be great if it were possible to compose two schemas together, in one form or another.
If such a proposal is reasonable and possible I am willing to create a PR. Suggestions are welcome!