pyeve / cerberus

Lightweight, extensible data validation library for Python
http://python-cerberus.org
ISC License

Cerberus 2: Proposal to make handlers 1st class 'citizens' / decouple rules etc. from validator classes #372

Closed funkyfuture closed 1 year ago

funkyfuture commented 6 years ago

This is a proposal for the next major release of Cerberus that will require users to refactor their custom validators as no backward compatibility is intended.

The current mechanics for extending a Validator, and more generally Cerberus' dispatching to the appropriate implementations of rules, ~type checks~ (see #374), default setters, validators and coercions, rely on the naming structure of the callables that are bound to a validator class. Here's an example that will be used as a reference in the following discussion:

import cerberus

class MyValidator(cerberus.Validator):

    def _normalize_coerce_bar(self, value):
        pass

    def _validate_foo(self, constraint, field, value):
        pass

    def _validate_validator_foo(self, field, value, error):
        pass

This is generally working well, but it has a few minor drawbacks.

As an alternative, I propose to decouple the handlers from the validators by making rule implementations etc. first-class 'citizens' that can then be used to assemble a validator's functionality:


@coercer()
def bar(value):
    pass

@validation_rule()
def foo(validator, constraint, field, value):
    pass

@validator(name='foo')
def foo_the_validator(validator, field, value):
    pass

class MyValidator(cerberus.Validator):
    wanted_handlers = (bar, foo, foo_the_validator)

This allows reuse of handlers without first having to wrap them in classes that are then used as mixins. Grouping of handlers can still be achieved with simple sequence types, but flexibility is added: a handler can easily be referenced in different groups.

One point that comes up repeatedly with the current design is the confusion between the prefixes _validate and _validate_validator. While the latter seems redundantly named, the two names are hardly distinct. This isn't completely solved in the proposal above, as both categories still stem from valid and hey, isn't this library all about it? Well, at least the entity that can be used as a rule is marked with that term.

The example also shows that handlers can be assigned with explicitly given names that ought to be used in schemas.

Admittedly this is a rare case, but currently one can't really remove such a handler in a custom validator. To achieve that, one would have to imitate the absence of the handler by overriding its implementation. Now, with functions that are first-class 'citizens', one can simply point at them:

class MyOtherValidator(MyValidator):
    unwanted_handlers = (foo_the_validator,)

Another thing that can be improved is the handling of a rule's dependencies, which currently must be defined statically for any validator (e.g. w/ Validator.priority_validations), decoupled from the rules themselves. Annotated functions can do better:


@validation_rule()
def bar(…):
    pass

@validation_rule(after=bar)
def foo(…):
    pass

Similarly, other aspects like mandatoriness can be annotated.
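
How such an `after` annotation could be turned into a processing order is not specified in this proposal. As a minimal sketch, assuming the decorator simply records its predecessors on the function (the attribute name `run_after` and the helper `processing_order` are made up for illustration), the standard library's `graphlib` (Python 3.9+) can do the sorting:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def validation_rule(after=()):
    """Hypothetical decorator: records which rules must run before this one."""
    # Accept a single function as well as a sequence of functions.
    predecessors = (after,) if callable(after) else tuple(after)

    def decorate(func):
        func.run_after = predecessors  # assumed attribute name
        return func

    return decorate


def processing_order(rules):
    """Return the rules sorted so that every rule follows its predecessors."""
    graph = {rule: rule.run_after for rule in rules}
    return list(TopologicalSorter(graph).static_order())


@validation_rule()
def bar(validator, constraint, field, value):
    pass


@validation_rule(after=bar)
def foo(validator, constraint, field, value):
    pass


order = processing_order([foo, bar])
print([rule.__name__ for rule in order])  # ['bar', 'foo']
```

A circular dependency between rules would surface as a `graphlib.CycleError` when the order is computed, which could be reported when the validator class is assembled.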

Another annotation would be useful for rules, coercers and validators that are intended for certain types only:

string_type = TypeDefinition('string', (typing.AnyStr,), ())

@coercer(type_filter=string_type)
def tokenize(value):
    return value.split(' ')

@validation_rule(type_filter=string_type)
def regex(validator, constraint, value):
    if not re.match(constraint, value):
        validator.error(…)
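
What should happen when a value does not match the filter is left open here. The following sketch assumes that a filtered coercer just passes non-matching values through untouched. It uses `str` instead of `typing.AnyStr` because `isinstance` does not accept a type variable, and the `matches` helper is hypothetical:

```python
import functools
from collections import namedtuple

# Same shape as the TypeDefinition used above: a name plus tuples of
# included and excluded Python types.
TypeDefinition = namedtuple("TypeDefinition", "name included_types excluded_types")

string_type = TypeDefinition("string", (str,), ())


def matches(type_definition, value):
    """Hypothetical helper: does the value pass the type definition?"""
    return isinstance(value, type_definition.included_types) and not isinstance(
        value, type_definition.excluded_types
    )


def coercer(type_filter=None):
    """Hypothetical decorator: only coerce values that pass the type filter."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(value):
            if type_filter is not None and not matches(type_filter, value):
                return value  # assumption: non-matching values pass through
            return func(value)

        wrapper.type_filter = type_filter
        return wrapper

    return decorate


@coercer(type_filter=string_type)
def tokenize(value):
    return value.split(" ")


print(tokenize("a b c"))  # ['a', 'b', 'c']
print(tokenize(42))       # 42
```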

Finally, a user can use all editor/IDE-fanciness when writing schemas for a rule's constraint:

@validation_rule(constraint_schema={'type': 'string'})
def regex(…):
    pass

Currently a user must write these annotations into docstrings that are eventually parsed to dictionaries by Cerberus.

The implementation will take advantage of the fact that functions are objects, so arbitrary data can be bound to them as attributes, and will further rely on metaclasses, not much different than now.
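
To make that idea concrete, here is a rough, self-contained sketch. It is not Cerberus code: the attribute names `handler_kind` and `handler_name`, the `ValidatorMeta` metaclass and the stand-in `Validator` base class are assumptions for illustration only. Decorators tag plain functions, and a metaclass assembles each class's handlers from `wanted_handlers` and `unwanted_handlers`:

```python
def _handler(kind):
    """Build a decorator factory that tags a function as a handler of `kind`."""

    def factory(name=None):
        def decorate(func):
            func.handler_kind = kind  # assumed attribute names
            func.handler_name = name or func.__name__
            return func

        return decorate

    return factory


coercer = _handler("coercer")
validation_rule = _handler("rule")
validator = _handler("validator")


class ValidatorMeta(type):
    """Collect handlers from `wanted_handlers`, minus `unwanted_handlers`."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        handlers = {}
        # Walk the MRO from the most basic class to the most derived one.
        for klass in reversed(cls.__mro__):
            for func in vars(klass).get("unwanted_handlers", ()):
                handlers.pop((func.handler_kind, func.handler_name), None)
            for func in vars(klass).get("wanted_handlers", ()):
                handlers[(func.handler_kind, func.handler_name)] = func
        cls.handlers = handlers
        return cls


class Validator(metaclass=ValidatorMeta):  # stand-in for cerberus.Validator
    pass


@coercer()
def bar(value):
    return value


@validation_rule()
def foo(validator, constraint, field, value):
    pass


@validator(name="foo")
def foo_the_validator(validator, field, value):
    pass


class MyValidator(Validator):
    wanted_handlers = (bar, foo, foo_the_validator)


class MyOtherValidator(MyValidator):
    unwanted_handlers = (foo_the_validator,)


print(sorted(MyValidator.handlers))
print(sorted(MyOtherValidator.handlers))
```

Keying the handlers by kind and name lets a rule and a validator share the name `foo` without clashing, which is what the explicit `name='foo'` in the example relies on.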

Paradigmatically this shifts validators from implementing containers to rather describing containers (while still implementing the core).

A 2to3-like tool that dumps a source file with the transformed methods of a Validator seems not too complicated, but may also result in more effort than some find and replace action.
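
As a sketch of the first step of such a tool, the standard library's `ast` module can locate the methods that would need transforming. The mapping from method prefixes to decorator names below follows the examples in this proposal and is otherwise an assumption. The sketch only reports candidates and does not rewrite any source:

```python
import ast

# Method-name prefixes of the current handlers and the hypothetical decorator
# each one might map to. Longer prefixes come first so that
# `_validate_validator_` wins over `_validate_`.
PREFIXES = (
    ("_validate_validator_", "validator"),
    ("_normalize_coerce_", "coercer"),
    ("_validate_", "validation_rule"),
)


def find_handlers(source):
    """Yield (class name, method name, decorator, handler name) tuples."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for item in node.body:
            if not isinstance(item, ast.FunctionDef):
                continue
            for prefix, decorator in PREFIXES:
                if item.name.startswith(prefix):
                    yield node.name, item.name, decorator, item.name[len(prefix):]
                    break


SOURCE = '''
class MyValidator(Validator):
    def _normalize_coerce_bar(self, value):
        pass

    def _validate_foo(self, constraint, field, value):
        pass

    def _validate_validator_foo(self, field, value):
        pass

    def helper(self):
        pass
'''

found = list(find_handlers(SOURCE))
for class_name, method, decorator, name in found:
    print(f"{class_name}.{method} -> @{decorator}(name={name!r})")
```

Note that this does not check whether a class actually derives from a Validator, so a real tool would need a stricter filter.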


1st addendum: it should be investigated whether declaring different handlers for the same rule name, to be applied in different processing phases, is a viable option. a use case at hand would be the readonly rule, hence that investigation must turn out positive for the whole story to succeed.

nicolaiarocci commented 6 years ago

Nice design, but I wonder if it would be worth the effort. The new API would totally diverge from the current one, making Cerberus 2 a different product. I wonder how many users would be happy to refactor so heavily. My guess is that most of them would just stay with 1.x. The old adage "if something works fine, leave it alone" would probably apply here.

Feedback from current users would be valuable here. Anyone?

funkyfuture commented 6 years ago

> The new API would totally diverge from current one, making Cerberus 2 a different product.

i wouldn't say that Cerberus would significantly differ by that change. the effort needed to adjust the extended validators wouldn't be much different than the one one had to apply to the changed type checks from 0.x to 1.x.

manually it would require these steps:

as mentioned, automating this seems to be a viable option.

of course, whether the necessary upgrading effort is acceptable (and whether a tool for that would be advisable) can ultimately only be judged with a working implementation.

anyway, it's clear that Cerberus 1.x can and should be supported for a while with bugfixes and there shall be no necessity to upgrade to 2.x.

> Feedback from current users would be valuable here. Anyone?

shall we call out to the community for feedback regarding the roadmap once i have finished detailing the remaining proposals? you on twitter, me on reddit?

jacek-jablonski commented 5 years ago

Is it still considered to be implemented soon?

funkyfuture commented 5 years ago

no, it isn't even decided yet whether such a feature will be merged into master (but i'm still interested in implementing it as a proof-of-concept), and certainly not soonish. input from your experience regarding any aspect of such decoupling is certainly welcome.

funkyfuture commented 5 years ago

i'm adding the 2.0 milestone here while the inclusion of a proven implementation of the proposal remains undecided (but we have to decide before a 2.0 release).

funkyfuture commented 1 year ago

closing this issue as there are currently no intentions to pursue a next major release.