marshmallow-code / marshmallow

A lightweight library for converting complex objects to and from simple Python datatypes.
https://marshmallow.readthedocs.io/
MIT License
6.99k stars 624 forks source link

RFC: Improved API for registering field validator methods #367

Closed sloria closed 4 years ago

sloria commented 8 years ago

Status quo

Validator methods are registered by decorating them with @validates('my_field_name').

from marshmallow import fields, Schema, ValidationError, validates

class ItemSchema(Schema):
    quantity = fields.Int()

    @validates('quantity')
    def validate_quantity(self, value):
        if value > 30:
            raise ValidationError('Quantity must not be greater than 30')

Proposed API

Validator methods are registered by decorating them with @my_field.validator.

from marshmallow import fields, Schema, ValidationError

class ItemSchema(Schema):
    quantity = fields.Int()

    @quantity.validator
    def validate_quantity(self, value):
        if value > 30:
            raise ValidationError('Quantity must not be greater than 30')

Benefits

justanr commented 8 years ago

:+1: Less magic is always good.

jmcarp commented 8 years ago

Less prone to user error (no magic strings).

:guitar:

sloria commented 8 years ago

A sketch of this is implemented in #368.

immerrr commented 8 years ago

Less magic strings is good, so it seems a nice idea to replace @validates('quantity') with @validates(quantity).

I would however like to point out that schema-level validators may check how two or more fields interact with each other. A recently introduced feature allows skipping schema-level validation if the validation of the field itself failed, it can be generalized to multiple fields, too. There's not much sense in checking how two, say, integer parameters interact with each other if one of those is unexpectedly a dictionary. Current API could support that naturally with @validates([quantity, quality, speed]), but it is not that obvious to me how to do that with the proposed one.

taion commented 8 years ago

I think @immerrr's proposal provides a good middle ground between the current API and the proposal – it avoids the issue of using a string for no clear API reason, but sidesteps some of the problems noted with #368.

deckar01 commented 8 years ago

I was leaning towards @validates(quantity) for consistency, but @quantity.validator would simplify the implementation considerably.

A recently introduced feature allows skipping schema-level validation if the validation of the field itself failed ... Current API could support that naturally with @validates([quantity, quality, speed]) ...

@immerrr Since the @validates_schema decorator already exists, I'm not sure that overloading the @validates decorator with multiple fields is necessary. It would explicitly list field dependencies which would allow for more fine grained skipping, but the optimization of an optional validation feature is probably less important than usability of the primary field validators interface. If this functionality was desired it might be more appropriate to allow declaring field dependencies on the @validates_schema decorator.

@sloria For consistency, would it be feasible to replace the @validates_schema decorator with something like @ItemSchema.validator to match @quantity.validator?

taion commented 8 years ago

To make sure we're not missing anything here, do we know why SQLAlchemy uses e.g. @validates('email')? http://docs.sqlalchemy.org/en/latest/orm/mapped_attributes.html

deckar01 commented 8 years ago

@validates methods eventually get wrapped in the utils._validator_events() event handler.

Deferring attribute resolution to the validator seems to implicitly allow overriding inherited attributes without overriding their validators, which I'm not sure is desirable.

I have a suspicion that this API is an artifact of it being easier to pass the model instance and field name to the validator method than trying to resolve a model class's column to a model instance's column (which marshmallow already does).

For historical context, https://github.com/zzzeek/sqlalchemy/commit/3829b89 contains the original implementation of the @validates decorator. As far as I can tell, passing the attribute name was the way an example suggested doing it, then it was copied to a core feature.

taion commented 6 years ago

@sloria

Is this still desirable for v3.0, or is this too big a change?

In other words, by https://github.com/marshmallow-code/marshmallow/pull/368#issuecomment-294360133, do you mean that you think this is not something that should be done, or that the specific approach there was not a good investment?

I think the issues around inheritance could be adequately resolved with either the @field.validator or the @validates(field) (instead of @validates('field_name')) approach.

I ask because we're looking to start moving to Marshmallow v3 shortly, and would like to contribute code to get as many of the breaking changes out of the way first.

taion commented 6 years ago

@deckar01

To my earlier question, I think a more DX-y answer for why that API is "nice" is because the columns aren't nicely accessible when using inheritance.

It's a little awkward to write something like:

class Child(db.Model):
    @Parent.column.validator()
    def validate_column(self, value)
        # Do whatever.

Something like @immerrr's suggestion scans a bit better, but is still slightly clunky:

class Child(db.Model):
    @validates(Parent.column)
    def validate_column(self, value)
        # Do whatever.
sloria commented 6 years ago

@taion

In other words, by #368 (comment), do you mean that you think this is not something that should be done, or that the specific approach there was not a good investment?

I'm still open to adding that feature, but I don't consider a blocker to releasing 3.0 final, esp. because it can be done in a backwards-compatible way. We're more focused on getting the breaking changes done so that we can finally get 3.0 out the door.

taion commented 6 years ago

Is the proposed @validates(field) syntax acceptable, so as to keep things more consistent with how they work right now, while still improving DX?

sloria commented 6 years ago

@taion Yes, I think it's a fine solution.

deckar01 commented 6 years ago

@taion https://github.com/marshmallow-code/marshmallow/issues/367#issuecomment-388634605 seems to be an argument for not doing this at all. A schema subclass would modify the validation of its parent class's fields if validation registration was deferred to the field instances. That means we would need to keep the schema level field validation registry.

taion commented 6 years ago

@deckar01 In my reading of the proposal per @immerrr, the syntax above would just be a shorthand – we would read the field name or attribute, and use that instead of the string value. So in any event, instead of storing values on the field itself, we'd keep the current semantics of storing things on the method itself.

sloria commented 4 years ago

Discussed offline: it's not clear that the OP was a good idea. This API gets confusing when fields are inherited. validates('string') looks magic on the surface, but at least it's clear what its behavior is.

Closing this for now.