pydantic / pydantic-core

Core validation logic for pydantic written in rust
MIT License

Allow validators to keep state during validation #432

Open adriangb opened 1 year ago

adriangb commented 1 year ago

From https://github.com/pydantic/pydantic-core/pull/430#issue-1619909885

Currently we do a first pass in strict mode for unions and then a pass in lax mode. If we moved forward with something like #430 we'd be adding a third pass.

To avoid the cost of re-doing work we already did, we should let validators pass some sort of "state" or "token" up to their caller. The caller can then call back into them with that thing and they can resume from where they left off.

For example, a list validator might know that it made it up to the 100th item in strict mode, so it can just start off from the 100th item in lax mode.

A string validator might know that strict mode and lax mode are the same as far as it's concerned, so if its input failed the first pass in strict mode, it doesn't need to try again in lax mode; it can just fail fast.

One way to approach this would be to have:

trait Validator {
  type ValStateToken;
  fn validate(..., state: Option<Self::ValStateToken>) -> (Option<Self::ValStateToken>, ValResult<...>);
}

And then validators that don't use this can be:

impl Validator for SomeStatelessValidator {
    type ValStateToken = ();
    ...
}

Or:

struct ValidatedIndex { idx: usize, strict: bool }

struct ListValidator {}

impl Validator for ListValidator {
    type ValStateToken = ValidatedIndex;
    ...
}
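
To make the resume idea concrete, here is a minimal, self-contained sketch of the associated-type approach. It is not pydantic-core code: `Value`, `check_item`, `strict_then_lax`, and the exact `validate` signature are assumptions made up for illustration, and validation only checks items rather than producing output. The token is assumed to be used against the same input it was produced from.

```rust
// Illustrative sketch only: `Value`, `check_item`, `strict_then_lax`, and the
// exact signatures are assumptions, not pydantic-core's API.

#[derive(Debug)]
pub enum Value {
    Int(i64),
    Str(String),
}

pub trait Validator {
    /// Resume token passed back to the caller; `()` for stateless validators.
    type ValStateToken;

    /// Returns an optional resume token together with the validation result.
    fn validate(
        &self,
        input: &[Value],
        strict: bool,
        state: Option<Self::ValStateToken>,
    ) -> (Option<Self::ValStateToken>, Result<(), String>);
}

/// Records how far a previous pass got, and in which mode.
#[derive(Debug, PartialEq)]
pub struct ValidatedIndex {
    pub idx: usize,
    pub strict: bool,
}

pub struct ListValidator;

/// Strict mode accepts only integers; lax mode also accepts numeric strings.
fn check_item(item: &Value, strict: bool) -> bool {
    match item {
        Value::Int(_) => true,
        Value::Str(s) => !strict && s.parse::<i64>().is_ok(),
    }
}

impl Validator for ListValidator {
    type ValStateToken = ValidatedIndex;

    fn validate(
        &self,
        input: &[Value],
        strict: bool,
        state: Option<ValidatedIndex>,
    ) -> (Option<ValidatedIndex>, Result<(), String>) {
        // Items that passed in strict mode also pass in lax mode, so a strict
        // token is always reusable; a lax token is only reusable in lax mode.
        let start = match state {
            Some(token) if token.strict || !strict => token.idx,
            _ => 0,
        };
        for (idx, item) in input.iter().enumerate().skip(start) {
            if !check_item(item, strict) {
                let token = ValidatedIndex { idx, strict };
                return (Some(token), Err(format!("item {idx} is not a valid int")));
            }
        }
        (None, Ok(()))
    }
}

/// Runs a strict pass, then resumes in lax mode from the returned token.
pub fn strict_then_lax(v: &ListValidator, input: &[Value]) -> Result<(), String> {
    let (token, result) = v.validate(input, true, None);
    if result.is_ok() {
        return result;
    }
    v.validate(input, false, token).1
}
```

With an input like `[1, 2, "3", 4]`, the strict pass stops at index 2 and hands back a token for that index, so the lax pass only has to look at the last two items.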

adriangb commented 1 year ago

Unfortunately this won't be as simple as adding a generic. Since we're using a big old enum with enum_dispatch, the trait can't have any generics or associated types. I think the only thing we can do here is add another level of indirection:

BuildValidator.build() -> Validator.prepare() -> StatefulValidator.validate()

or

BuildValidator.build() -> Validator.copy() -> Validator.validate()

Validators that don't need any state could just return Self from prepare()/copy().
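
A rough sketch of what the `prepare()` variant might look like, assuming a second enum holds the per-validation state so that neither enum needs generics or associated types. Every name below (`Value`, `CombinedValidator`, `StatefulValidator`, `ListState`, `check_item`, `items_checked`) is hypothetical, and a hand-written `match` stands in for enum_dispatch to keep the example self-contained.

```rust
// Illustrative sketch only: all names here are hypothetical, and a
// hand-written `match` stands in for `enum_dispatch`.

#[derive(Debug)]
pub enum Value {
    Int(i64),
    Str(String),
    List(Vec<Value>),
}

/// Stateless: strict and lax mode behave the same for this toy validator.
#[derive(Clone)]
pub struct StrValidator;

pub struct ListValidator;

/// Per-validation state for a list of ints.
pub struct ListState {
    /// Number of leading items already known to pass in strict mode.
    strict_ok: usize,
    /// Total item checks performed, kept only to make the savings visible.
    checked: usize,
}

/// Built once from the schema; no generics or associated types needed.
pub enum CombinedValidator {
    Str(StrValidator),
    List(ListValidator),
}

/// Returned by `prepare()` and used for a single validation call.
pub enum StatefulValidator {
    Str(StrValidator),
    List(ListState),
}

impl CombinedValidator {
    pub fn prepare(&self) -> StatefulValidator {
        match self {
            // A stateless validator just hands back a copy of itself.
            CombinedValidator::Str(v) => StatefulValidator::Str(v.clone()),
            CombinedValidator::List(_) => {
                StatefulValidator::List(ListState { strict_ok: 0, checked: 0 })
            }
        }
    }
}

/// Strict mode accepts only integers; lax mode also accepts numeric strings.
fn check_item(item: &Value, strict: bool) -> bool {
    match item {
        Value::Int(_) => true,
        Value::Str(s) => !strict && s.parse::<i64>().is_ok(),
        Value::List(_) => false,
    }
}

impl StatefulValidator {
    pub fn validate(&mut self, input: &Value, strict: bool) -> Result<(), String> {
        match (self, input) {
            (StatefulValidator::Str(_), Value::Str(_)) => Ok(()),
            (StatefulValidator::Str(_), _) => Err("expected a string".to_string()),
            (StatefulValidator::List(state), Value::List(items)) => {
                // Skip the prefix that an earlier strict pass already accepted.
                for (idx, item) in items.iter().enumerate().skip(state.strict_ok) {
                    state.checked += 1;
                    if !check_item(item, strict) {
                        return Err(format!("item {idx} is not a valid int"));
                    }
                    if strict {
                        state.strict_ok = idx + 1;
                    }
                }
                Ok(())
            }
            (StatefulValidator::List(_), _) => Err("expected a list".to_string()),
        }
    }

    /// Number of item checks performed so far (always 0 for stateless variants).
    pub fn items_checked(&self) -> usize {
        match self {
            StatefulValidator::Str(_) => 0,
            StatefulValidator::List(state) => state.checked,
        }
    }
}
```

Here the state lives inside the object returned by `prepare()` instead of being passed up to the caller as a token, so the caller simply calls `validate` again in lax mode on the same `StatefulValidator`. The trade-off in this sketch is one extra allocation-free construction per validation call.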