pydantic / pydantic-core

Core validation logic for pydantic written in rust
MIT License

Expose ability to use "JSON mode" in the call to `validate_python` #712

Open dmontagu opened 1 year ago

dmontagu commented 1 year ago

Copying from slack:

> On a related note: I think we should have a way to perform JSON-mode validation even when validating from python. Is that currently possible? Otherwise I don't think there's any way to do JSON-style validation of data transported via other protocols, short of doing something like `model_validate_json(json.dumps(loaded_cbor_data))`
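
For reference, the round-trip workaround mentioned above can be sketched roughly as follows. The helper name `validate_via_json_roundtrip` and its `validate_json` parameter are made up for illustration; only `json.dumps` and the idea of handing the resulting string to a JSON-mode validator come from the comment.

```python
import json
from typing import Any, Callable, TypeVar

T = TypeVar("T")


def validate_via_json_roundtrip(data: Any, validate_json: Callable[[str], T]) -> T:
    """Hypothetical helper: re-encode already-loaded data as JSON text so a
    JSON-mode validator can consume it.

    json.dumps raises TypeError for values with no JSON representation
    (for example bytes), and the extra encode/decode pass costs time, which
    is part of why this is only a workaround.
    """
    return validate_json(json.dumps(data))


# Stand-in validator for the demo: json.loads simply parses the text back.
# Tuples come back as lists because JSON only has arrays.
result = validate_via_json_roundtrip({"a": (1, 2)}, json.loads)
print(result)
```

With pydantic, the `validate_json` argument would presumably be something like `SomeModel.model_validate_json`, where `SomeModel` is whatever model is being validated.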

It looks like this isn't currently exposed in pydantic-core, but I'd like to expose it through a keyword argument `mode: Literal['python', 'json']` to `model_validate` which is passed directly through to `SchemaValidator.validate_python`. The mode is currently just hardcoded as `InputType::python` there, but I don't see any reason we couldn't make it user-specifiable.

In particular, I think this will be handy for doing strict-mode validation of data that didn't have its origin in python (e.g., data loaded from a .yml file or some other serialization format into python objects).

Selected Assignee: @davidhewitt

DanielRosenwasser commented 7 months ago

> In particular, I think this will be handy for doing strict-mode validation of data that didn't have its origin in python (e.g., data loaded from a .yml file or some other serialization format into python objects).

Yup - and also sometimes the way an application/library is structured just means that parsing happens earlier.

Is this something you all would take a PR for? Would it just be a new keyword param called mode?

davidhewitt commented 7 months ago

On reflection for this, I would like to suggest that instead of "JSON mode" we call this "raw mode", i.e. mode='raw', because it only takes a limited set of "raw" datatypes. Functionally it would work like JSON parsing currently does, I guess.

Not easy to implement, but I agree would be nice.

DanielRosenwasser commented 7 months ago

So one way I would try to approach this is to say that a set of Python types has a correspondence to JSON-equivalent values, and that values of these types cannot be converted:

~~Outdated~~

Python Type | Raw Category
------------|-------------
int | number
float | number
bool | boolean
str | string
None | null
dict | object
list | array

And under strict/raw, if a runtime type is in the list, see if the current validator expects a type that has an appropriate corresponding type in the list. If one exists, then no coercion may take place before a validator runs; otherwise non-strict coercions can take place.
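
A rough sketch of that rule, to make the intent concrete. Everything here is hypothetical: `RAW_CATEGORIES`, `raw_category`, and `coercion_allowed` are illustrative names rather than anything in pydantic-core, and the lookup uses exact type matches only, so subclasses are treated as non-raw for simplicity.

```python
from datetime import date
from typing import Any, Optional

# The table above as a lookup. Matching is by exact type, so bool does not
# fall through to int even though bool subclasses int in Python.
RAW_CATEGORIES = {
    int: "number",
    float: "number",
    bool: "boolean",
    str: "string",
    type(None): "null",
    dict: "object",
    list: "array",
}


def raw_category(tp: type) -> Optional[str]:
    """Return the raw category for a Python type, or None if it has none."""
    return RAW_CATEGORIES.get(tp)


def coercion_allowed(value: Any, expected_type: type) -> bool:
    """Hypothetical strict/raw rule: if both the runtime type of the value and
    the type the validator expects have a raw category, no coercion may take
    place; otherwise non-strict coercions are permitted."""
    value_is_raw = raw_category(type(value)) is not None
    expected_is_raw = raw_category(expected_type) is not None
    return not (value_is_raw and expected_is_raw)


# A str arriving at an int validator: both sides are raw, so no coercion.
print(coercion_allowed("123", int))
# A str arriving at a date validator: date has no raw category, so the
# usual lax str-to-date coercion could still run.
print(coercion_allowed("2024-01-01", date))
```

This only decides whether coercion is on the table; whether the value is then accepted (for example an `int` given to a `float` validator, both "number") is left to the validator itself.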

I know this glosses over a lot of details of how validators actually work (e.g. string validators that expect a precise data format).

Does that match your intuition?

DanielRosenwasser commented 7 months ago

> On reflection for this, I would like to suggest that instead of "JSON mode" we call this "raw mode", i.e. mode='raw', because it only takes a limited set of "raw" datatypes

For what it's worth, the current behavior really does assume something JSON-y as an input format.