dandi / dandi-schema

Schemata for DANDI archive project
Apache License 2.0

Use full information from `ValidationError` to form error messages #246

Open waxlamp opened 1 week ago

waxlamp commented 1 week ago

This gist demonstrates how Pydantic validation errors can indicate the exact value within the "tree" of values that caused a failure, its location, and a machine-readable error type. This information could be used to improve validation error messages immediately, just by including the value and location.
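As a minimal sketch of the short-term improvement, the helper below turns error dicts of the shape returned by Pydantic v2's `ValidationError.errors()` into one-line messages that include the location and the offending value. The function names and the sample error (field names, pattern, and input) are hypothetical and only illustrate the shape; this is not dandischema's existing code.

```python
from typing import Any, List, Mapping, Sequence


def format_error(err: Mapping[str, Any]) -> str:
    """Render one Pydantic-style error dict as a single line."""
    # "loc" is a sequence of field names, list indexes, and union tags.
    location = " -> ".join(str(part) for part in err["loc"])
    return f"{location}: {err['msg']} (type={err['type']}, input={err['input']!r})"


def format_errors(errors: Sequence[Mapping[str, Any]]) -> List[str]:
    """Format every error in the list, preserving order."""
    return [format_error(err) for err in errors]


# Illustrative error dict; in practice the list would come from
# `ValidationError.errors()`.
sample = [
    {
        "type": "string_pattern_mismatch",
        "loc": ("contributor", 0, "name"),
        "msg": "String should match pattern '^.+, .+$'",
        "input": "Jane Doe",
    }
]

messages = format_errors(sample)
for line in messages:
    print(line)
```

Inside an `except ValidationError as e:` block, the same helper could be called as `format_errors(e.errors())`.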

A longer-term improvement would be to use the machine-readable tag plus the field name to "translate" "nerd" messages ("This string failed to match this regex") into "end user" messages ("The name field of a Person contributor should be in the form of <family name>, <given name> (some examples: Doe, Jane; Picard, Jean-Luc; Hornblower, Horatio)").
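One possible shape for that translation, sketched under assumptions: a lookup table keyed by the Pydantic error type and the field name, falling back to Pydantic's own message when nothing is registered. `FRIENDLY_MESSAGES`, `field_name`, and `translate` are hypothetical names, and `string_pattern_mismatch` is the Pydantic v2 error type for a failed regex match. A fuller version would probably also key on the model (for example Person) so that different `name` fields can get different wording.

```python
from typing import Any, Dict, Mapping, Sequence, Tuple, Union

# Hypothetical lookup table: (Pydantic error type, field name) -> end-user text.
FRIENDLY_MESSAGES: Dict[Tuple[str, str], str] = {
    ("string_pattern_mismatch", "name"): (
        "The name field of a Person contributor should be in the form of "
        "<family name>, <given name> (some examples: Doe, Jane; "
        "Picard, Jean-Luc; Hornblower, Horatio)"
    ),
}


def field_name(loc: Sequence[Union[str, int]]) -> str:
    """Return the last string segment of loc, skipping list indexes."""
    for part in reversed(loc):
        if isinstance(part, str):
            return part
    return ""


def translate(err: Mapping[str, Any]) -> str:
    """Map an error dict to an end-user message when one is registered."""
    key = (err["type"], field_name(err["loc"]))
    # Fall back to Pydantic's own message when no translation exists.
    return FRIENDLY_MESSAGES.get(key, err["msg"])


# Illustrative error dicts in the shape of `ValidationError.errors()` items.
regex_error = {
    "type": "string_pattern_mismatch",
    "loc": ("contributor", 0, "name"),
    "msg": "String should match pattern '^.+, .+$'",
    "input": "Jane Doe",
}
missing_error = {
    "type": "missing",
    "loc": ("contributor", 0, "email"),
    "msg": "Field required",
    "input": {},
}

print(translate(regex_error))
print(translate(missing_error))
```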

I plan to send a proof-of-concept PR demonstrating one way dandischema could accomplish this.

This issue originated from https://github.com/dandi/dandi-archive/issues/713 and then more recently from https://github.com/dandi/dandi-archive/issues/1958. @mvandenburgh has filed #245 (and #244) to deal with Pydantic's default algorithm for validating Unions; this issue has more to do with improving error messages across the board.

candleindark commented 4 days ago

After experimenting with ValidationError for a bit, I found that it has a json() method that returns the errors as a JSON str. The location of each error is in its loc field.

from __future__ import annotations

from typing import Union, Optional

from pydantic import BaseModel, ValidationError, Discriminator, Tag
from typing_extensions import Annotated

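def model_x_discriminator(v) -> Optional[str]:
    # Return the Tag of the union member that matches the raw input.
    if isinstance(v, int):
        return "int"
    if isinstance(v, (dict, BaseModel)):
        return "Foo"
    # No matching tag: Pydantic reports a validation error for the union.
    return None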

class Foo(BaseModel):
    x: Annotated[
        Union[Annotated[Foo, Tag("Foo")], Annotated[int, Tag("int")]],
        Discriminator(model_x_discriminator),
    ]
    y: int

try:
    Foo.model_validate({"x": {"x": {"x": 1}, "y": 1}, "y": 0})
except ValidationError as e:
    print(e.json())
    """
    [{"type":"missing","loc":["x","Foo","x","Foo","y"],"msg":"Field required","input":{"x":1},"url":"https://errors.pydantic.dev/2.7/v/missing"}]
    """

Though the example above involves an uncommon use of a discriminated union, the json() method is available on every ValidationError.
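For code that only receives the JSON text, a small sketch of reading it back with the standard library follows. The `raw` string is the output printed by the example above; joining loc with dots is just one illustrative way to display it, and the variable names are hypothetical.

```python
import json

# JSON text as printed by e.json() in the example above.
raw = (
    '[{"type":"missing","loc":["x","Foo","x","Foo","y"],'
    '"msg":"Field required","input":{"x":1},'
    '"url":"https://errors.pydantic.dev/2.7/v/missing"}]'
)

errors = json.loads(raw)
first = errors[0]

# loc mixes field names ("x", "y") with union tags ("Foo").
path = ".".join(str(part) for part in first["loc"])
print(path, first["type"], first["input"])
```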

I hope this is useful.