HackSoftware / Django-Styleguide

Django styleguide used in HackSoft projects
MIT License

Idea: Generic data validator based on TypedDict #103

Closed bo5o closed 2 years ago

bo5o commented 2 years ago

Hi,

first of all, thank you for this great guide, it has helped me a lot.

Recently, I wanted to add type hints for the data returned by DRF serializers, which I use to validate input data in my API views. I wanted to keep using DRF serializers rather than switch to another library (such as pydantic).

Since the validated_data of a DRF serializer is a dictionary, I thought adding types in the form of Python's TypedDict would be a good idea. So I started adding type declarations for my serializers like this:

from rest_framework import serializers

class FooSerializer(serializers.Serializer):
    foo = serializers.IntegerField(min_value=1, max_value=10)
    bar = serializers.CharField(required=False)

from typing_extensions import TypedDict, NotRequired

class FooDict(TypedDict):
    foo: int
    bar: NotRequired[str]
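
As a side note that is not part of the original post: on Python versions without NotRequired (and without typing_extensions), roughly the same shape can be spelled with total=False inheritance. The sketch below uses only the standard library; `_FooRequired` is a hypothetical helper name. It also shows that the required/optional split is recorded on the class and can be inspected at runtime.

```python
from typing import TypedDict


class _FooRequired(TypedDict):
    # Keys declared here must always be present.
    foo: int


class FooDict(_FooRequired, total=False):
    # Keys declared under total=False may be omitted,
    # much like NotRequired[str] in the declaration above.
    bar: str


# TypedDict classes record which keys are required and which are optional.
required_keys = FooDict.__required_keys__
optional_keys = FooDict.__optional_keys__

# At runtime a TypedDict value is a plain dict; nothing is validated on construction.
example: FooDict = {"foo": 9}
```

Because nothing is enforced at runtime, the declaration is only as accurate as the person who wrote it.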

In my API views I cast the validated data of the serializer to the proper type:

from typing import cast
from rest_framework.response import Response
from rest_framework.views import APIView

class MyApi(APIView):
    def put(self, request):
        serializer = FooSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        validated_data = cast(FooDict, serializer.validated_data)
        # ... call services etc.
        return Response(...)

It works pretty well, but after some time I ran into problems. I often found slight differences between the types I had declared and the actual output of the serializer, especially for big, complex serializers, where it is easy to overlook a detail or miss a field in the type declaration. I wanted something that checks whether the output of the serializer is actually assignable to the type I had declared for it (e.g. during unit testing), so that differences can be spotted easily and early.
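
To make the problem concrete: typing.cast performs no check at runtime, so a wrong declaration goes unnoticed. The sketch below is a deliberately shallow, standard-library-only illustration and not a replacement for a dedicated library. `find_mismatches`, `PriceDict`, and the hand-written `validated_data` dict are hypothetical and stand in for real serializer output. The check only handles plain classes as annotations, not nested or generic types.

```python
from decimal import Decimal
from typing import Any, List, Mapping, TypedDict, cast, get_type_hints


class PriceDict(TypedDict):
    name: str
    price: float  # Declared as float by mistake.


def find_mismatches(data: Mapping[str, Any], typed_dict: type) -> List[str]:
    """Return a human-readable list of shallow differences (hypothetical helper)."""
    hints = get_type_hints(typed_dict)
    problems = []
    # Required keys that are absent from the data.
    for key in sorted(typed_dict.__required_keys__):
        if key not in data:
            problems.append(f"missing required key: {key}")
    # Keys that are undeclared or hold a value of the wrong class.
    for key, value in data.items():
        if key not in hints:
            problems.append(f"undeclared key: {key}")
        elif not isinstance(value, hints[key]):
            problems.append(
                f"{key}: expected {hints[key].__name__}, got {type(value).__name__}"
            )
    return problems


# Stand-in for serializer.validated_data; a DRF DecimalField yields Decimal values.
validated_data = {"name": "widget", "price": Decimal("9.99")}

typed = cast(PriceDict, validated_data)  # cast() accepts this silently.
problems = find_mismatches(validated_data, PriceDict)
```

Here `cast` returns the very same dict without complaint, while the explicit check reports that `price` holds a Decimal rather than the declared float.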

I found the trycast library, which provides the trycast and isassignable functions. They check at runtime whether an object is actually assignable/castable to the provided type. My API views now looked like this:

from typing import cast
from trycast import isassignable
from rest_framework.response import Response
from rest_framework.views import APIView

class MyApi(APIView):
    def put(self, request):
        serializer = FooSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)

        assert isassignable(
            serializer.validated_data, FooDict
        ), f"Validated data is not assignable to type {FooDict}"

        validated_data = cast(FooDict, serializer.validated_data)
        # ... call services etc.
        return Response(...)

Repeating this in many views gets tedious, so I ended up writing a generic data validator for it. Due to some limitations in the type system (see https://github.com/python/mypy/issues/9773), the implementation is a little hacky. I adapted the suggestion from the comment linked in the docstring and came up with the following:

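from typing import Generic, Mapping, TypeVar, cast

from rest_framework.serializers import Serializer
from trycast import isassignable

_TD = TypeVar("_TD", bound=Mapping)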

class DataValidator(Generic[_TD]):
    """Validate data using DRF serializer and cast validated data to a typed dict.

    Adapted from https://github.com/python/mypy/issues/9773#issuecomment-1058366102

    Parameters
    ----------
    serializer_class : rest_framework.serializers.Serializer
        DRF serializer to use for validation.

    Examples
    --------
    >>> class FooSerializer(serializers.Serializer):
    ...     foo = serializers.IntegerField(min_value=1, max_value=10)
    ...     bar = serializers.CharField(required=False)
    >>> class FooDict(TypedDict):
    ...     foo: int
    ...     bar: NotRequired[str]
    >>> validator = DataValidator[FooDict](FooSerializer)
    >>> validated_data = validator.validate({'foo': 9})
    """

    def __init__(self, serializer_class: type[Serializer]):
        # see class returned by __class_getitem__ for implementation
        pass

    def validate(self, data: dict) -> _TD:
        """Validate data.

        Returns the validated data properly typed.

        Raises
        ------
        rest_framework.serializers.ValidationError
            raised if validation failed
        AssertionError
            raised if output of serializer is not assignable to given type
        """
        # see class returned by __class_getitem__ for implementation
        raise NotImplementedError

    def __class_getitem__(cls, type_):
        """Return a class that has access to generic type at runtime."""

        class _DataValidator(DataValidator):
            _type = type_

            def __init__(self, serializer_class: type[Serializer]):
                self.serializer_class = serializer_class

            def validate(self, data: dict) -> _TD:
                serializer = self.serializer_class(data=data)
                serializer.is_valid(raise_exception=True)

                assert isassignable(
                    serializer.validated_data, self._type
                ), f"Validated data is not assignable to type {self._type}"

                return cast(_TD, serializer.validated_data)

        return _DataValidator
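
The subscription trick used above can be seen in isolation in the following sketch. It uses only the standard library, and the names `TypeCarrier`, `_Bound`, and `describe` are hypothetical. Overriding `__class_getitem__` makes `TypeCarrier[int]` return a subclass that stores `int` as a class attribute, so instances can read the type at runtime.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class TypeCarrier(Generic[T]):
    _type = None  # Filled in by the subclass created in __class_getitem__.

    def __class_getitem__(cls, type_):
        # Build a subclass that stores the subscripted type for runtime use.
        class _Bound(cls):
            _type = type_

        return _Bound

    def describe(self) -> str:
        # Only meaningful on instances created via TypeCarrier[SomeType]().
        return f"bound to {self._type.__name__}"


carrier = TypeCarrier[int]()
```

One consequence of this design is that every subscription builds a new class, so `TypeCarrier[int]` evaluated twice yields two distinct classes. If that matters, the validator can be created once at module level and reused.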

Finally, the API views look like this:

class MyApi(APIView):
    def put(self, request):
        data_validator = DataValidator[FooDict](FooSerializer)
        validated_data = data_validator.validate(request.data)
        # ... call services etc.
        return Response(...)

I thought I might leave this here for anybody else looking to add types to DRF serializers.

RadoRado commented 2 years ago

@cbows Thanks for sharing that!

I'll leave it open, until we find time to go into more details, in case someone finds this helpful.

Cheers