FelixTheC / strongtyping

Decorator which checks whether the function is called with the correct type of parameters.
https://pypi.org/project/strongtyping/

Subclass checks #94

Open hjalmarlucius opened 2 years ago

hjalmarlucius commented 2 years ago

It would be great if there were a function to check whether defaultdict[str, float] is a subclass of dict[str, float]. Use case: descriptors that ensure (in __set_name__) that the typing of child classes is consistent with their parent classes under the Liskov substitution principle.
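As a rough illustration of why a dedicated helper is needed (a sketch, not part of strongtyping): the built-in issubclass rejects parameterized generics, so the check has to be split into the origin classes and their type arguments. The helper name plain_issubclass_works is made up for this example.

```python
from collections import defaultdict
from typing import Any, get_args, get_origin

child = defaultdict[str, float]
parent = dict[str, float]


def plain_issubclass_works(child_hint: Any, parent_hint: Any) -> bool:
    """Return False if the built-in issubclass refuses these hints."""
    try:
        issubclass(child_hint, parent_hint)
    except TypeError:
        # Parameterized generics such as dict[str, float] are rejected here.
        return False
    return True


# Decompose each hint into its runtime class and its type arguments.
origins_compatible = issubclass(get_origin(child), get_origin(parent))
args_match = get_args(child) == get_args(parent)
```

Comparing the arguments with == is the simplest possible rule; a fuller check would recurse into each argument pair.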

FelixTheC commented 2 years ago

Could you please provide a test scenario for a better/more detailed understanding?

hjalmarlucius commented 2 years ago

An attempted solution using Python 3.10, though there are surely some holes:

from typing import get_args
from typing import get_origin
from typing import Any
from typing import Union

def check_subclass(child_type: Any, parent_type: Any) -> bool:
    if parent_type is Any or parent_type is child_type:
        return True
    parent_origin = get_origin(parent_type)
    child_origin = get_origin(child_type)
    if parent_origin is None:
        return issubclass(child_origin or child_type, parent_type)
    parent_args = get_args(parent_type)
    child_args = get_args(child_type)
    if parent_origin is Union:
        if child_origin is Union:
            return all(
                any(check_subclass(ch, pa) for pa in parent_args) for ch in child_args
            )
        return any(check_subclass(child_type, parent_arg) for parent_arg in parent_args)
    if child_origin is None:
        return False
    if not issubclass(child_origin, parent_origin):
        return False
    try:
        for childarg, parentarg in zip(child_args, parent_args, strict=True):
            if not check_subclass(childarg, parentarg):
                return False
    except ValueError:
        return False
    return True
hjalmarlucius commented 2 years ago

I prefer to build code using lightweight libraries like yours rather than more feature-rich systems like pydantic and typeguard. What I really want is a drop-in replacement for issubclass and isinstance that is compatible with mypy.

I've built a lightweight, pydantic-like descriptor that validates attributes, e.g.:

from numbers import Real
from numbers import Complex
from typing import Optional
from typing import Union
from hjarl.simulator import Entity
from hjarl.simulator import constant

class X(Entity):
    x = constant(dict[int, str])
    y = constant(Optional[int])
    z = constant(Union[Real, str])

class Y(X):
    x = constant(dict[int, str])
    y = constant(int)
    z = constant(Union[Complex, str])
#  ^ error on z since Complex is not compatible with Real
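To spell out why z fails while y passes: in the numbers tower, Real is a subclass of Complex, not the other way round, so Union[Complex, str] widens the parent's Union[Real, str], whereas int narrows Optional[int]. A small sketch with an illustrative helper named narrows (not part of any library):

```python
from numbers import Complex, Real


def narrows(child: type, parent: type) -> bool:
    """Illustrative helper: a child type is acceptable if it is a subclass."""
    return issubclass(child, parent)


real_fits_complex = narrows(Real, Complex)   # every Real is a Complex
complex_fits_real = narrows(Complex, Real)   # the reverse does not hold
int_fits_real = narrows(int, Real)           # int is registered in the tower
```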
FelixTheC commented 2 years ago

So we need a separate decorator or a new option/parameter for @match_class_typing? :thinking:

hjalmarlucius commented 2 years ago

I wasn't really thinking of the decorator (I only used the utils functionality), just a function that can check subclassing (and isinstance) from type hints reasonably well. I haven't found any library that solves this fairly simple problem without also adding a ton of complexity (pydantic in particular).

FelixTheC commented 2 years ago

This is only an example, without any tests or polish:

from numbers import Real, Complex
from typing import Optional, Union

from strongtyping.types import Entity, Constant

class X(Entity):
    x: Constant[dict[int, str]]
    y: Constant[Optional[float]]
    z: Constant[Union[Real, str]]

class Y(X):
    x: Constant[dict[int, str]]
    y: Constant[int]
    z: Constant[Union[Complex, str]]

This will raise at class-definition time: AttributeError: strong_typing_utils.Constant[typing.Union[numbers.Complex, str]] does not match with same Attribute in 'X'

Will this solve the issue in the way you expect?

The overhead is quite small. Here is the current preview, which is not ready for release:

class Entity:

    def __init_subclass__(cls, **kwargs):
        if cls.__mro__[1] == Entity:
            return cls
        else:
            sub_cls = cls.__mro__[0]
            for parent in cls.__mro__[1:]:
                parent_annotations = parent.__annotations__
                for key, val in sub_cls.__annotations__.items():
                    if parent_val := parent_annotations.get(key):
                        if val != parent_val:
                            raise AttributeError(f"{val} does not match with same Attribute in {parent.__name__!r}")

                if parent.__class__ is Entity.__class__:
                    break
        return cls

This will be called later and does the checks:

def check_typing_type(arg_typ, other_typ, *args, **kwargs):
    arg_origins = get_origins(arg_typ)
    other_origins = get_origins(other_typ)
    if arg_origins != other_origins:
        if 'Optional' in arg_origins or 'Optional' in other_origins:
            check = True
            possible_args = get_possible_types(arg_typ) or (arg_typ,)
            possible_other = get_possible_types(other_typ) or (other_typ,)

            for arg, other in zip_longest(possible_args, possible_other):
                if arg is not None and other is not None:
                    if arg is not other and other in ORIGINAL_DUCK_TYPES:
                        check = other in ORIGINAL_DUCK_TYPES[arg]
                        if not check:
                            break
                    else:
                        check = arg is other
                        if not check:
                            break
            return check
        return False
    else:
        check = True
        possible_args = get_possible_types(arg_typ) or (arg_typ,)
        possible_other = get_possible_types(other_typ) or (other_typ,)
        for arg, other in zip_longest(possible_args, possible_other):
            try:
                check = issubclass(arg, other)
            except TypeError:
                # continue with nested values
                pass
            else:
                if not check:
                    break
    return check
FelixTheC commented 2 years ago

@hjalmarlucius what do you think about it?

hjalmarlucius commented 2 years ago

I only reviewed the interface but think it's great - very similar to what I've built on my private system as well. However, I'm using descriptors instead of type hints since I felt that hints required a more hacky solution whereas descriptors had a very nice interface via __set_name__.
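A minimal sketch of the descriptor approach, to make the __set_name__ idea concrete. The class typed_constant is hypothetical (it is neither hjarl's constant nor anything in strongtyping) and it only handles plain classes, not generics or unions.

```python
class typed_constant:
    """Hypothetical descriptor that records a type and checks it against parents."""

    def __init__(self, hint: type) -> None:
        self.hint = hint
        self.name = ""

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name
        # Walk the parents of the class being created and compare hints.
        for parent in owner.__mro__[1:]:
            inherited = vars(parent).get(name)
            if isinstance(inherited, typed_constant) and not issubclass(
                self.hint, inherited.hint
            ):
                # Before Python 3.12 this surfaces wrapped in a RuntimeError.
                raise TypeError(
                    f"{owner.__name__}.{name}: {self.hint!r} is not a "
                    f"subclass of {inherited.hint!r} from {parent.__name__!r}"
                )


class Parent:
    size = typed_constant(int)


class GoodChild(Parent):
    size = typed_constant(bool)  # bool is a subclass of int, so this passes
```

Declaring size = typed_constant(str) in another subclass of Parent would fail while the class body is being processed, which is the early feedback the descriptor interface offers.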

A question: why do you match against the string representations instead of the types themselves? See my own check_type and check_subclass (both probably have some holes; I put them together long ago but haven't tested them extensively):

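from collections import defaultdict
from collections.abc import Callable
from collections.abc import Mapping
from types import UnionType
from typing import Any
from typing import Literal
from typing import TypeVar
from typing import Union
from typing import get_args
from typing import get_origin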

custom_typechecks: dict[Any, Callable[[Any, tuple[Any, ...]], bool]] = {}

def check_type(tgt: Any, cls: Any) -> bool:
    if cls is Any:
        return True
    if cls is None:
        return tgt is None
    if cls is type:
        return type(tgt) is type
    args = tuple(arg for arg in get_args(cls) if not isinstance(arg, TypeVar))
    if (origin := get_origin(cls)) is None:
        return isinstance(tgt, cls)
    if origin in (Union, UnionType):
        return any(check_type(tgt, arg) for arg in args)
    if origin is Literal:
        return tgt in args
    if origin is type:
        return type(tgt) is type
    if not isinstance(tgt, origin):
        return False
    if (customcheck := custom_typechecks.get(origin, None)) is not None:
        return customcheck(tgt, args)
    if origin in (dict, defaultdict, Mapping):
        assert len(args) == 2
        keytype, valuetype = args
        return all(
            check_type(k, keytype) and check_type(v, valuetype) for k, v in tgt.items()
        )
    if origin is tuple:
        if Ellipsis in args:
            assert len(args) == 2
            valuetype, _ = args
            return all(check_type(v, valuetype) for v in tgt)
        try:
            return all(check_type(v, t) for v, t in zip(tgt, args, strict=True))
        except ValueError:
            return False
    if origin in (list, set, frozenset):
        assert len(args) == 1
        (valuetype,) = args
        return all(check_type(v, valuetype) for v in tgt)
    return False

def check_subclass(child_type: Any, parent_type: Any) -> bool:
    if parent_type is Any or parent_type is child_type:
        return True
    parent_origin = get_origin(parent_type)
    child_origin = get_origin(child_type)
    if parent_origin is None:
        return issubclass(child_origin or child_type, parent_type)
    parent_args = get_args(parent_type)
    child_args = get_args(child_type)
    if parent_origin is Union:
        if child_origin is Union:
            return all(
                any(check_subclass(ch, pa) for pa in parent_args) for ch in child_args
            )
        return any(check_subclass(child_type, parent_arg) for parent_arg in parent_args)
    if parent_origin is Literal:
        if child_origin is Literal:
            return set(child_args).issubset(parent_args)
        return False
    if child_origin is Literal:
        return all(isinstance(ch, parent_origin) for ch in child_args)
    if child_origin is None:
        return False
    if not issubclass(child_origin, parent_origin):
        return False
    try:
        for childarg, parentarg in zip(child_args, parent_args, strict=True):
            if not check_subclass(childarg, parentarg):
                return False
    except ValueError:
        return False
    return True
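The custom_typechecks registry is the extension point in check_type: an origin class maps to a callable that receives the value and the type arguments. The following is a deliberately reduced stand-in (check_type_mini and check_deque are names invented here) that shows how a container not covered by the built-in branches, such as deque, could be plugged in.

```python
from collections import deque
from collections.abc import Callable
from typing import Any, get_args, get_origin

custom_typechecks: dict[Any, Callable[[Any, tuple[Any, ...]], bool]] = {}


def check_type_mini(tgt: Any, cls: Any) -> bool:
    """Reduced stand-in: plain classes, list[...] and registered origins only."""
    if cls is Any:
        return True
    origin = get_origin(cls)
    if origin is None:
        return isinstance(tgt, cls)
    if not isinstance(tgt, origin):
        return False
    args = get_args(cls)
    # Registered checks take precedence over the built-in branches.
    if (custom := custom_typechecks.get(origin)) is not None:
        return custom(tgt, args)
    if origin is list:
        (valuetype,) = args
        return all(check_type_mini(v, valuetype) for v in tgt)
    return False


def check_deque(tgt: Any, args: tuple[Any, ...]) -> bool:
    """Invented custom check: every element must match the single type argument."""
    (valuetype,) = args
    return all(check_type_mini(v, valuetype) for v in tgt)


custom_typechecks[deque] = check_deque
```

Without the registration, deque[int] would fall through to the final return False even for a valid value.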
FelixTheC commented 2 years ago

Good question. I started it a while ago; I should start a deeper refactoring in the near future.

I had trouble with mypy when using descriptors instead of a type; that was the only reason.

I think you want to keep using your own version, so shall I close this issue?

hjalmarlucius commented 2 years ago

Yeah, I'm a bit stuck on migrating as I've ventured too far down my own path, but ideally I would offload all of this eventually. I agree that there's definitely a trade-off between annotations and descriptors that should be cleaned up in Python core, i.e. the ability to create your own dataclass without tons of hacks.