python-attrs / attrs

Python Classes Without Boilerplate
https://www.attrs.org/
MIT License

Support for automatic runtime type-checking #301

Open anthrotype opened 6 years ago

anthrotype commented 6 years ago

This has been proposed and discussed in https://github.com/python-attrs/attrs/issues/215, as a possible use case for the newly added type argument to attr.ib() (#239).

Quoting @hynek (https://github.com/python-attrs/attrs/issues/215#issuecomment-347529479):

a good first step would be to add generic class-wide validators (@attr.s(validator=whatever)) and then make type checking a special case of it, possibly with some syntactic sugar.

aldanor commented 6 years ago

Ok, as a continuation of previous discussion in #215... :)

@glyph I was trying to find https://github.com/aldanor/typo today but instead I ran across https://github.com/RussBaz/enforce which looks like it may be a fairly full-featured version of the same thing.

@chadrik There is also https://github.com/Stewori/pytypes May the best project win.

Just to clarify, the sole reason I started one of my own was that all existing solutions (including enforce and pytypes) were (a) slow and (b) wrong -- although both are very good attempts and good inspiration. That being said, my version is not fully type-correct either when it comes to sum types (see examples below), but 'less wrong' if I may; on the bright side, it's fast. I haven't spent any time on finishing it due to lack of motivation and time lately, but it could be done, maybe with some help. If anyone knows of any other comparable or relevant projects - shout away, I'd personally be very interested.

TL;DR: it's hard to write a runtime type checker that's both fast and correct, especially if it aims to handle both typevars and sum types; although not impossible (I'm not sure one already exists at this moment however). Details below.


# pip install git+https://github.com/RussBaz/enforce.git
import enforce
# pip install git+https://github.com/aldanor/typo.git  (3.5 only; needs a few fixes)
import typo
# pip install git+https://github.com/Stewori/pytypes.git
import pytypes  
# pip install git+https://github.com/agronholm/typeguard.git
import typeguard

Simple example:

def simple(x: int, y: str): pass

simple_pytypes = pytypes.typechecked(simple)
simple_enforce = enforce.runtime_validation(simple)
simple_typeguard = typeguard.typechecked(simple)
simple_typo = typo.type_check(simple)

args = 1, 'foo'
%timeit -r7 -n1000 simple_pytypes(*args)
%timeit -r7 -n1000 simple_enforce(*args)
%timeit -r7 -n1000 simple_typeguard(*args)
%timeit -r7 -n1000 simple_typo(*args)
314 µs ± 23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
155 µs ± 2.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
51.6 µs ± 13.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
497 ns ± 3.13 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

All four work correctly; relative to typo, typeguard is about 100x slower, enforce about 300x, pytypes about 600x.
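The %timeit lines above are IPython magics. Outside IPython, a roughly comparable measurement can be sketched with the standard-library timeit module. In the sketch below, naive_type_check is a hypothetical stand-in decorator (not one of the four libraries) that only handles plain-class annotations, and the script reports the best run rather than the mean ± std. dev. that %timeit prints.

```python
import functools
import inspect
import timeit


def naive_type_check(func):
    """Hypothetical stand-in checker: isinstance() on plain-class annotations only."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is not inspect.Parameter.empty and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}, got {value!r}")
        return func(*args, **kwargs)

    return wrapper


def simple(x: int, y: str): pass


simple_naive = naive_type_check(simple)
args = 1, 'foo'

# Mirrors `%timeit -r7 -n1000`: 7 repeats of 1000 calls each.
runs = timeit.repeat(lambda: simple_naive(*args), repeat=7, number=1000)
per_call = [t / 1000 for t in runs]
print(f"best of 7: {min(per_call) * 1e9:.0f} ns per call")
```

Replacing naive_type_check with any of the decorators used above should give numbers comparable in spirit to the ones quoted, although absolute values depend on the machine and Python version.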


Slightly more involved:

from typing import List, Union, Dict, Tuple

def nested(x: Dict[Tuple[int, bytes], List[Union[str, float]]]) -> int: return 1

nested_pytypes = pytypes.typechecked(nested)
nested_enforce = enforce.runtime_validation(nested)
nested_typeguard = typeguard.typechecked(nested)
nested_typo = typo.type_check(nested)

x = {(1, b'3'): ['a', 1., 'b', 3.], (3, b'1'): [], (4, b'5'): ['c', 3.14]}
# %timeit -r7 -n1000 nested_pytypes(x)  # FAILS
%timeit -r7 -n1000 nested_enforce(x)
%timeit -r7 -n1000 nested_typeguard(x)
%timeit -r7 -n1000 nested_typo(x)
2.08 ms ± 23.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
176 µs ± 1.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4.57 µs ± 169 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Three out of four work -- pytypes fails; typeguard is about 40x slower than typo, enforce about 450x.


Simple generic example (with a catch):

from typing import TypeVar

A, B = TypeVar('A'), TypeVar('B')

def generic1(x: List[Union[A, int]], y: A): pass

generic1_pytypes = pytypes.typechecked(generic1)
generic1_enforce = enforce.runtime_validation(generic1)
generic1_typeguard = typeguard.typechecked(generic1)
generic1_typo = typo.type_check(generic1)

args = [1], 'b'  # valid signature (A=str)
# %timeit -r7 -n1000 generic1_pytypes(*args)  # FAILS
# %timeit -r7 -n1000 generic1_enforce(*args)  # FAILS
# %timeit -r7 -n1000 generic1_typeguard(*args)  # FAILS
%timeit -r7 -n1000 generic1_typo(*args)
4.19 µs ± 206 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Three out of four fail.


And finally...

def generic2(x: Union[A, B], y: A, z: B): pass

generic2_pytypes = pytypes.typechecked(generic2)
generic2_enforce = enforce.runtime_validation(generic2)
generic2_typeguard = typeguard.typechecked(generic2)
generic2_typo = typo.type_check(generic2)

args = 'a', 3, 'b'  # valid signature (A=int, B=str)
# %timeit -r7 -n1000 generic2_pytypes(*args)  # FAILS
# %timeit -r7 -n1000 generic2_enforce(*args)  # FAILS
# %timeit -r7 -n1000 generic2_typeguard(*args)  # FAILS
# %timeit -r7 -n1000 generic2_typo(*args)  # FAILS

All four fail, amen.

agronholm commented 6 years ago

I couldn't help but notice the absence of my library (typeguard) which predates both pytypes and typo (and which pytypes has borrowed much of its code from).

agronholm commented 6 years ago

I just read through the scoping rules in PEP 484 and it certainly did not cover cases like this. How is a type checker supposed to bind the type variables when the first occurrence is in a Union?

aldanor commented 6 years ago

How is a type checker supposed to bind the type variables when the first occurrence is in a Union?

Not make final conclusions based on the first occurrence?..

agronholm commented 6 years ago

How, then, is it exactly supposed to make conclusions? This part was not explained in PEP 484.

aldanor commented 6 years ago

@agronholm I couldn't help but notice the absence of my library (typeguard) which predates both pytypes and typo (and which pytypes has borrowed much of its code from).

Apologies -- I now remember your library, it's actually the fastest of all three :)

I've added typeguard tests to the examples above.

aldanor commented 6 years ago

How, then, is it exactly supposed to make conclusions? This part was not explained in PEP 484.

My intuition with a signature like (x: Union[A, B], y: A, z: B) and the input (str, int, str) would be like this:

You could slightly optimize it by first resolving non-sum-types (although it will not magically help in all cases; it can just make most of them faster):

This is kind of what typo tries to do, but there's still quite a bit of work, and there are some limitations.

If you resolve sum-types based on first occurrence, this basically implies that Union[A, B] is not resolved the same way as Union[B, A], which doesn't make much sense.
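One way to avoid depending on the order of occurrence is to treat the bindings as a small search problem and backtrack when a tentative choice fails. The following is only a minimal sketch of that idea, not how typo (or any other library mentioned here) is implemented. The solve helper is hypothetical, it understands only bare TypeVars, Union and plain classes, and it assumes a TypeVar has to match the exact runtime type of the value.

```python
from typing import TypeVar, Union, get_args, get_origin

A, B = TypeVar('A'), TypeVar('B')


def solve(pairs, bindings=None):
    """Return one consistent {TypeVar: type} binding for (annotation, value) pairs, or None."""
    pairs = list(pairs)
    bindings = dict(bindings or {})  # copy, so a failed branch leaves the caller's dict intact
    if not pairs:
        return bindings
    (ann, value), rest = pairs[0], pairs[1:]

    if isinstance(ann, TypeVar):
        bound = bindings.get(ann)
        if bound is None:
            bindings[ann] = type(value)  # tentative binding, dropped if a later pair fails
            return solve(rest, bindings)
        # Assumption for this sketch: exact type match, no subtyping.
        return solve(rest, bindings) if type(value) is bound else None

    if get_origin(ann) is Union:
        # Try every member of the union instead of committing to the first one.
        for member in get_args(ann):
            result = solve([(member, value)] + rest, bindings)
            if result is not None:
                return result
        return None

    # Plain class annotation.
    return solve(rest, bindings) if isinstance(value, ann) else None


# generic2(x: Union[A, B], y: A, z: B) called with ('a', 3, 'b')
ab = solve([(Union[A, B], 'a'), (A, 3), (B, 'b')])
ba = solve([(Union[B, A], 'a'), (A, 3), (B, 'b')])
assert ab == ba == {A: int, B: str}  # same answer regardless of Union order
```

The worst case is exponential in the number of Union members that contain type variables, which is one reason resolving the non-sum-type arguments first can pay off.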

Stewori commented 6 years ago

So far I miss tests here that cover the case where type information is hosted in stub files, which are clearly and officially part of the PEP 484 specification. Also, no tests involving OOP constructs (methods, classmethods, staticmethods and properties) are shown, not to mention inner classes. @aldanor I recommend filing issues for the encountered failures in the respective projects. Only this way can the issues be solved.

These tests seem to focus rather heavily on performance, which is a secondary design goal at best for typechecking. Typechecking should be disabled outside of the testing and debugging phases.

Tinche commented 6 years ago

I'm pretty sure attrs can support this now, with no extra features. I'm willing to lend a hand to any author of a typechecking library to integrate into attrs.

Stewori commented 6 years ago

The typechecker should be kept exchangeable as no framework (for runtime typechecking) gets everything right yet. The fact that the typing module changes heavily from Python version to Python version makes it very challenging to keep up. E.g. Python 3.7 breaks everything again and I wasn't yet able to fix this for pytypes. Unfortunately this distracts from fixing the other issues.
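To illustrate what "exchangeable" could look like on top of what attrs already offers: attrs validators are plain callables invoked as (instance, attribute, value), and attribute.type carries the declared type. The sketch below wraps an arbitrary check function in that shape. make_type_validator and isinstance_validator are hypothetical names, and a SimpleNamespace stands in for a real attr.Attribute so the snippet runs without attrs installed.

```python
from types import SimpleNamespace
from typing import Any, Callable


def make_type_validator(check: Callable[[Any, Any], bool]):
    """Wrap any `check(value, expected_type) -> bool` as an attrs-style validator."""
    def validator(instance, attribute, value):
        # attrs calls validators as (instance, attribute, value);
        # attribute.type is None when no type was declared.
        if attribute.type is not None and not check(value, attribute.type):
            raise TypeError(
                f"{attribute.name!r} must be {attribute.type!r}, got {value!r}"
            )
    return validator


# Simplest possible backend; a more capable checker could be swapped in here.
isinstance_validator = make_type_validator(isinstance)

# Stand-in for an attr.Attribute (assumption: only .name and .type are needed).
fake_attribute = SimpleNamespace(name="x", type=int)
isinstance_validator(None, fake_attribute, 1)  # passes silently
```

With attrs installed, such a validator would be attached per attribute, e.g. attr.ib(type=int, validator=isinstance_validator); swapping the backend then only means passing a different check function.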

euresti commented 6 years ago

If I can throw another wrench into this: remember the issue with resolving types that have strings in them (#265)? Resolving them would be necessary for any kind of automatic type checking.

Stewori commented 6 years ago

pytypes can resolve these strings/forward references. The case where such strings occur deeper within a type has been supported only recently, and no release has been made since then. See https://github.com/Stewori/pytypes/issues/22. pytypes also provides a service function, pytypes.resolve_fw_decl, that resolves forward references from a string or nested somewhere inside a type. It is recursion-proof.
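For comparison, the standard library can also resolve string annotations via typing.get_type_hints, provided it is given a namespace in which the names can be looked up. The sketch below uses only that stdlib function (it does not use pytypes.resolve_fw_decl, whose signature is not shown in this thread); the Node class is a made-up example.

```python
from typing import List, Optional, get_type_hints


class Node:
    def __init__(self, value: int,
                 children: "List[Node]",
                 parent: "Optional[Node]" = None):
        self.value = value
        self.children = children
        self.parent = parent


# The strings can only be evaluated once Node exists, and only with a
# namespace that contains every name they mention.
namespace = {"Node": Node, "List": List, "Optional": Optional}
hints = get_type_hints(Node.__init__, globalns=namespace)

assert hints["value"] is int
assert hints["children"] == List[Node]   # nested forward reference resolved
assert hints["parent"] == Optional[Node]
```

Finding the right namespace is the hard part in practice: the strings have to be evaluated after the class body has finished executing, and in the scope where the referenced names are actually defined.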

hynek commented 5 years ago

There seems to be a new option for runtime type checks y'all: https://attrs-strict.readthedocs.io/en/latest/

pwwang commented 4 years ago

Or maybe let it run in setter: https://github.com/pwwang/attr_property ?

ghost commented 4 years ago

There seems to be a new option for runtime type checks y'all: https://attrs-strict.readthedocs.io/en/latest/

Would it be possible to merge this, or are there licensing (or other) concerns?

hynek commented 4 years ago

As it stands, I don't see a reason for us to merge it, especially because it would mean that we'd have to maintain it too. We're currently not looking for more maintenance burden. 🙃 We try to put our energy into making the ecosystem thrive; writing everything ourselves is unrealistic, alas.

ghost commented 4 years ago

Thank you for your response. I absolutely understand where you're coming from here.