beartype / plum

Multiple dispatch in Python
https://beartype.github.io/plum
MIT License
539 stars 24 forks source link

Nondeterminism (including sometimes crashing) when using generics #109

Open patrick-kidger opened 1 year ago

patrick-kidger commented 1 year ago
import plum

@plum.dispatch
def f(x: list[int]): print("int!")

@plum.dispatch
def f(x: list[str]): print("str!")

@plum.dispatch
def f(x: list): print("other!")

f([1, "hi"])

rerunning this script will variably print out any one of the three branches, or sometimes crash with an AmbiguousLookupError.

plum==2.2.2; beartype==0.16.2

wesselb commented 1 year ago

Hey @patrick-kidger!

Beartype's default strategy for checking whether a list is a list[int] is by randomly choosing an element and checking the type of that element. This enables Beartype to run in O(1). See e.g. this part of the Beartype docs. There is already configuration available to disable randomly choosing element, but checking everything carefully, which Plum already opts into, but this Beartype strategy is not yet implemented.

Now, if at runtime you're using lists consisting only of one element type, this works fine. However, if your lists contain multiple types of elements, like in your example, you can run into problems. The simplest way to fix this is to use tuple[int, ...] instead of list, which will check every element type.

Reflecting on this, this should definitely be better documented somewhere.

patrick-kidger commented 1 year ago

Yup, I'm actually familiar with the underlying reason! I wanted to highlight the silent downstream issue this is causing for plum.

I don't think that documentation alone is enough. Until this is resolved in beartype, I'd suggest raising a warning or an error for this path instead.

Is it possible to substitute out which runtime type checker is used in plum? I'm contemplating writing my own typechecker -- giving slow-but-correct O(n) checks on at least the types I care about -- and it'd be great to be able to use plum with it.

wesselb commented 1 year ago

Right! I figured you'd be familiar with what's going on. :)

Substituting the runtime type checker should be possible. Plum fundamentally only depends on implementations of isinstance and issubclass, which are currently provided by is_bearable and TypeHint respectively, but which could be substituted for other implementations too.

Do you have a suggestion for an interface? What about something like the following?

dispatch = Dispatcher(type_checker=(isinstance, issubclass))
patrick-kidger commented 1 year ago

I think I like that API. I think you need at least one more piece though, which is type, in the sense of cls = type(instance). (And possibly a handful of other operations I'm not thinking about?)

Hypothetically we could imagine defining a custom type that operates faithfully, so e.g. it would satisfy my_type([1, "hi"]) == list[int | str]. In the limit I'm imagining implementing a whole-new (better) type system -- in practice I imagine no-one has the energy for that, but I think it's a useful ideal to target for API design.

wesselb commented 1 year ago

Ah, that’s a super interesting proposal. Perhaps a class/dataclass TypeSystem is appropriate here which implements these functions (isinstance, issubclass, type).

If you manage a faithful implementation of type, then that would make isinstance redundant, so that would be convenient.

Alright, I’m convinced! I’m not entirely sure when this will be finished, since it depends on how much spare time I have, but it shouldn’t be too long from now. :)