mypyc / mypyc

Compile type annotated Python to fast C extensions
1.75k stars 46 forks source link

[GSoC 2021] Support singledispatch efficiently #802

Closed JukkaL closed 3 years ago

JukkaL commented 3 years ago

This issue describes a Google Summer of Code 2021 project. We are doing this project!

Background

Mypyc is a compiler from type-annotated Python to CPython C extension modules. It uses type information and restrictions to certain dynamic language features to speed up code, often significantly. We use mypyc to compile mypy, resulting in up to 4x faster runtimes. We are in the process of making mypyc useful for a wide variety of projects, not just mypy. This project will help us get closer to this milestone, and it's also a good way to get familiar with mypy and mypyc development, since you'll have to update multiple phases of compilation.

Project description

Currently mypy and mypyc don't properly support @singledispatch (doc link). Some codebases use this decorator a lot. Since mypyc currently doesn't recognize this, the resulting code is quite slow. Mypy also generates false positives when using this decorator.

We've been working on fixin bottlenecks where mypyc can't speed up relatively straightforward code, and this is a good example of a bottleneck that is currently hard to deal with.

The goal of this project is to compile code that uses @singledispatch into efficient C. All overloaded implementations of a function should be compiled into a single function that performs dispatch using the equivalent of isinstance() operations. We wouldn't actually use the runtime implementation of @singledispatch in the generated code at all but special case the whole decorator implementation in the IR build phase of mypyc.

A secondary goal would be to modify mypy to type check code using @singledispatch properly. At the very least it shouldn't complain about valid uses of the decorator. Mypy should also construct a precise overloaded signature from all the implementation variants.

Helpful links

JukkaL commented 3 years ago

cc @msullivan

JukkaL commented 3 years ago

Ideas about starter issues are collected here: python/mypy#10195

JukkaL commented 3 years ago

@msullivan will be mentoring this project.

JukkaL commented 3 years ago

In case you are not sure what @singledispatch is used for, here's a real-world example of @singledispatch in the mypy codebase: https://github.com/python/mypy/blob/master/mypy/stubtest.py#L183

pranavrajpal commented 3 years ago

I'm interested in working on mypy for Google Summer of Code this year.

When you say that we're doing this project, does that mean that the other idea mentioned here (speeding up string operations in mypyc) isn't an option, or are both of these ideas still options for this year?

ChetanKhanna commented 3 years ago

When you say that we're doing this project, does that mean that the other idea mentioned here (speeding up string operations in mypyc) isn't an option, or are both of these ideas still options for this year?

Hello @pranavrajpal , both the ideas are open for applications for this year.

TH3CHARLie commented 3 years ago

In case people are missing Gitter discussions, it's generally a good idea to share your gsoc proposals with mentors before finally submitting them. You can share here or you can email us if you don't want other participants to copy your brilliant ideas.

ChetanKhanna commented 3 years ago

Hello! Sorry I got a bit late had to travel urgently and I mistook the deadline to be 29th. To get things started, I had a couple of questions regarding the first part of the project:

  1. We aim at inlining the singledispatch decorator by generating IR for it, correct?
  2. Do we also need to look in the direction of improving the IR generation for decorators in general, given in irbuild.function.py? Or we can possibly use it while handling singledispatch?

Sorry if my questions aren't very to the point, I am just trying to collect pointers and ensure I'm thinking in the correct direction.

JukkaL commented 3 years ago

We aim at inlining the singledispatch decorator by generating IR for it, correct?

Yes. It's also fine to define additional C helpers that are called in the generated IR, if that makes the implementation easier or cleaner. For example, we could generate a separate top-level function that inspects the argument type and dispatches to the correct variant. Each variant would be implemented as a separate compiled function. These variants would be internal, and impossible to call directly. I'll prepare a example.

Do we also need to look in the direction of improving the IR generation for decorators in general

It's sufficient that we can special case only singledispatch. If the implementation can be generalized to other use cases, that's a plus, but not essential.

JukkaL commented 3 years ago

These variants would be internal, and impossible to call directly. I'll prepare a example.

Actually, if each variant has a unique name, it should be possible to call them directly. However, if they use a dummy name such as _ they should be internal.

Here's a simple example using singledispatch:

from functools import singledispatch

class Expr: pass

class Int(Expr):
    def __init__(self, val: int) -> None:
        self.val = val

class Add(Expr):
    def __init__(self, x: Expr, y: Expr) -> None:
        self.x = x
        self.y = y

@singledispatch
def eval(arg: Expr) -> int:
    raise NotImplementedError

@eval.register
def eval_int(arg: Int) -> int:
    return arg.val

@eval.register
def eval_add(arg: Add) -> int:
    return eval(arg.x) + eval(arg.y)

print(eval(Int(5)))  # 5
print(eval(Add(Int(3), Int(4))))  # 7
print(eval_int(Int(33)))  # 33

Now the implementation could compile the eval function and the variants to IR that roughly corresponds to this code:

def eval(arg: Expr) -> int:  # <-- this function is generated in mypyc (and doesn't need the decorator)
    if isinstance(arg, Int):
        return eval_int(arg)
    elif isinstance(arg, Add):
        return eval_add(arg)
    else:
        return _eval_impl(arg)

def _eval_impl(arg: Expr) -> int:
    raise NotImplementedError

def eval_int(arg: Int) -> int:
    return arg.val

def eval_add(arg: Add) -> int:
    return eval(arg.x) + eval(arg.y)

The plan could also mention some follow-up tasks:

If this doesn't feel like enough work for the summer, generalizing the approach to support contextlib.contextmanager would be a useful follow-up task. We could probably inline contextmanager similarly for better performance.

ChetanKhanna commented 3 years ago

Hello @msullivan @JukkaL. This is the idea part of my proposal for this project. I need a little more work on the timeline part and will post the complete proposal here as soon as I am done with it as well as send out an email. The clarification @JukkaL provided above were insightful, thanks a lot for them :) Please let me know how I can improve it further:

Approach/Idea

Currently mypyc splits the @singledispatch ed function and all the corresponding variants into separate functions. The idea will be to redo this compilation of the main function so that it acts as a dispatcher to all the other variants using isinstance operations.

Here's the IR that mypyc currently generates for the following snippet:

@singledispatch
def f(x: Any) -> Any:
    return x

@f.register
def g(x: float) -> float:
    return x+2.5

@f.register
def h(x: str) -> str:
    return 'New string: ' + x

The generated IR:

def __mypyc_f_decorator_helper__(x):
    x :: object
L0:
    return x
def __mypyc_g_decorator_helper__(x):
    x, r0 :: float
    r1 :: object
    r2 :: float
L0:
    r0 = 2.5
    r1 = PyNumber_Add(x, r0)
    r2 = cast(float, r1)
    return r2
def __mypyc_h_decorator_helper__(x):
    x, r0, r1 :: str
L0:
    r0 = 'New string:
    r1 = PyUnicode_Concat(r0, x)
    return r1

My goal will be to modify the main function (__mypyc_f_decorator_helper__ in this case) so that it can check the type of argument x and call other compiled variants.

For now, I tried hacking into the transform_decorator function to inline the IR (without using external C functions). My idea was to extract the decorator information from builder.fdefs_to_decorators and use it to trigger all functions that are mapped to MemberExpr nodes having expr.name same as main function (for this example f). I'll try my best to set up a POC by the time I submit my final proposal.

Although I believe that this might get messy as we move towards handling more complex cases. We could instead define a C helper function that can be called from the generated IR and could help with calling the variants. There are pointers for this implementation in the guide: https://github.com/python/mypy/blob/master/mypyc/doc/dev-intro.md#adding-c-helpers

I will want to explore both direct inlining of IR and writing a C function that can be called from the IR to help in dispatching before the actual implementation.

I can then extend the implementation. I have the following three extensions in mind currently :-

  1. Support arguments inside register decorator
  2. Support variants with same name. For this though, we will first need to enable mypy to correctly type check these. Currently it will complain if it encounters functions with same name decorated with register.
  3. Support singledispatchmethod decorator for methods.

Benchmarking

As suggested in the above comment, we will be needing a benchmark to measure the performance improvements over time. I'll go through the ones already implemented here: https://github.com/mypyc/mypyc-benchmarks for reference.

Type checking by mypy

As mentioned, mypy currently does not support dummy variants with same names. Even the most common one _ which is very frequently used with singledispatch. There is an open issue related to this: https://github.com/python/mypy/issues/2904

It is suggested to create a plugin to special-case this. There is good amount of documentation for plugin with mypy: https://mypy.readthedocs.io/en/stable/extending_mypy.html https://github.com/python/mypy/tree/master/mypy/plugins

as well as on the linked issues and PRs which I can refer to during implementation.

ChetanKhanna commented 3 years ago

Hello! Can someone elaborate on the following line from the project description?

Mypy should also construct a precise overloaded signature from all the implementation variants.

JelleZijlstra commented 3 years ago

@ChetanKhanna I think this is referring to the signature used for type checking when the callable is (for example) passed to another function as an argument.

For example, given Jukka's eval example above, mypy should correctly check whether:

def f(func: Callable[[Int], object]) -> None: ...

f(eval)

is legal. There is already code doing this for overloaded functions using @overload, and for singledispatch we should have something constructing the same kind of signature mypy already infers for overloaded functions.

ChetanKhanna commented 3 years ago

Okay, I think I get it. So something like this should fail on the eval example above, but it doesn't. While a similar construct of eval using overload instead of singledispatch will be caught by mypy:

# should give an error
def test(func: Callable[[str], int]) -> None: ...

test(eval)
ChetanKhanna commented 3 years ago

Also, I have finished the timeline section of my proposal as well. It's here: https://gist.github.com/ChetanKhanna/83e71abd6790922a9533a887d2bc999a

But now I think I need to at least redo the example and add the above discussed point to the proposal :thinking:

JelleZijlstra commented 3 years ago

Thinking about it more, actually perhaps the more important thing to get right is that the precise types of different implementations are preserved. So in your example with f, mypy should infer that f(float()) is of type float, f(str() is str, and f(object()) is Any (from the fallback implementation).

JukkaL commented 3 years ago

We can use mypy.types.Overloaded as the type of a singledispatch function to precisely track the different variants. The fallback implementation would be given as the last item in the overload, since the first matching overload is generally picked by mypy. The fallback implementation should have the most general type (at least normally).

ChetanKhanna commented 3 years ago

We can use mypy.types.Overloaded as the type of a singledispatch function to precisely track the different variants. The fallback implementation would be given as the last item in the overload, since the first matching overload is generally picked by mypy. The fallback implementation should have the most general type (at least normally).

I was pondering over this. From what I understand, the binding of function signatures and type is done during semantic analysis in semanal.py. Does it make sense to handle the singledispatch case there as well? Otherwise where else could we possibly implement it?

JukkaL commented 3 years ago

Otherwise where else could we possibly implement it?

Yeah, semanal.py is the the right place.

msullivan commented 3 years ago

Mypy should also construct a precise overloaded signature from all the implementation variants.

We probably can't do this in incremental mode.

JukkaL commented 3 years ago

We probably can't do this in incremental mode.

Hmm can you elaborate a little? It would be a shame if this only worked in non-incremental mode, since I'd like us to switch to using incremental mode by default in mypyc.

msullivan commented 3 years ago

The problem with doing it in incremental mode is that adding a new call to register could change the type signature of a function declared in a module that is earlier in the dependency order, which could be used in other places earlier in the dependency order.

I was thinking that for mypyc, we would require that all registered implementations be in the same compilation group. So it would work in incremental mode, but would need to all be in one SCC, which is a limitation we can't reasonably impose when just type checking.

We could potentially also do a fallback scheme, where implementations in the same group get compiled into an accelerated dispatch, but we fall back to the normal singledispatch mechanisms for things outside the group?

JukkaL commented 3 years ago

So it would work in incremental mode, but would need to all be in one SCC, which is a limitation we can't reasonably impose when just type checking.

I think that we also can't reasonable add a feature that doesn't work in incremental mode. Even in non-incremental mode, I don't see an easy way to support this in full generality. If a imports b and adds a new variant to a singledispatch function defined in b, another module c that imports b does not necessarily see this new variant, since it could be fully processed before a.

What if we allow registering additional variants outside the original SCC (when type checking), but these don't get reflected in the type signature? I.e., they would fall back to the most general signature from an external point of view. We'd only check that the signature is compatible with the "top-level" signature.

Another, more complicated, option would be to keep track of extensions to singledispatch functions on a per-module basis. If module a registers a new variant to b.f, then the variant would only be visible in a and any module that depends on a. This would still be problematic. Two modules could add variants to b.f, and depending on processing order, the order of the variants could differ, and thus we could see different results based on which order modules are processed, which is spooky action at a distance. The order of variants in an overloaded callable type matters. We could plausible work around this by requiring that any variants registered outside the original definition must be non-overlapping.

We could potentially also do a fallback scheme, where implementations in the same group get compiled into an accelerated dispatch, but we fall back to the normal singledispatch mechanisms for things outside the group?

This sounds like a reasonable compromise to me.

I wonder how often do additional implementations get registered outside the original module. Maybe it would be useful to do some analysis of these use cases. If it's rare enough, we could decide to not support this use case.

sobolevn commented 3 years ago

My take on typing singledispatch: https://github.com/dry-python/classes It might be helpful (at least you can learn from my mistakes!) 😆

Useful links:

msullivan commented 3 years ago

I wonder how often do additional implementations get registered outside the original module. Maybe it would be useful to do some analysis of these use cases. If it's rare enough, we could decide to not support this use case.

At EdgeDB, we use singledispatch 19 times. Of that:

pranavrajpal commented 3 years ago

We could potentially also do a fallback scheme, where implementations in the same group get compiled into an accelerated dispatch, but we fall back to the normal singledispatch mechanisms for things outside the group?

We probably can't reasonably support having some implementations registered dynamically while having other implementations called directly using a compiled dispatch function.

The problem is that the order that we should put the isinstance checks against each of the dispatch types is dependent on what those dispatch types are. So if you have a compiled file like this (called compiled.py):

from functools import singledispatch

class A: pass
class B(A): pass
class C(B): pass

@singledispatch
def f(arg) -> str:
    return 'default'

@f.register
def _(arg: B) -> str:
    return 'b'

And an interpreted file like one of these:

Case 1

from compiled import f, C

@f.register
def _(arg: C) -> str:
    return 'c'

assert f(C()) == 'c'

Case 2

from compiled import f, A, B

@f.register
def _(arg: A) -> str:
    return 'a'

assert f(B()) == 'b'

If we check all of the dynamically registered implementations first, see if any match, and then call the fallback implementation if none of the others work (what we do currently), Case 1 works but Case 2 breaks. If we check the compiled implementations first and then check any dynamically registered implementations, Case 2 works but Case 1 breaks.

The only way that I can think of to get both to work is to, at runtime, look at all of the types in f.registry.keys() and check where each of them are on the argument's type's MRO and/or check whether any of them are subclasses or superclasses of any of the known dispatch types. We might be able to get something like that to work, but we'd probably end up with something very similar to the standard library implementation of singledispatch, and all the added checks would probably mean we'd lose most of the possible performance gain.

The best way to support this would probably be to not support registering functions dynamically when compiling singledispatch functions, and to add a mypyc_attr option to not compile singledispatch functions for people to use when they want other code to be able to register functions dynamically. We could probably do that by:

msullivan commented 3 years ago

I realized another major problem with the isinstance chain approach, which is that it fundamentally does not work in the presence of multiple inheritance. Consider the following:

from mypy_extensions import trait
from functools import singledispatch

@trait
class A: pass
@trait
class B: pass
class AB(A, B): pass
class BA(B, A): pass

@singledispatch
def f(arg) -> None:
    pass

@f.register
def fa(arg: A) -> None:
    print("a")

@f.register
def fb(arg: B) -> None:
    print("b")

f(AB())  # a
f(BA())  # b

The issue is that both AB and BA are subtypes of both A and B, but which method to call needs to be decided based on MRO order. It can't be figured out by an isinstance chain.

This is frustrating, because this case seems super marginal! I could be convinced to say we'll try to prohibit it, if I could think of a reasonable enough way to do it.

I have another idea for an implementation strategy, though:

The downside here is that we need to do a dict lookup even in the happy case, but that doesn't seem too bad.

JukkaL commented 3 years ago

@msullivan I like this implementation strategy. If we feel that the dictionary lookup is too slow, we could later switch to a specialized dictionary implemented in C (from type object to short int/callable).

JukkaL commented 3 years ago

This project was a success! Thank you @pranavrajpal. (It ended around end of August actually.)