microsoft / pylance-release

Documentation and issues for Pylance

Add support for PEP 232 – Function Attributes #3777

Closed · sksg closed 1 year ago

sksg commented 1 year ago

Add support for PEP 232 – Function Attributes

PEP 232 allows for code like:

from __future__ import annotations  # Necessary to delay type evaluation

def test_func(arg: test_func.arg_class):
    # use arg.sub_arg1 and arg.sub_arg2
    # use test_func.static_variable
    pass

class test_func_arg_class:
    sub_arg1: int
    sub_arg2: bool

# PEP 232: allow function attributes
# This allows for:
test_func.arg_class = test_func_arg_class
test_func.static_variable = 'abc'

Type hints should behave the same as in the alternative code that does not use function attributes:

class test_func_arg_class:
    sub_arg1: int
    sub_arg2: bool

test_func_static_variable = 'abc'

def test_func(arg: test_func_arg_class):
    # use arg.sub_arg1 and arg.sub_arg2
    # use test_func_static_variable
    pass

However, currently the type hints for arg and test_func.static_variable inside test_func are Any when function attributes are used.

I would have expected the type hints between the two code samples to be the same.

erictraut commented 1 year ago

test_func.arg_class isn't a valid type annotation because it references a dynamic attribute on a function. It is effectively treated as Any. I can't tell what you are trying to do in this code, so I'm not able to offer any good suggestions for how to restructure your code. If you can provide more details, I may be able to provide more help.

If you enable the reportFunctionMemberAccess diagnostic rule, you will receive errors when you attempt to write to function attributes. To enable this rule within a single file, you can add a comment at the top of the file: # pyright: reportFunctionMemberAccess=true.

Pyright (the type checker upon which pylance is built) already supports PEP 232 via protocols. If you want to create a function that supports typed attributes, you can create a protocol class that describes the function's __call__ method as well as any other typed attributes.

Here's an example of how this might work using a function decorator and a protocol class.

from typing import Callable, ParamSpec, Protocol, TypeVar, cast

P = ParamSpec("P")
R = TypeVar("R", covariant=True)

class FuncWithAttrs(Protocol[P, R]):
    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R:
        ...
    arg1: int
    arg2: bool

def make_func_with_attrs(fn: Callable[P, R]) -> FuncWithAttrs[P, R]:
    return cast(FuncWithAttrs[P, R], fn)

@make_func_with_attrs
def test_func(a: int, b: str):
    ...

test_func.arg1 = 3
test_func.arg2 = True
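
For reference, a quick check (not part of the original example) of what the decorated function now looks like to the type checker:

reveal_type(test_func.arg1)  # int
reveal_type(test_func.arg2)  # bool
test_func(3, "hi")           # the call signature is preserved via the ParamSpec
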
sksg commented 1 year ago

Thanks @erictraut.

I think I see the problem now; pylance (pyright) does not use dynamic attribute type hints. I did not know that function attributes are classified as dynamic attributes, but that makes perfect sense. In that case, this feature request should be renamed to "support dynamic attributes", which I am guessing will not be accepted.

Thanks also for pointing me to Protocols. If I want multiple functions to have the same attributes, I can see this as the perfect solution. However, if I want multiple functions to have different attributes with different types, this solution requires a separate protocol class (and its own make_func_with_attrs) for each individual function.

Let's say we have these three cases using dynamic attributes:

from __future__ import annotations  # Necessary to delay type evaluation

class test_func1_arg_class:
    sub_arg1: int
    sub_arg2: bool

def test_func1(arg: test_func1.arg_class):
    # use arg.sub_arg1 and arg.sub_arg2
    # use test_func1.static_variable
    pass

class test_func2_arg_class:
    sub_arg3: str
    sub_arg4: float

def test_func2(arg: test_func2.arg_class):
    # use arg.sub_arg3 and arg.sub_arg4
    # use test_func2.static_variable
    pass

class test_func3_arg_class:
    sub_arg5: dict
    sub_arg6: list

def test_func3(arg: test_func3.arg_class):
    # use arg.sub_arg5 and arg.sub_arg6
    # use test_func3.static_variable
    pass

test_func1.arg_class = test_func1_arg_class
test_func1.static_variable = 'abc'

test_func2.arg_class = test_func2_arg_class
test_func2.static_variable = {}

test_func3.arg_class = test_func3_arg_class
test_func3.static_variable = 123

The problem is then how to make a non-dynamic attribute version of these.

I suspect that a solution must exist, similar to this:

from __future__ import annotations  # Necessary to delay type evaluation
from unknown import generic_make_func_with_attrs

class test_func1_arg_class:
    sub_arg1: int
    sub_arg2: bool

@generic_make_func_with_attrs(
    arg_class=test_func1_arg_class,
    static_variable='abc'
)
def test_func1(arg: test_func1.arg_class):
    # use arg.sub_arg1 and arg.sub_arg2
    # use test_func1.static_variable
    pass

class test_func2_arg_class:
    sub_arg3: str
    sub_arg4: float

@generic_make_func_with_attrs(
    arg_class=test_func2_arg_class,
    static_variable={}
)
def test_func2(arg: test_func2.arg_class):
    # use arg.sub_arg3 and arg.sub_arg4
    # use test_func2.static_variable
    pass

class test_func3_arg_class:
    sub_arg5: dict
    sub_arg6: list

@generic_make_func_with_attrs(
    arg_class=test_func3_arg_class,
    static_variable=123
)
def test_func3(arg: test_func3.arg_class):
    # use arg.sub_arg5 and arg.sub_arg6
    # use test_func3.static_variable
    pass

but the solution eludes me at the moment. The success criterion is that I get type hints for all of the function attributes.
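
One partial step in this direction is a single, reusable decorator factory that casts to a caller-supplied protocol. This is only a sketch (with_protocol and TestFunc1 are hypothetical names; test_func1_arg_class is reused from above); it still needs one protocol class per function, and the parameter annotation must name the class directly rather than test_func1.arg_class:

from typing import Any, Callable, Protocol, TypeVar, cast

T = TypeVar("T")

def with_protocol(proto: type[T]) -> Callable[[Callable[..., Any]], T]:
    """Cast the decorated function to the given protocol type."""
    def decorate(fn: Callable[..., Any]) -> T:
        return cast(proto, fn)  # no runtime effect; informs the type checker only
    return decorate

class TestFunc1(Protocol):
    arg_class: type[test_func1_arg_class]
    static_variable: str
    def __call__(self, arg: test_func1_arg_class) -> None: ...

@with_protocol(TestFunc1)
def test_func1(arg: test_func1_arg_class) -> None:
    # use arg.sub_arg1 and arg.sub_arg2
    pass

test_func1.arg_class = test_func1_arg_class  # typed: type[test_func1_arg_class]
test_func1.static_variable = 'abc'           # typed: str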

Any help/ideas are greatly appreciated.

sksg commented 1 year ago

Technically, another solution would be to wrap it all in a class and make it into a protocol. However, pyright does not support reassigning the type using decorators, so the following does not solve it either:

from __future__ import annotations  # Necessary to delay type evaluation

def make_protocol(cls):
    return cls()

@make_protocol
class test_func:
    class arg:
        sub_arg0: int
        sub_arg1: int

    def __call__(self, arg: test_func.arg):
        pass

erictraut commented 1 year ago

Type annotations in Python are not allowed to use dynamic expressions — expressions whose meaning cannot be evaluated at static analysis time. This makes sense because they are designed for static analysis. There is no such thing as "dynamic attribute type hints". I'm not even sure what that means in the context of static typing.

By default, pyright allows you to assign values to any function attribute, and no error will be generated. However, these attributes are not type checked. They are all treated as Any, which means there will be no completion suggestions for them. Pyright also offers you the ability to treat all such accesses as an error if you enable reportFunctionMemberAccess. This is important if you want your code to be type safe.

I still don't have enough information about why you are trying to write to attributes of functions. My general advice is to not do this. While Python permits this, it's not something I would ever do in my code. It's akin to writing values to attributes on an object when the object's associated class doesn't define that attribute. This is possible to do in a dynamic language like Python, but it's not a recommended practice.

Consider the following:

# In the first example, we allocate a list object and write to an attribute
# that is not defined by the `list` class. This will generate an error if you
# enable type checking. You will not have completion suggestions for
# this attribute.

my_list = list()
my_list.random_attribute = "hi!"

# If you want to add a new attribute to the `list` class, the proper
# way to do this is to subclass `list` and declare a `random_attribute`
# member in the subclass definition.
class MyList(list):
    random_attribute: str

# The next example is analogous to the previous one except that
# `my_func` is a function object rather than a `list` object. If you
# want to describe this functionality to a static type analyzer, you
# need to define a protocol class that describes the new members
# that you want to add to the function.

def my_func(): pass
my_func.random_attribute = "hi"

If you are intent on writing values to attributes of function objects and you want the accesses to be understood by a static type analyzer, then protocols are the way to go.

sksg commented 1 year ago

Thanks for your comments. They really help in setting the "scope" of a type checker.

To comment on the first paragraph: function attributes cannot be added statically, because a function is defined only once, at its def statement. So I would have liked a workaround to add "static" attributes.

Generally speaking, I am trying to make a function object into a namespace in a type-safe manner. More specifically, I have a number of functions, each with a set of option-like arguments. Rather than using generic **kwargs, I am using a pydantic model, like so:

from pydantic import BaseModel

# This pydantic model has built-in type validation
class function_options(BaseModel):
    number_of_steps: int
    tolerance: float
    method: str = "default"

# I may even use a global variable to pull out some general configuration:
function_allowed_methods = ["default", "fast", "safe"]

def function(a: list, options: function_options):
    # act on a using options
    ...

# This is the dynamic part which makes the function into a namespace
function.options = function_options
function.allowed_methods = function_allowed_methods 

Now, since this particular options class is very specific to this function, I would like to attach it to the function. Effectively, I want the function to serve (also) as a namespace, turning names like function_options into function.options. This can be done dynamically with function attributes but then I lose type hints.

I am beginning to realize that what I want may not be possible in a static fashion and so cannot be type checked.

erictraut commented 1 year ago

You could write a class that effectively "wraps" the function. It would accept a reference to the function in its constructor and act as a proxy for the function, calling through to it when invoked. This class could declare any additional typed attributes that you want. That design pattern would be much more typical (and better supported in the Python type system) than attempting to use a function object as a namespace.
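
A minimal sketch of that pattern, reusing the function_options model from your previous comment (function_with_options is an illustrative name, and this simple version erases the wrapped function's call signature; a ParamSpec fixes that, as shown later in this thread):

from typing import Any, Callable

class function_with_options:
    """Proxy for a function, with extra typed attributes."""
    options: type[function_options]  # declared here, so fully typed
    allowed_methods: list[str]

    def __init__(self, fn: Callable[..., Any]) -> None:
        self.fn = fn
        self.options = function_options
        self.allowed_methods = ["default", "fast", "safe"]

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.fn(*args, **kwargs)

@function_with_options
def function(a: list, options: function_options) -> None:
    ...

function.options          # typed: type[function_options]
function.allowed_methods  # typed: list[str]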

You might also be interested in PEP 692, which is in draft form and already implemented in pyright.
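
For illustration, PEP 692 lets you type **kwargs with a TypedDict (Options is a hypothetical example here; on Python versions before 3.11, Unpack comes from typing_extensions):

from typing import TypedDict
from typing_extensions import Unpack

class Options(TypedDict):
    number_of_steps: int
    tolerance: float

def function(a: list, **options: Unpack[Options]) -> None:
    ...

function([], number_of_steps=10, tolerance=0.1)  # keys and value types are checked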

sksg commented 1 year ago

Yes, I agree. I have tried a few examples, without success on the type hints (I am still fresh to the typing game). But I will try again when I have time.

Thanks for the PEP reference. While it does not address the "function as a namespace", it is clearly a relevant idiom for function **kwargs. I cannot see if, for example, dataclasses or pydantic dataclasses are supported in the PEP, or if it is just TypedDicts. But it is clearly a great step in the right direction.

sksg commented 1 year ago

So, I tried as suggested. When used with delayed type evaluation, however, there is a discrepancy between the type hints of the arguments in the signature and their type hints in normal use. See this MWE:

from __future__ import annotations  # Necessary to delay type evaluation

class namespace_base:
    """Base class for any function namespace"""
    def __init__(self, function):
        self.function = function

    # I have not found a good way to pass on the type hints of the wrapped
    # function. So, for now it is untyped masking the underlying function. See
    # effect at the end.
    def __call__(self, *args, **kwds):
        return self.function(*args, **kwds)

class function_namespace(namespace_base):
    """Specific function namespace"""
    class options:
        sub_arg0: list
        sub_arg1: str

@function_namespace  # Effectively wraps the function in a namespace
def function_0(a: int, b: bool, opts: function_0.options):
    # Notice that the opts and function_0.options in the signature have correct
    # type hints. However, when used in the function body:
    opts # Wrong type hint: Any
    # Conversely, a variable defined in the body has the correct type hint.
    opts_alt = function_0.options()  # Correct type hint: options
    opts_alt.sub_arg0  # Correct type hint: list
    opts_alt.sub_arg1  # Correct type hint: str
    ...

@function_namespace
def function_1(a: int, b: bool, opts: function_namespace.options):
    # If using function_namespace.options instead the type hints are correctly
    # passed.
    opts # Correct type hint: options
    opts.sub_arg0  # Correct type hint: list
    opts.sub_arg1  # Correct type hint: str
    ...

# Variables defined outside also have correct type hints.
opts = function_0.options()  # Correct type hint: options
opts.sub_arg0  # Correct type hint: list
opts.sub_arg1  # Correct type hint: str

# Finally, the type hint here is destroyed by the __call__ definition in the
# namespace_base
function_0()  # Type hint: (*args, **kwds) -> Any

Is this expected behaviour?

erictraut commented 1 year ago

Using function_0.options as a type annotation won't work in this case because you have created a circular dependency that has no resolution. The signature of function_0 contains a parameter opts whose type is declared as function_0.options. To evaluate this type expression, function_0 must be evaluated. That symbol's type depends on the return type of the decorator function_namespace, which takes as input the undecorated function_0, whose signature must be known to evaluate the constructor call in the decorator. But its signature refers to function_0.options, and we're in a cycle. The only way to resolve it is to assume Any, which is what pyright is doing here.

As you've found, you can avoid this by switching to the annotation function_namespace.options, which doesn't involve any circular dependency.

I must admit that I'm still very confused about your overall goal here. You said that you want to associate the options class with the function. Is that right? What good does it do to hang the class off the function object? Clearly, you won't be able to associate instances of the options class with the function, because there is only one function object whereas there are presumably many instances of the options class, right? Or is there only one set of options for each function? If so, then why would you pass that set of options as an input parameter to the function? If there's only one instance of the options class for the function, then it doesn't need to be a parameter.

You mentioned that you haven't found a good way to preserve the function signature of the wrapped function. That's normally done with a ParamSpec.

from typing import Callable, Generic, ParamSpec, TypeVar

_P = ParamSpec("_P")
_R = TypeVar("_R")

class namespace_base(Generic[_P, _R]):
    """Base class for any function namespace"""
    def __init__(self, function: Callable[_P, _R]):
        self.function = function

    def __call__(self, *args: _P.args, **kwds: _P.kwargs) -> _R:
        return self.function(*args, **kwds)
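
Applied as a decorator, the wrapper then keeps the parameter and return types (a quick sketch; my_func is just an illustration):

@namespace_base
def my_func(a: int, b: bool) -> str:
    return "x" * a if b else "-"

reveal_type(my_func)  # namespace_base[(a: int, b: bool), str]
my_func(1, True)      # arguments are checked against the original signature
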
sksg commented 1 year ago

I see. I was hoping that from __future__ import annotations would avoid the circular dependency. This may well be the show-stopper for this type of behavior.

In trying a few options, I stumbled onto this behavior:

from typing import Generic, ParamSpec, TypeVar, cast
from collections.abc import Callable

_P = ParamSpec("_P")
_R = TypeVar("_R", covariant=True)

def _wrapper_fn(function: Callable[_P, _R]) -> Callable[_P, _R]:
    """Wrapper function for any function"""
    def wrapper(*args: _P.args, **kwds: _P.kwargs) -> _R:
        """The function wrapper docstring"""
        return function(*args, **kwds)
    return cast(_wrapper_cls[_P, _R], wrapper)  # the declared return type (Callable[_P, _R]) is what the checker reports

class _wrapper_cls(Generic[_P, _R]):
    """Wrapper class for any function"""

    def __init__(self, function: Callable[_P, _R]):
        self.function = function

    def __call__(self, *args: _P.args, **kwds: _P.kwargs) -> _R:
        """The wrapper __call__ docstring"""
        return self.function(*args, **kwds)

    @staticmethod
    def attach(function: Callable[_P, _R]):
        """This static method is just for type casting to a function"""
        wrapper = _wrapper_cls(function)
        return cast(Callable[_P, _R], wrapper)

    attribute = 100  # This is meant as a class-level constant

@_wrapper_fn
def function_a(a: int, b: bool) -> str:
    """The function a docstring"""
    return "x" * a if b else "-"

@_wrapper_cls
def function_b(a: int, b: bool) -> str:
    """The function b docstring"""
    return "x" * a if b else "-"

@_wrapper_cls.attach
def function_c(a: int, b: bool) -> str:
    """The function c docstring"""
    return "x" * a if b else "-"

function_a
# Type hint:
# (function) function_a(a: int, b: bool) -> str
# The function a docstring
reveal_type(function_a)
# Type of "function_a" is "(a: int, b: bool) -> str"

function_b
# Type hint:
# (function) function_b: _wrapper_cls[(a: int, b: bool), str]
reveal_type(function_b)
# Type of "function_b" is "_wrapper_cls[(a: int, b: bool), str]"

function_b.attribute
# Type hint:
# (variable) attribute: int
reveal_type(function_b.attribute)
# Type of "function_b.attribute" is "int"

function_c
# Type hint:
# (function) function_c(a: int, b: bool) -> str
# The function c docstring
reveal_type(function_c)
# Type of "function_c" is "(a: int, b: bool) -> str"

function_c.attribute
# Type hint:
# attribute: Any
reveal_type(function_c.attribute)
# Type of "function_c.attribute" is "Any"

Most of this is expected behavior, since we are turning the function object into a class instance. However, I cannot get Pylance to show the function docstring in the type hint for function_b. This seems to mean that Pylance/pyright does not support class decorators on function objects while preserving function-like type hinting. Am I interpreting this correctly?

In addition, I would have liked a different type hint, à la (function) function_b: _wrapper_cls[(a: int, b: bool), str] --> (function) function_b(a: int, b: bool) -> str, though that is of less importance.
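
One runtime-level mitigation, an assumption on my part rather than something verified against Pylance's hover behavior, is to copy the wrapped function's metadata onto the proxy instance with functools.update_wrapper:

import functools
from typing import Any, Callable

class _wrapper_cls_with_doc:
    """Variant of _wrapper_cls that copies the wrapped function's metadata."""
    def __init__(self, function: Callable[..., Any]):
        self.function = function
        # Copies __doc__, __name__, __module__, etc. onto this instance.
        functools.update_wrapper(self, function)

    def __call__(self, *args: Any, **kwds: Any) -> Any:
        return self.function(*args, **kwds)

This makes help(function_b) show the original docstring at runtime; whether Pylance surfaces it in hover tooltips is a separate question.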