python / typing

Python static typing home. Hosts the documentation and a user help forum.
https://typing.readthedocs.io/

AnyOf - Union for return types #566

Open srittau opened 6 years ago

srittau commented 6 years ago

Sometimes a function or method can return one of several types, depending on the passed in arguments or external factors. Best example is probably open(), but there are other examples in the standard library, like shutil.copy()[1]. In many of those cases, the caller knows what return type to expect.
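To make the open() case concrete, here is a minimal sketch showing that the runtime class of the returned object depends on the mode string. The helper name describe_open is hypothetical and exists only for this illustration; the class names are those reported by CPython.

```python
import os
import tempfile


def describe_open(mode: str) -> str:
    """Open a scratch file with `mode` and report the class of the result."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, mode) as handle:
            return type(handle).__name__
    finally:
        os.remove(path)


text_class = describe_open("r")     # a text stream class
binary_class = describe_open("rb")  # a binary stream class
```

A caller who writes the mode literally knows which of the two it will get, even though a single static return type cannot express that.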

Currently there are several options, but none of them is really satisfactory:

Therefore, I propose to add another type, for example AnyOf[...] that acts like Union, but can be used everywhere any of its type arguments could be used.

from datetime import date
x: AnyOf[str, date] = ...
s: str = x  # ok
dt: date = x  # ok
i: int = x  # type error
u: Union[str, bytes] = x  # ok
x = u  # type error (although the type checker could do something smart here and infer that u can only be str here)

I also think that the documentation should make it clear that using AnyOf is a code smell.

[1] Currently the type behaviour in shutil is broken in my opinion, but that does not change the fact that currently it is as it is.

srittau commented 6 years ago

Another use case - at least until we get a more flexible Callable syntax - is for optional arguments in callback functions, like WSGI's start_response():

StartResponse = AnyOf[
    Callable[[str, List[Tuple[str, str]]], None],
    Callable[[str, List[Tuple[str, str]], ExcInfo], None]
]

def create_start_response() -> StartResponse:
    ...

create_start_response()("200 OK", [])

Using Union this causes a type error. (Too few arguments.)

gvanrossum commented 6 years ago

This would be as unsafe as Any though right? E.g. def foo() -> Union[int, str] -- we have no idea whether foo() + 1 is safe or not. Sure, it can tell you that foo().append(1) is definitely wrong, but that's pretty minor.

Similarly we won't know if create_start_response()("200 OK", [], sys.exc_info()) will be accepted or not. If you meant to describe a situation where it returns a callback that can be called with or without the extra argument, there's already a solution: https://mypy.readthedocs.io/en/latest/additional_features.html?highlight=mypy_extensions#extended-callable-types.
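The extended callable types linked above live in mypy_extensions. A different technique, swapped in here because it needs only the standard typing module (Python 3.8+), is a callback protocol: a Protocol class whose __call__ declares the optional trailing argument. This is a sketch under assumptions; the names StartResponse, ExcInfo, and the recording list calls are illustrative and not taken from any WSGI library.

```python
import sys
from types import TracebackType
from typing import List, Optional, Protocol, Tuple, Type

# Assumed alias covering both an active exception and (None, None, None).
ExcInfo = Tuple[
    Optional[Type[BaseException]],
    Optional[BaseException],
    Optional[TracebackType],
]


class StartResponse(Protocol):
    """Callable taking (status, headers) plus an optional exc_info."""

    def __call__(
        self,
        status: str,
        headers: List[Tuple[str, str]],
        exc_info: Optional[ExcInfo] = None,
    ) -> None: ...


# Records each call so the example has observable behaviour.
calls: List[Tuple[str, bool]] = []


def create_start_response() -> StartResponse:
    def start_response(
        status: str,
        headers: List[Tuple[str, str]],
        exc_info: Optional[ExcInfo] = None,
    ) -> None:
        calls.append((status, exc_info is not None))

    return start_response


# Both call shapes are accepted by the one declared return type.
create_start_response()("200 OK", [])
try:
    raise ValueError("boom")
except ValueError:
    create_start_response()("500 Internal Server Error", [], sys.exc_info())
```

This only covers the case where one callable really accepts both call shapes; it does not model a function that returns one of two differently shaped callbacks.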

srittau commented 6 years ago

Good to know that there is a proper solution for the callback case!

Personally, I think the improvement in type safety over just returning Any would be worth it. It surely can't catch all problematic cases, but some is better than none. And the issue seems to crop up from time to time, for example in python/typeshed#2271, just after I opened this. It is also noteworthy, I think, that the docs explicitly mention the problem of returning Unions.

ilevkivskyi commented 6 years ago

In my experience I never had a situation where I needed unsafe unions. Anyway, I could imagine some people might want them. However, the problem is that the benefits of unsafe unions are out of proportion to the amount of work needed to implement them. Adding a new kind of type to mypy, for example, could take months of intense work. This one would be as hard as introducing intersection types, and while the latter are more powerful (they cover most use cases of unsafe unions), we still hesitate to start working on them.

JukkaL commented 6 years ago

I'm with Ivan, and this is something we've considered earlier -- see the discussion at https://github.com/python/mypy/issues/1693, for example. The relatively minor benefits don't really seem worth the extra implementation work and complexity.

This would only be a potentially good idea for legacy APIs that can't be typed precisely right now, and the most popular of those can be special cased by tools (e.g. through mypy plugins). Mypy already special cases open and a few other stdlib functions. Alternatively, we might be able to use some other type system improvements, such as literal types, once they become available. For new code and new APIs the recommended way is to avoid signatures that would require the use of AnyOf anyway.

Ad-hoc extensions have the benefit of being easy to implement. They are also modular, don't complicate the rest of the type system, and they potentially allow inferring precise return types.

There is also often a simple workaround -- wrap the library function that returns Any in a function with a precise return type, restricting the arguments so that the return type is known statically. Example:

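from typing import BinaryIO, TextIO

def open_text(path: str, mode: str) -> TextIO:
    # Reject binary modes so the result is always a text stream.
    assert 'b' not in mode
    return open(path, mode)

def open_binary(path: str, mode: str) -> BinaryIO:
    # The caller passes a mode without 'b'; it is appended here.
    assert 'b' not in mode
    return open(path, mode + 'b')

JukkaL commented 5 years ago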

Some use cases (such as open) can now be supported pretty well by using overloads and literal types (PEP 586).
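A minimal sketch of that approach, assuming Python 3.8+ for typing.Literal. The function open_buffer is a hypothetical in-memory stand-in for open() that supports only two modes, so that the example stays self-contained; it is not how typeshed declares open().

```python
import io
from typing import BinaryIO, Literal, TextIO, Union, overload


@overload
def open_buffer(data: bytes, mode: Literal["r"]) -> TextIO: ...
@overload
def open_buffer(data: bytes, mode: Literal["rb"]) -> BinaryIO: ...
def open_buffer(data: bytes, mode: str) -> Union[TextIO, BinaryIO]:
    """Return a text or binary stream over `data`, depending on `mode`."""
    if mode == "rb":
        return io.BytesIO(data)
    if mode == "r":
        return io.StringIO(data.decode("utf-8"))
    raise ValueError(f"unsupported mode: {mode!r}")


# With a literal mode, a type checker can pick the matching overload,
# so neither assignment needs a cast or an isinstance check.
text: str = open_buffer(b"hello", "r").read()
raw: bytes = open_buffer(b"hello", "rb").read()
```

When the mode is a plain str variable rather than a literal, neither overload applies, which is the part of the problem that overloads alone do not solve.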

hauntsaninja commented 4 years ago

This continues to crop up with typeshed pretty regularly. For a lot of return types, typeshed has to either make a pretty opinionated decision or completely forgo type safety with Any.

One use case I find pretty compelling is for autocomplete in IDEs, eg: https://github.com/python/typeshed/issues/4511 I seem to recall PyCharm used unsafe unions and I'd imagine this is a big reason why.

From a typeshed perspective, it would be nice to support these use cases. From a mypy perspective, I agree it's maybe not worth the effort, so maybe type checkers could interpret a potential AnyOf as a plain Any.

gvanrossum commented 4 years ago

Maybe you could bring this up on typing-sig? A proto-pep might get support there.

Or maybe you can spell this using -> Annotated[Any, T1, T2] where T1, T2 are the possible return types? Then type checkers will treat it as Any but other tools could in theory interpret this as AnyOf[T1, T2]. Or is that too hacky?
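A sketch of how that spelling could look and how a tool might read the candidate types back, assuming Python 3.9+ for typing.Annotated. The function parse_value and the convention that the extra Annotated arguments list the possible return types are assumptions made for this illustration; no existing tool is known to interpret them this way.

```python
from datetime import date
from typing import Annotated, Any, get_args, get_type_hints


def parse_value(text: str) -> Annotated[Any, str, date]:
    """Return a date for ISO-formatted input, otherwise the input string."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        return text


# Type checkers see plain Any; other tools can inspect the metadata.
hints = get_type_hints(parse_value, include_extras=True)
base, *candidates = get_args(hints["return"])
```

Here base is Any and candidates holds the types that an AnyOf-aware tool, such as an IDE offering completions, could use.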

srittau commented 4 years ago

I brought this up on typing-sig.

JelleZijlstra commented 3 years ago

Semantically, would AnyOf be equivalent to Intersection as proposed in #213? My intuition is yes: an operation on an AnyOf type should be valid if it is valid on at least one of the component types.

hauntsaninja commented 3 years ago

Guido's thoughts on the subject: https://mail.python.org/archives/list/typing-sig@python.org/message/TTPVTIKZ6BFVWZBUYR2FN2SPGB63Z7PH/ (edited out misleading tldr)

There's probably also some slightly different behaviour when intersecting slightly incompatible types. E.g., for an intersection maybe you'd want to treat intersection order like an MRO, but for AnyOf you'd probably want "is compatible with any of the intersection"

JelleZijlstra commented 3 years ago

I see, thanks for reminding me of that email! I suppose this matters when you're implementing a function with an AnyOf return type. In typeshed we could just write:

def some_func() -> Intersection[bytes, str]: ...

And it would work as expected.

But when implementing it, you'd write:

def some_func() -> Intersection[bytes, str]:
    if something:
        return bytes()
    else:
        return str()

And a type checker would flag the return type as incompatible. So in Guido's terminology, AnyOf would have to behave like Union in a "receiving" position and like Intersection in a "giving" position.

srittau commented 3 years ago

Maybe I'm misunderstanding how intersections are supposed to work, but to me an intersection type is a type that fulfills the protocol of multiple other types. At least that's how e.g. TypeScript and Ivan's initial comment in #213 describe it. Intersection[bytes, str] wouldn't make much sense to me, because it would mean that the returned type is both a str and a bytes. An intersection lets you "compose" multiple types into one, which is why I like the Foo & Bar syntax for it (also like TypeScript, and in comparison to | for union).

srittau commented 3 years ago

And that means that AnyOf does not have much relation to intersections. Like Union, it's more meant to be an "either/or" situation. For example, the following would work with AnyOf, but not with Union (which is why AnyOf is unsafe, but still much safer than Any):

def foo(b: bytes): ...

x: AnyOf[str, bytes]
y: str | bytes
foo(x)  # ok
foo(y)  # error
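
For comparison, a minimal sketch of what a caller has to write today when the function is annotated with a plain Union. The function load is hypothetical; foo mirrors the example above but returns a value so the result can be observed.

```python
from typing import Union, cast


def foo(b: bytes) -> int:
    return len(b)


def load(binary: bool) -> Union[str, bytes]:
    """Return bytes or str depending on the argument."""
    return b"abc" if binary else "abc"


y = load(True)
# foo(y) would be rejected by a type checker, because y may be a str.

# Option 1: narrow at runtime before the call.
narrowed = foo(y) if isinstance(y, bytes) else -1

# Option 2: assert what the caller knows; cast() is not checked at runtime.
casted = foo(cast(bytes, load(True)))
```

AnyOf would make both workarounds unnecessary for this call, at the cost of also accepting it when load() actually returns a str.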

srittau commented 3 years ago

And sorry for the spam, but one last thought:

For the caller of a function, there is no difference whether an argument is annotated with AnyOf or Union. In fact, I can't think of a reason why an argument should ever be annotated with it. It's mostly a tool for return types.

gvanrossum commented 3 years ago

For the caller of a function, there is no difference whether an argument is annotated with AnyOf or Union. In fact, I can't think of a reason why an argument should ever be annotated with it. It's mostly a tool for return types.

It would be for the benefit of the callee. Conversely, for the callee there's no reason to return an AnyOf, since for them a Union works as well.

Taking your example, the connection between AnyOf and Intersection is that if we had

def foo(b: bytes): ...

x: str & bytes
foo(x)

would work as well. But presumably, to give x a value, you'd want either of these to work:

x: AnyOf[str, bytes]
x = ""  # ok
x = b""  # also ok

And there it behaves like Union. Combining this, we can have:

def foo(b: bytes): ...
x: AnyOf[str, bytes]

# This works:
x = b""
foo(x)

# This works too (i.e. doesn't cause a static type error):
x = ""
foo(x)

At this point I would just start repeating what I said in that email, so I'll stop here.

srittau commented 3 years ago

So to recap. For argument types:

# For the callee (receiver), AnyOf and Intersection are equivalent:
def foo1(x: AnyOf[str, StringIO]):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)
def foo2(x: str & StringIO):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)
# But Union isn't:
def foo3(x: str | StringIO):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)  # error

# But for the caller (giver) AnyOf and Union are equivalent:
foo1("")  # AnyOf ok
foo2("")  # Intersection error
foo3("")  # Union ok

For return types:

# For the callee (giver), AnyOf and Union are equivalent:
def foo1() -> AnyOf[str, bytes]:
    return ""  # ok
def foo2() -> str & bytes:
    return ""  # error
def foo3() -> str | bytes:
    return ""  # ok

# For the caller (receiver), AnyOf and Intersection are equivalent:
x1: str = foo1()  # AnyOf ok
x2: str = foo2()  # Intersection ok
x3: str = foo3()  # Union error

Which means that in stubs (where there's no callee), Union and Intersection are sufficient, but AnyOf would still be needed for implementations. (Jelle's point, I think.)
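
The recap can be reproduced with a toy assignability check. This is a deliberately simplified model over plain classes, written only to mirror the ok/error comments above; UnionT, IntersectionT, AnyOfT, and assignable are hypothetical names, and no real type checker is claimed to work this way.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class UnionT:
    members: Tuple[type, ...]


@dataclass(frozen=True)
class IntersectionT:
    members: Tuple[type, ...]


@dataclass(frozen=True)
class AnyOfT:
    members: Tuple[type, ...]


TypeExpr = Union[type, UnionT, IntersectionT, AnyOfT]


def assignable(source: TypeExpr, target: TypeExpr) -> bool:
    """True if a value typed `source` may be used where `target` is expected."""
    # Receiving side: what may be done with a value of type `source`?
    if isinstance(source, UnionT):
        return all(assignable(m, target) for m in source.members)
    if isinstance(source, (IntersectionT, AnyOfT)):
        return any(assignable(m, target) for m in source.members)
    # Giving side: `source` is a plain class; may it be passed as `target`?
    if isinstance(target, (UnionT, AnyOfT)):
        return any(assignable(source, m) for m in target.members)
    if isinstance(target, IntersectionT):
        return all(assignable(source, m) for m in target.members)
    return issubclass(source, target)


pair = (str, bytes)
kinds = (AnyOfT, IntersectionT, UnionT)
# The caller receives a return value and assigns it to a `str` variable.
receive = {kind.__name__: assignable(kind(pair), str) for kind in kinds}
# The callee gives a `str` as the return value.
give = {kind.__name__: assignable(str, kind(pair)) for kind in kinds}
```

In this model AnyOfT shares the receiving branch with IntersectionT and the giving branch with UnionT, which is the asymmetry the recap describes.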

Akuli commented 3 years ago

I don't understand why def foo() -> str & bytes should be any different from def foo() -> NoReturn. For -> str & bytes, a type checker could deduce that because an object can't be string and bytes at the same time, the function cannot return any value.

jakebailey commented 3 years ago

I'd agree, though in pyright we explicitly have Never for that type (as I think NoReturn has different semantics, but maybe not). Comparing TS intersections with non-overlapping types:

[image]

(Overall, I agree that given the way it's been described, AnyOf is not an intersection type; more of a workaround for people not liking to return Union because existing code is too trusting of functions with behavior that's hard to capture with overloads, and never actually verifies that they got the thing they wanted.)

BvB93 commented 2 years ago

Two more cases wherein AnyOf would be very useful:

Dealing with overload ambiguity

@overload
def func(a: Sequence[int]) -> str: ...
@overload
def func(a: Sequence[str]) -> int: ...

This recently came up in https://github.com/python/mypy/issues/11347: whenever ambiguous overloads are encountered, e.g. when Sequence[Any] is passed in the example function above, mypy will generally return Any as it cannot safely pick either one of the overloads. With AnyOf this could be replaced with, e.g., AnyOf[str, int], which would provide quite a bit more type safety compared to plain Any.
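
A runnable sketch of the ambiguous call, using the overloads above. The implementation body and the use of json.loads to produce a List[Any] are assumptions added for illustration; the statement that the result is inferred as Any comes from the description above.

```python
import json
from typing import Any, List, Sequence, Union, overload


@overload
def func(a: Sequence[int]) -> str: ...
@overload
def func(a: Sequence[str]) -> int: ...
def func(a: Union[Sequence[int], Sequence[str]]) -> Union[str, int]:
    """Join ints into a string, or total the lengths of the strings."""
    if all(isinstance(item, int) for item in a):
        return ",".join(str(item) for item in a)
    return sum(len(str(item)) for item in a)


untyped: List[Any] = json.loads("[1, 2, 3]")
result = func(untyped)  # both overloads match a List[Any] argument

# Without AnyOf[str, int], nothing stops result from being used as, say,
# a list; the caller has to narrow by hand to get any checking back.
parts = result.split(",") if isinstance(result, str) else []
```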

Reduction of numpy arrays

The second case is more related to a peculiarity of numpy, as operations involving numpy will rarely return 0D arrays, aggressively converting the latter into their corresponding scalar type. This becomes problematic when reductions are involved, especially ones over multiple axes as this requires detailed knowledge of the original array-like objects' dimensionality.

While the variadics of PEP 646 (and any follow-ups) should alleviate this issue somewhat, there will, realistically, remain a sizable subset of array-like objects and axes (SupportsIndex | Sequence[SupportsIndex]) combinations wherein the best we can currently do is return Any. Replacing this with, for example, AnyOf[float64, NDArray[float64]] would be a massive improvement, especially since the signatures of numpy's scalar and array types have a pretty large overlap.

Avasam commented 1 year ago

How would AnyOf[SomeClass, Any/Unknown] be treated?

I ask because of complex cases like https://github.com/python/typeshed/pull/9461, where we could do the following without having to rely on installing all 4 libraries.

from PyQt6.QtGui import QImage as PyQt6_QImage  # type: ignore[import]
from PyQt5.QtGui import QImage as PyQt5_QImage  # type: ignore[import]
from PySide6.QtGui import QImage as PySide6_QImage  # type: ignore[import]
from PySide2.QtGui import QImage as PySide2_QImage  # type: ignore[import]

def foo() -> AnyOf[PyQt6_QImage, PyQt5_QImage, PySide6_QImage, PySide2_QImage]: ...

Or in a non-stub file with type inference:

try:
    from PyQt6.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PyQt5.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PySide6.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PySide2.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass

# If inference is not feasible, an explicit AnyOf return type like the one above would do.
def foo():
    return QImage()

Avasam commented 1 year ago

Thoughts for a different approach: if AnyOf is really only useful for being permissive on return types (and avoiding a bunch of manual type narrowing), could type checkers simply have an option to treat unions in return types as permissive unions?

This way you can keep annotating the types accurately. No need for a new user-facing type to juggle with. And let the users choose whether they want total strictness or be more permissive.

Pytype tends to err on the permissive side. With mypy, this can probably already be done with a plugin.

Akuli commented 1 year ago

Sometimes you want to return the usual union: if a function returns str | None and you forget to check for None, that should be an error.
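
A minimal sketch of that situation, with a hypothetical helper first_word. It shows the check that a strict union forces, and what happens at runtime when the check is skipped, which is what a blanket permissive treatment of return unions would stop flagging.

```python
import re
from typing import Optional


def first_word(text: str) -> Optional[str]:
    """Return the leading word of `text`, or None if there is none."""
    match = re.match(r"\w+", text)
    return match.group(0) if match else None


word = first_word("hello world")
# A strict union makes the type checker insist on this None check.
shout = word.upper() if word is not None else ""

# Skipping the check fails only at runtime, and only for some inputs.
try:
    first_word("   ").upper()  # type: ignore[union-attr]
    failed = False
except AttributeError:
    failed = True
```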