Open srittau opened 6 years ago
Another use case, at least until we get a more flexible Callable syntax, is for optional arguments in callback functions, like WSGI's start_response():
StartResponse = AnyOf[
    Callable[[str, List[Tuple[str, str]]], None],
    Callable[[str, List[Tuple[str, str]], ExcInfo], None]
]

def create_start_response() -> StartResponse:
    ...

create_start_response()("200 OK", [])
Using Union, this causes a type error (too few arguments).
This would be as unsafe as Any though, right? E.g. def foo() -> Union[int, str] -- we have no idea whether foo() + 1 is safe or not. Sure, it can tell you that foo().append(1) is definitely wrong, but that's pretty minor.
Similarly we won't know if create_start_response()("200 OK", [], sys.exc_info()) will be accepted or not. If you meant to describe a situation where it returns a callback that can be called with or without the extra argument, there's already a solution: https://mypy.readthedocs.io/en/latest/additional_features.html?highlight=mypy_extensions#extended-callable-types.
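The "callable with or without the extra argument" case can also be written as a callback protocol using typing.Protocol, which may be easier to try out than the mypy_extensions syntax. This is only a minimal sketch: the ExcInfo alias and the recorded_calls list are assumptions made for illustration and are not part of WSGI or of the comment above.

```python
import sys
from types import TracebackType
from typing import List, Optional, Protocol, Tuple, Type

# Assumed alias that mirrors the shape of sys.exc_info().
ExcInfo = Tuple[
    Optional[Type[BaseException]],
    Optional[BaseException],
    Optional[TracebackType],
]


class StartResponse(Protocol):
    # exc_info has a default, so both call shapes are accepted.
    def __call__(
        self,
        status: str,
        headers: List[Tuple[str, str]],
        exc_info: Optional[ExcInfo] = None,
    ) -> None: ...


# Hypothetical call log, only here so the sketch has observable behaviour.
recorded_calls: List[Tuple[str, bool]] = []


def create_start_response() -> StartResponse:
    def start_response(
        status: str,
        headers: List[Tuple[str, str]],
        exc_info: Optional[ExcInfo] = None,
    ) -> None:
        recorded_calls.append((status, exc_info is not None))

    return start_response


create_start_response()("200 OK", [])
create_start_response()("500 Internal Server Error", [], sys.exc_info())
```

With this spelling the caller does not need AnyOf at all: the optional third argument is part of a single, precise callback type.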
Good to know that there is a proper solution for the callback case!
Personally, I think the improvement in type safety over just returning Any would be worth it. It surely can't catch all problematic cases, but some is better than none. And the issue seems to crop up from time to time, for example just after I opened this in python/typeshed#2271. It is also noteworthy, I think, that the problem of returning Unions is explicitly mentioned in the docs.
In my experience I never had a situation where I needed unsafe unions. Anyway, I could imagine some people might want it. However, the problem is that the benefits of unsafe unions are out of proportion to the amount of work needed to implement them. Adding a new kind of type to mypy, for example, could take months of intense work. This one will be as hard as introducing intersection types, and while the latter are more powerful (they cover most use cases of unsafe unions) we still hesitate to start working on them.
I'm with Ivan, and this is something we've considered earlier -- see the discussion at https://github.com/python/mypy/issues/1693, for example. The relatively minor benefits don't really seem worth the extra implementation work and complexity.
This would only be a potentially good idea for legacy APIs that can't be typed precisely right now, and the most popular of those can be special cased by tools (e.g. through mypy plugins). Mypy already special cases open and a few other stdlib functions. Alternatively, we might be able to use some other type system improvements, such as literal types, once they become available. For new code and new APIs the recommended way is to avoid signatures that would require the use of AnyOf anyway.
Ad-hoc extensions have the benefit of being easy to implement. They are also modular, don't complicate the rest of the type system, and they potentially allow inferring precise return types.
There is also often a simple workaround -- write a wrapper function, with a precise return type, around the library function that returns Any, restricting the arguments so that the return type can be known statically. Example:
def open_text(path: str, mode: str) -> TextIO:
    assert 'b' not in mode
    return open(path, mode)

def open_binary(path: str, mode: str) -> BinaryIO:
    assert 'b' not in mode
    return open(path, mode + 'b')
Some use cases (such as open) can now be supported pretty well by using overloads and literal types (PEP 586).
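As a rough illustration of the overload-plus-Literal approach, here is a sketch for a hypothetical open_file wrapper. The function name and the small set of supported modes are assumptions; typeshed's real stubs for open cover many more mode strings.

```python
import os
import tempfile
from typing import IO, Any, BinaryIO, Literal, TextIO, overload


@overload
def open_file(path: str, mode: Literal["r", "w"]) -> TextIO: ...
@overload
def open_file(path: str, mode: Literal["rb", "wb"]) -> BinaryIO: ...
def open_file(path: str, mode: str) -> IO[Any]:
    # Callers only see the overloads above; the implementation stays loose.
    return open(path, mode)


path = os.path.join(tempfile.mkdtemp(), "example.txt")

# A type checker picks the TextIO overload for "w" ...
with open_file(path, "w") as text_file:
    text_file.write("hello")

# ... and the BinaryIO overload for "rb", so read() is known to return bytes.
with open_file(path, "rb") as binary_file:
    data = binary_file.read()
```

This only helps when the mode is a literal at the call site. A mode computed at runtime matches neither overload, which is where the discussion about AnyOf versus Any comes back in.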
This continues to crop up with typeshed pretty regularly. For a lot of return types, typeshed has to either make a pretty opinionated decision or completely forgo type safety with Any.
One use case I find pretty compelling is for autocomplete in IDEs, eg: https://github.com/python/typeshed/issues/4511 I seem to recall PyCharm used unsafe unions and I'd imagine this is a big reason why.
From a typeshed perspective, it would be nice to support these use cases. From a mypy perspective, I agree it's maybe not worth the effort, so maybe type checkers could interpret a potential AnyOf as a plain Any.
Maybe you could bring this up on typing-sig? A proto-pep might get support there.
Or maybe you can spell this using -> Annotated[Any, T1, T2] where T1, T2 are the possible return types? Then type checkers will treat it as Any but other tools could in theory interpret this as AnyOf[T1, T2]. Or is that too hacky?
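To make the Annotated idea concrete, here is a sketch of how a tool might recover the candidate types at runtime. The function load_value is hypothetical; get_type_hints(..., include_extras=True) and get_args are the standard typing helpers (Python 3.9+).

```python
from typing import Annotated, Any, get_args, get_type_hints


def load_value() -> Annotated[Any, str, bytes]:
    # Type checkers see plain Any here; str and bytes are just metadata.
    return "text"


hints = get_type_hints(load_value, include_extras=True)

# For Annotated[X, m1, m2], get_args returns (X, m1, m2).
base, *candidates = get_args(hints["return"])
```

Here base is Any and candidates holds str and bytes, which an IDE or documentation tool could choose to present as the possible return types.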
I brought this up on typing-sig.
Semantically, would AnyOf be equivalent to Intersection as proposed in #213? My intuition is yes: an operation on an AnyOf type should be valid if it is valid on at least one of the component types.
Guido's thoughts on the subject: https://mail.python.org/archives/list/typing-sig@python.org/message/TTPVTIKZ6BFVWZBUYR2FN2SPGB63Z7PH/
edited out misleading tldr
There's probably also some slightly different behaviour when intersecting slightly incompatible types. E.g., for an intersection maybe you'd want to treat intersection order like an MRO, but for AnyOf you'd probably want "is compatible with any of the intersection"
I see, thanks for reminding me of that email! I suppose this matters when you're implementing a function with an AnyOf return type. In typeshed we could just write:
def some_func() -> Intersection[bytes, str]: ...
And it would work as expected.
But when implementing it, you'd write:
def some_func() -> Intersection[bytes, str]:
    if something:
        return bytes()
    else:
        return str()
And a type checker would flag the return type as incompatible. So in Guido's terminology, AnyOf would have to behave like Union in a "receiving" position and like Intersection in a "giving" position.
Maybe I'm misunderstanding how intersections are supposed to work, but to me an intersection type is a type that fulfills the protocol of multiple other types. At least that's how e.g. TypeScript and Ivan's initial comment in #213 describe it. Intersection[bytes, str] wouldn't make much sense to me, because it would mean that the returned type is both a str and a bytes. An intersection lets you "compose" multiple types into one, which is why I like the Foo & Bar syntax for it (also like TypeScript and in comparison to | for union).
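Python has no Foo & Bar syntax today, but the "compose multiple types into one" reading can be approximated with a protocol that inherits from several protocols. A minimal sketch, where all class and function names are hypothetical:

```python
from typing import Protocol


class Readable(Protocol):
    def read(self) -> str: ...


class Closable(Protocol):
    def close(self) -> None: ...


class ReadableAndClosable(Readable, Closable, Protocol):
    """Stands in for Readable & Closable: a value must satisfy both."""


class InMemoryResource:
    # Matches both protocols structurally, without inheriting from them.
    def __init__(self, payload: str) -> None:
        self.payload = payload
        self.closed = False

    def read(self) -> str:
        return self.payload

    def close(self) -> None:
        self.closed = True


def consume(source: ReadableAndClosable) -> str:
    # Both read() and close() are known to exist on source.
    try:
        return source.read()
    finally:
        source.close()


resource = InMemoryResource("payload")
result = consume(resource)
```

Under this reading, an intersection value has all the capabilities of every component at once, which is different from the "either/or" situation AnyOf is meant to describe.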
And that means that AnyOf has not much relation to intersections. Like Union, it's more meant to be an "either/or" situation. For example, the following would work with AnyOf, but not with Union (which is why AnyOf is unsafe, but still much safer than Any):
def foo(b: bytes): ...
x: AnyOf[str, bytes]
y: str | bytes
foo(x) # ok
foo(y) # error
And sorry for the spam, but one last thought:
For the caller of a function, there is no difference whether an argument is annotated with AnyOf or Union. In fact, I can't think of a reason why an argument should ever be annotated with it. It's mostly a tool for return types.
> For the caller of a function, there is no difference whether an argument is annotated with AnyOf or Union. In fact, I can't think of a reason why an argument should ever be annotated with it. It's mostly a tool for return types.
It would be for the benefit of the callee. Conversely, for the callee there's no reason to return an AnyOf, since for them a Union works as well.
Taking your example, the connection between AnyOf and Intersection is that if we had
def foo(b: bytes): ...
x: str & bytes
foo(x)
would work as well. But presumably, to give x a value, you'd want either of these to work:
x: AnyOf[str, bytes]
x = "" # ok
x = b"" # also ok
And there it behaves like Union. Combining this, we can have:
def foo(b: bytes): ...
x: AnyOf[str, bytes]
# This works:
x = b""
foo(x)
# This works too (i.e. doesn't cause a static type error):
x = ""
foo(x)
At this point I would just start repeating what I said in that email, so I'll stop here.
So to recap. For argument types:
# For the callee (receiver), AnyOf and Intersection are equivalent:
def foo1(x: AnyOf[str, StringIO]):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)

def foo2(x: str & StringIO):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)

# But Union isn't:
def foo3(x: str | StringIO):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)  # error
# But for the caller (giver) AnyOf and Union are equivalent:
foo1("") # AnyOf ok
foo2("") # Intersection error
foo3("") # Union ok
For return types:
# For the callee (giver), AnyOf and Union are equivalent:
def foo1() -> AnyOf[str, bytes]:
    return ""  # ok

def foo2() -> str & bytes:
    return ""  # error

def foo3() -> str | bytes:
    return ""  # ok
# For the caller (receiver), AnyOf and Intersection are equivalent:
x1: str = foo1()  # AnyOf ok
x2: str = foo2()  # Intersection ok
x3: str = foo3()  # Union error
Which means that in stubs (where there's no callee), Union and Intersection are sufficient, but AnyOf would still be needed for implementations. (Jelle's point, I think.)
I don't understand why def foo() -> str & bytes should be any different from def foo() -> NoReturn. For -> str & bytes, a type checker could deduce that because an object can't be str and bytes at the same time, the function cannot return any value.
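The claim that no object can be both str and bytes can be checked at runtime, at least in CPython, where the two built-in types have conflicting instance layouts. A small sketch (the class name StrAndBytes is hypothetical):

```python
# Attempt to define a class that is simultaneously a str and a bytes.
try:
    class StrAndBytes(str, bytes):
        pass
except TypeError as exc:
    # CPython rejects the class definition itself.
    conflict = str(exc)
else:
    conflict = None
```

Because such a class cannot even be created, the set of values inhabiting str & bytes is empty, which is the basis for treating it like a function that never returns a value.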
I'd agree, though in pyright we explicitly have Never for that type (as I think NoReturn has different semantics, but maybe not). Comparing TS intersections with non-overlapping types:
(Overall, I agree that given the way it's been described, AnyOf is not an intersection type; more of a workaround for people not liking to return Union because existing code is too trusting of functions with behavior that's hard to capture with overloads, and never actually verifies that they got the thing they wanted.)
Two more cases where AnyOf would be very useful:
@overload
def func(a: Sequence[int]) -> str: ...
@overload
def func(a: Sequence[str]) -> int: ...
This recently came up in https://github.com/python/mypy/issues/11347: whenever ambiguous overloads are encountered, e.g. when Sequence[Any] is passed in the example function above, mypy will generally return Any as it cannot safely pick either one of the overloads. With AnyOf this could be replaced with, e.g., AnyOf[str, int], which would provide quite a bit more type safety compared to plain Any.
The second case is more related to a peculiarity of numpy, as operations involving numpy will rarely return 0D arrays, aggressively converting the latter into their corresponding scalar type. This becomes problematic when reductions are involved, especially ones over multiple axes, as this requires detailed knowledge of the original array-like objects' dimensionality.
While the variadics of PEP 646 (and any follow-ups) should alleviate this issue somewhat, there will, realistically, remain a sizable subset of combinations of array-like objects and axes (SupportsIndex | Sequence[SupportsIndex]) where the best we can currently do is return Any. Replacing this with, for example, AnyOf[float64, NDArray[float64]] would be a massive improvement, especially since the signatures of numpy's scalar and array types have a pretty large overlap.
How would AnyOf[SomeClass, Any/Unknown] be treated?
I ask because of complex cases like this: https://github.com/python/typeshed/pull/9461, where we could do the following without having to rely on installing all 4 libraries:
from PyQt6.QtGui import QImage as PyQt6_QImage # type: ignore[import]
from PyQt5.QtGui import QImage as PyQt5_QImage # type: ignore[import]
from PySide6.QtGui import QImage as PySide6_QImage # type: ignore[import]
from PySide2.QtGui import QImage as PySide2_QImage # type: ignore[import]
def foo() -> AnyOf[PyQt6_QImage, PyQt5_QImage, PySide6_QImage, PySide2_QImage]: ...
Or in a non-stub file with type inference:
try:
    from PyQt6.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PyQt5.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PySide6.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PySide2.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass

# If inference is not feasible, an explicit AnyOf return type like the one above would do.
def foo():
    return QImage()
Thoughts for a different approach:
If AnyOf is really only useful to be permissive on return types (and avoid having to do a bunch of manual type-narrowing), then could type checkers simply have an option to treat unions in return types as permissive unions?
This way you can keep annotating the types accurately. No need for a new user-facing type to juggle with. And let the users choose whether they want total strictness or be more permissive.
Pytype tends to err on the permissive side. In mypy this can probably already be done with a plugin.
Sometimes you want to return the usual union: if a function returns str | None and you forget to check for None, that should be an error.
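A small sketch of the strict-union case described here, using hypothetical names (find_email, email_domain and the _EMAILS table are made up for illustration):

```python
from typing import Dict, Optional

# Hypothetical lookup table for the example.
_EMAILS: Dict[str, str] = {"alice": "alice@example.com"}


def find_email(user: str) -> Optional[str]:
    # Returns None for unknown users, so callers must handle that case.
    return _EMAILS.get(user)


def email_domain(user: str) -> str:
    email = find_email(user)
    # Without this check a type checker should reject email.split(...),
    # because None has no split() method.
    if email is None:
        return "<unknown>"
    return email.split("@")[1]
```

If return-type unions were treated permissively across the board, dropping the None check would no longer be flagged, which is the concern raised above.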
Sometimes a function or method can return one of several types, depending on the passed in arguments or external factors. Best example is probably open(), but there are other examples in the standard library, like shutil.copy() [1]. In many of those cases, the caller knows what return type to expect.

Currently there are several options, but none of them is really satisfactory:

- @overload. This is the best solution if it can be used. But that is often not the case, like in the examples above.
- Union as the return type. This is usually not recommended, since it means that the caller needs to use isinstance() to use the return type.
- Any as the return type. This is currently best practice in those cases, but of course provides no type safety at all.

Therefore, I propose to add another type, for example AnyOf[...] that acts like Union, but can be used everywhere any of its type arguments could be used.

I also think that the documentation should make it clear that using AnyOf is a code smell.

[1] Currently the type behaviour in shutil is broken in my opinion, but that does not change the fact that currently it is as it is.
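To illustrate what a Union return type costs the caller today, here is a sketch with a hypothetical read_payload function. It shows the two usual ways out: narrowing with isinstance(), or asserting the expected type with typing.cast, which performs no runtime check.

```python
from typing import Union, cast


def read_payload(binary: bool) -> Union[str, bytes]:
    # Hypothetical function whose return type depends on an argument.
    return b"data" if binary else "data"


# Option 1: narrow with isinstance() before using type-specific methods.
payload = read_payload(binary=False)
if isinstance(payload, str):
    upper = payload.upper()
else:
    upper = payload.decode("ascii").upper()

# Option 2: the caller "knows" the type and says so with cast().
# This silences the type checker but is not verified at runtime.
raw = cast(bytes, read_payload(binary=True))
length = len(raw)
```

The proposed AnyOf would sit between these: the caller could use the value as either type without narrowing or casting, while clearly wrong uses would still be rejected.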