python / mypy

Optional static typing for Python

Consider special casing zip(*tuples) #5247

Open JukkaL opened 6 years ago

JukkaL commented 6 years ago

This seems to be a somewhat common zip idiom:

>>> a = [('a', 1), ('b', 2), ('c', 3)]
>>> list(zip(*a))
[('a', 'b', 'c'), (1, 2, 3)]

The inferred type for zip(*a) is Iterator[Tuple[Any, ...]], which causes false negatives in multiple assignments like this:

x, y = zip(*a)

Maybe we could add a plugin feature that can provide more precise types for zip(*a) in a multiple assignment context somehow. For example, the inferred type in the above example could be Tuple[Tuple[str, ...], Tuple[int, ...]] instead of Iterator[...], but only in the context of a multiple assignment since elsewhere the inferred type is not safe to use.
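To make the false negative concrete, here is a minimal sketch. The helper name `first_plus_one` is made up for illustration, and the claim about the checker assumes default settings, where values of type Any are not reported.

```python
from typing import List, Tuple

a: List[Tuple[str, int]] = [("a", 1), ("b", 2), ("c", 3)]

# Both names are inferred as Tuple[Any, ...], so the element types are lost.
letters, numbers = zip(*a)


def first_plus_one() -> int:
    # Hypothetical helper: at runtime letters[0] is a str, so this raises
    # TypeError, but a checker that only sees Any has nothing to report.
    return letters[0] + 1
```

With the more precise type suggested above, `letters[0] + 1` would be a `str + int` operation that the checker could reject.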

gvanrossum commented 6 years ago

Or we could have an (internal) representation for a diversified list? Somehow similar to TypedDict.

JukkaL commented 6 years ago

In this case we'd need an iterator over a tuple of items. For example, TupleIterator[int, str] could be an iterator that can be iterated exactly twice, and the first item will be an integer and the second one a string. I'm skeptical about this being useful enough to be worth adding as a type system extension, since any new type requires a substantial amount of work and the complexity of the type system grows in a non-linear fashion. It might be enough to add ad hoc support for the few most common use cases through plugins.
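As a rough runtime illustration of what such a type would have to describe (this is plain Python, not an existing mypy feature, and `TupleIterator` remains hypothetical): an iterator over a fixed-shape tuple yields a known number of items whose types differ by position, which an ordinary `Iterator[T]` with a single element type cannot express.

```python
from typing import Tuple

pair: Tuple[int, str] = (1, "one")

# The iterator yields exactly len(pair) items, one per tuple position.
it = iter(pair)
first = next(it)    # the int at position 0
second = next(it)   # the str at position 1

# A third call finds the iterator exhausted; the default avoids StopIteration.
exhausted = next(it, None) is None
```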

gvanrossum commented 6 years ago

OK, got it.

elazarg commented 6 years ago

Why isn't it a TupleType? I think the concept of TupleType should have little to do with tuples specifically; tuple is just a class that we know behaves like one. That is why we have a fallback, right? The type is an intersection, so this iterator would be the intersection of the iterator protocol and a TupleType.

JukkaL commented 6 years ago

@elazarg An interesting idea! It's not strictly a tuple type since tuples support indexing with integers but this doesn't work with an iterator. However, we could perhaps add an extra check so that TupleType indexing only works if the fallback provides __getitem__. TupleType would mean "supports tuple interface for all methods provided by the fallback type". We'd need to do this for each special case operation supported for TupleType, which would complicate things a bit.

This would still need a plugin since there's no syntax for defining a type that is tuple-like but doesn't extend the tuple class.
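The runtime difference behind the indexing caveat can be checked directly. This small sketch only uses built-ins; the variable names are arbitrary.

```python
rows = [("a", 1), ("b", 2)]
columns = zip(*rows)

# A zip object implements the iterator protocol but not indexing.
supports_next = hasattr(columns, "__next__")
supports_indexing = hasattr(columns, "__getitem__")

# A real tuple does support indexing.
tuple_supports_indexing = hasattr(rows[0], "__getitem__")

# Multiple assignment only needs iteration, so it works on the iterator.
letters, numbers = columns
```

This is why a tuple-like type for `zip(*a)` would have to gate operations such as indexing on what the fallback type actually provides.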

CarliJoy commented 6 months ago

Any news on this?

For the moment I helped myself with this (limited) function. I tried TypeVarTuple but couldn't get it to work.

# Requires Python 3.10+ (zip(strict=...) and X | Y in annotations).
from typing import Iterable, TypeVar, overload

T1 = TypeVar("T1")
T2 = TypeVar("T2")
T3 = TypeVar("T3")
T4 = TypeVar("T4")
T5 = TypeVar("T5")

# One overload per supported tuple length (2 to 5 columns).
@overload
def transpose(
    iterable: Iterable[tuple[T1, T2, T3, T4, T5]], strict: bool = False
) -> tuple[Iterable[T1], Iterable[T2], Iterable[T3], Iterable[T4], Iterable[T5]]: ...

@overload
def transpose(
    iterable: Iterable[tuple[T1, T2, T3, T4]], strict: bool = False
) -> tuple[Iterable[T1], Iterable[T2], Iterable[T3], Iterable[T4]]: ...

@overload
def transpose(
    iterable: Iterable[tuple[T1, T2, T3]], strict: bool = False
) -> tuple[Iterable[T1], Iterable[T2], Iterable[T3]]: ...

@overload
def transpose(
    iterable: Iterable[tuple[T1, T2]], strict: bool = False
) -> tuple[Iterable[T1], Iterable[T2]]: ...

def transpose(
    iterable: (
        Iterable[tuple[T1, T2]]
        | Iterable[tuple[T1, T2, T3]]
        | Iterable[tuple[T1, T2, T3, T4]]
        | Iterable[tuple[T1, T2, T3, T4, T5]]
    ),
    strict: bool = False,
) -> (
    tuple[Iterable[T1], Iterable[T2]]
    | tuple[Iterable[T1], Iterable[T2], Iterable[T3]]
    | tuple[Iterable[T1], Iterable[T2], Iterable[T3], Iterable[T4]]
    | tuple[Iterable[T1], Iterable[T2], Iterable[T3], Iterable[T4], Iterable[T5]]
):
    """
    Transpose the elements of the given iterable in a type-safe way.

    Only a typed shortcut for zip(*iterable)
    See for background
    """
    # zip returns an iterator rather than a tuple, hence the ignore;
    # unpacking the result in a multiple assignment still works.
    return zip(*iterable, strict=strict)  # type: ignore
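
For comparison, a self-contained two-column sketch of the same idea follows. `transpose_pairs` is a hypothetical name, not part of any library, and the `cast` is the unchecked step that stands in for the `# type: ignore` above. Unlike a bare `x, y = zip(*pairs)`, this variant also handles empty input, which would otherwise fail to unpack.

```python
from typing import Iterable, Tuple, TypeVar, cast

A = TypeVar("A")
B = TypeVar("B")


def transpose_pairs(
    pairs: Iterable[Tuple[A, B]],
) -> Tuple[Tuple[A, ...], Tuple[B, ...]]:
    """Hypothetical helper: split an iterable of pairs into two columns."""
    columns = tuple(zip(*pairs))
    if not columns:
        # zip() with no arguments yields nothing, so return two empty columns.
        return (), ()
    # The checker cannot verify this shape; the cast asserts it by hand.
    return cast(Tuple[Tuple[A, ...], Tuple[B, ...]], columns)


names, counts = transpose_pairs([("a", 1), ("b", 2), ("c", 3)])
```

Here `names` should be seen as `Tuple[str, ...]` and `counts` as `Tuple[int, ...]` by a type checker, at the cost of one such helper per tuple length.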

Cross-posted on Stack Overflow.

finite-state-machine commented 4 months ago

In case it's helpful to have a gist for this:

from __future__ import annotations
from typing import List, Tuple
from typing_extensions import assert_type, reveal_type

def is_even(value: int) -> bool:
    return not (value % 2)

table = [(i, is_even(i), str(i)) for i in range(20)]
assert_type(table, List[Tuple[int, bool, str]])

numbers, evens, strs = zip(*table)

#                         we get...         we hoped for...
#                         ───────────────   ────────────────
reveal_type(numbers)    # Tuple[Any, ...]   Tuple[int, ...]
reveal_type(evens)      # Tuple[Any, ...]   Tuple[bool, ...]
reveal_type(strs)       # Tuple[Any, ...]   Tuple[str, ...]

# these statements hold true at runtime:
assert numbers == tuple(range(20))
assert evens == tuple(is_even(i) for i in range(20))
assert strs == tuple(str(i) for i in range(20))

# these all fail today:
assert_type(numbers, Tuple[int, ...])
assert_type(evens, Tuple[bool, ...])
assert_type(strs, Tuple[str, ...])
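
Until something like this is supported, one possible workaround is to build each column separately. This is only a sketch: it assumes the input can be iterated more than once (a list, not a one-shot iterator), and it walks the table once per column. Indexing a fixed-shape tuple with a literal index keeps the element type, so a checker should be able to infer the precise column types.

```python
from typing import List, Tuple

table: List[Tuple[int, bool, str]] = [
    (i, i % 2 == 0, str(i)) for i in range(20)
]

# One pass per column; each generator has a single, known element type.
numbers = tuple(row[0] for row in table)   # expected: Tuple[int, ...]
evens = tuple(row[1] for row in table)     # expected: Tuple[bool, ...]
strs = tuple(row[2] for row in table)      # expected: Tuple[str, ...]
```

At runtime the three columns are equal to what `zip(*table)` produces.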