Already subscripted generic types incorrectly has `__class_getitem__` when inferred

yilei commented 2 years ago

Steps to reproduce

See following code:

import astroid
node = astroid.extract_node(
    """
    import typing
    typing.Tuple[int]  #@
    """)
for inferred in node.infer():
    print('inferred:', inferred)
    print('has __class_getitem__', '__class_getitem__' in inferred)

Current behavior

It prints out:

inferred: ClassDef.Tuple(name='Tuple',
               doc=None,
               decorators=None,
               bases=[<ClassDef.tuple l.0 at 0x7f9a6fb8b790>],
               keywords=[],
               body=[])
has __class_getitem__ True

Expected behavior

__class_getitem__ should not be set in locals because typing.Tuple[int] itself is not subscriptable.

`python -c "from astroid import pkginfo; print(pkginfo.version)"` output

2.9.3

yilei commented 2 years ago

Adding more context: we have a custom lint checker that wants to warn the used of non-subscripted generic types in annotations. For example:

CustomType1 = typing.Tuple
CustomType2 = typing.Tuple[int]

def func(a: CustomType1, b: CustomType2):
  pass

When we analyze func, we need to infer CustomType1 and CustomType2. But they are identical and we can't tell if it's already subscripted or not. Looks like this is due to this code where when it sees a Subscript, it simply tells it to ignore and let it infer the non-subscripted node.

gagern commented 2 years ago

@yilei you wrote that

__class_getitem__ should not be set in locals because typing.Tuple[int] itself is not subscriptable.

I disagree. Most importantly, an already subscripted generic type needs to be subscriptable again if at least one of its arguments is a type variable. Example:

from typing import Tuple, TypeVar
T = TypeVar('T')
U = TypeVar('U')
X = Tuple[Tuple[T, U, int], U, T]
Y = X[str, bool]  # ← Subscripting a second time!
assert Y == Tuple[Tuple[str, bool, int], bool, str]

So there are situations where it makes sense to subscript an already subscripted generic type a second time.

As for the details: on my Python 3.9.9, typing.Tuple is an instance of typing._TupleType. So it wouldn't have a __class_getitem__ to begin with. https://github.com/python/cpython/blob/4dc746310bd37ad6b381f9176acd167d445f4385/Lib/typing.py#L2300 still appears to be that way. I could start with tuple which does have __class_getitem__. So does type(tuple[int]) which is of type types.GenericAlias. Yes, subscripting a second time raises a TypeError at run-time, but I think it is totally legitimate for the method to be statically available so that it can deal with the case of a non-empty list of type parameters.

As far as I can tell, for the kind of test that you describe you would be better off getting the value of the type annotation, and inspecting its __parameters__ property. https://www.python.org/dev/peps/pep-0585/#parameters-to-generics-are-available-at-runtime has more details on that. Seems it should be available starting with Python 3.9.

I don't know how all of this will fit in with evaluation under Astroid, nor what it would take to make this work on older versions of Python. I have just started reading the code for that. Perhaps we can extend brain_typing.py to manipulate __parameters__ in the way Python 3.9+ would? That should allow keeping track of whether a type is fully specified or still generic.

yilei commented 2 years ago

@gagern you are right, this requires more advanced analysis.

In Python 3.9+, it also has the same issue for the builtin collections classes like dict, tuple, list etc now that they also support subscriptions.

pylint-dev / astroid