agronholm / typeguard

Run-time type checker for Python

from __future__ import annotations ignored by typeguard #426

Open MyKo101 opened 8 months ago

MyKo101 commented 8 months ago

Typeguard version

4.1.5

Python version

3.11.0

What happened?

When I type-hint a parametrised pandas Series (for example pd.Series[float]), typeguard fails with a TypeError on the generic annotation.

In my MWE, I have a function which takes a float pandas series as an argument and returns a float:

def col_sum(x: pd.Series[float]) -> float:
    return x.sum()

If I import this function after applying install_import_hook(), I get the following error:

>   def col_sum(x: pd.Series[float]) -> float:
E   TypeError: type 'Series' is not subscriptable

This error implies that pd.Series is being resolved at run time to the plain pandas class, which is not subscriptable, rather than to the generic type declared in pandas-stubs.

The error occurs even though my script file starts with from __future__ import annotations, which should prevent the annotation from being evaluated when the function is defined.
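To illustrate what postponed evaluation normally does, here is a minimal sketch that needs neither pandas nor typeguard. `Plain` is a hypothetical stand-in for a class that is not subscriptable at run time, and the module source is held in a string (`SOURCE`, also hypothetical) only so that the future import sits at the top of its own module. Defining the function succeeds because the annotation is stored as a string. The TypeError appears only once something evaluates that string, which is presumably what happens under the import hook.

```python
import typing

# Hypothetical module source; Plain stands in for a class that is not
# subscriptable at run time (as pandas.Series is in this report).
SOURCE = '''
from __future__ import annotations

class Plain:
    pass

def col_sum(x: Plain[float]) -> float:
    return 0.0
'''

namespace: dict[str, typing.Any] = {}
exec(SOURCE, namespace)  # defining col_sum does not raise
col_sum = namespace["col_sum"]

# The annotations are kept as unevaluated strings.
stored = dict(col_sum.__annotations__)

# Evaluating them, as a run-time checker has to, triggers the error.
try:
    typing.get_type_hints(col_sum)
    evaluation_failed = False
except TypeError:
    evaluation_failed = True
```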

For others with this issue, my current workaround is to define a type alias in my script file behind if TYPE_CHECKING:, but it is very clunky:

from typing import TYPE_CHECKING, TypeVar
T = TypeVar('T')
if TYPE_CHECKING:
    pds[T] = pd.Series[T] | pd.Series[T]
else:
    pds[T] = pd.Series | pd.Series

For context, I am running a test script with pytest. The tests pass normally, but pytest --typeguard-package=MyPackage fails with the above error. I need this workaround to pass both the mypy and typeguard checks in my test suite, which is based on the hypermodern template. Mypy requires a type parameter on the generic type pd.Series.

How can we reproduce the bug?

I have created an MWE package with the following file structure:

├>.venv
│ └─...
├>src
│ └>MyPackage
│   └─__init__.py
├─main.py
├─main_tg.py
└─pyproject.toml

The src/MyPackage/__init__.py file contains:

from __future__ import annotations
import pandas as pd

def col_sum(x: pd.Series[float]) -> float:
    return x.sum()

The main.py file contains:

from MyPackage import col_sum
from pandas import Series

col_sum(Series([1.0, 2.0, 3.0]))

The main_tg.py file contains:

from typeguard import install_import_hook
install_import_hook("MyPackage")
from MyPackage import col_sum
from pandas import Series

col_sum(Series([1.0, 2.0, 3.0]))

The pyproject.toml file contains:

[project]
name = "MyPackage"
version = "0.0.1"
description = "An MWE Package"

In the console (Windows), I have run:

py -m venv %cd%\.venv
.venv\Scripts\activate
pip install pandas
pip install typeguard
pip install -e .

Then running py main.py is fine, but py main_tg.py causes the error.

gboeing commented 8 months ago

Same problem here.

When my code uses from __future__ import annotations and pandas.Series, and pandas-stubs is installed in my Python environment, mypy and typeguard validation are in direct conflict with each other. mypy demands a type subscript on the generic Series type, but typeguard raises a TypeError saying Series is not subscriptable. This is also with typeguard 4.1.5 and Python 3.11.

I suspect this is a common use case. Note there is a similar question from someone else on StackOverflow.

@agronholm apologies if I am just missing something dumb... but how does one pass both mypy and typeguard validation with pd.Series in the code?

agronholm commented 8 months ago

The fundamental problem is that typeguard is a run-time type checker, so if a library hides the generic aspects of its classes, it's tough for a run-time type checker to do its work. There's probably a solution somewhere, but this project would really need somebody at the helm who could dedicate a lot of time to it, and that's unfortunately not me.
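As a rough illustration of that point, the sketch below checks whether a class can be subscripted at run time. `RuntimeGeneric`, `StubOnlyGeneric`, and `is_subscriptable` are hypothetical names. `StubOnlyGeneric` stands in for a class whose type parameters exist only in separate stub files, which is the situation this thread describes for pandas.Series.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class RuntimeGeneric(Generic[T]):
    """Generic in the source itself, so RuntimeGeneric[int] works at run time."""


class StubOnlyGeneric:
    """Hypothetical: declared generic only in a .pyi stub, plain at run time."""


def is_subscriptable(cls: type) -> bool:
    # Subscripting a class (cls[...]) relies on __class_getitem__ being defined.
    return hasattr(cls, "__class_getitem__")


results = {
    "RuntimeGeneric": is_subscriptable(RuntimeGeneric),
    "StubOnlyGeneric": is_subscriptable(StubOnlyGeneric),
    "list": is_subscriptable(list),
}
```

A static checker reads the stub and sees the type parameter, while a run-time checker only ever sees the plain class.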

sth commented 2 months ago

As a fairly practical workaround for problems like TypeError: type 'Series' is not subscriptable, I use this helper function:

from types import GenericAlias

@classmethod  # type: ignore[misc] # "classmethod" used with a non-method
def _class_getitem_dummy(cls: type[object], key: type | tuple[type, ...]) -> GenericAlias:
    return GenericAlias(cls, key)

def ensure_generic(Cls: type) -> None:
    if hasattr(Cls, "__class_getitem__"):
        return

    Cls.__class_getitem__ = _class_getitem_dummy  # type: ignore[attr-defined]

This can be called with any class to make it subscriptable:

ensure_generic(SomeType)
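A self-contained sketch of the effect, using a condensed copy of the helper with simplified annotations and a hypothetical `SomeType` class in place of a real library class:

```python
from types import GenericAlias


@classmethod
def _class_getitem_dummy(cls: type, key: object) -> GenericAlias:
    # Mirror what built-in generics do: wrap the class and its arguments.
    return GenericAlias(cls, key)


def ensure_generic(Cls: type) -> None:
    # Leave classes that are already subscriptable untouched.
    if hasattr(Cls, "__class_getitem__"):
        return
    Cls.__class_getitem__ = _class_getitem_dummy


class SomeType:
    """Hypothetical stand-in for a class that is not subscriptable."""


ensure_generic(SomeType)

alias = SomeType[float]  # no longer raises TypeError
origin = alias.__origin__
args = alias.__args__
```

In the scenario from this issue the call would presumably be `ensure_generic(pd.Series)`, made before the annotated package is imported. That placement is an assumption, not something the comment above specifies.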