pandas-dev / pandas-stubs

Public type stubs for pandas
BSD 3-Clause "New" or "Revised" License

result of itertuples gives an error when passed to NamedTuple - incompatible type "_PandasNamedTuple"; expected "NamedTuple" #881

Open roj516 opened 6 months ago

roj516 commented 6 months ago

Describe the bug

Code that passes the output of itertuples to functions expecting NamedTuple now fails in mypy with: incompatible type "_PandasNamedTuple"; expected "NamedTuple". (Possibly related to the fix for issue #834.)

To Reproduce

% python3.12 -m venv venv_pandas_stubs
% source venv_pandas_stubs/bin/activate
% pip install --upgrade pip
% pip install mypy pandas pandas-stubs

Collecting mypy
  Downloading mypy-1.8.0-cp312-cp312-macosx_11_0_arm64.whl.metadata (1.8 kB)
Collecting pandas
  Downloading pandas-2.2.1-cp312-cp312-macosx_11_0_arm64.whl.metadata (19 kB)
Collecting pandas-stubs
  Downloading pandas_stubs-2.2.0.240218-py3-none-any.whl.metadata (9.5 kB)

% cat foo.py 

import pandas as pd
from typing import NamedTuple


def process_tuple(t: NamedTuple) -> None:
  pass


def process_dataframe(df: pd.DataFrame) -> None:
  for t in df.itertuples():
    process_tuple(t)

% mypy foo.py                         
foo.py:11: error: Argument 1 to "process_tuple" has incompatible type "_PandasNamedTuple"; expected "NamedTuple"  [arg-type]

% pip install pandas-stubs==2.0.3.230814
% mypy foo.py                           
Success: no issues found in 1 source file

Additional context

mypy has special casing for NamedTuple: https://mypy.readthedocs.io/en/latest/kinds_of_types.html#named-tuples
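For readers unfamiliar with that special casing, here is a minimal sketch modelled on the example in the linked mypy docs. The names Point and describe are made up for illustration. Annotating a parameter with the raw NamedTuple lets mypy accept any named tuple instance while still rejecting plain tuples.

```python
from typing import Any, NamedTuple


class Point(NamedTuple):
    """Hypothetical named tuple used only for illustration."""
    x: int
    y: int


def describe(t: NamedTuple) -> dict[str, Any]:
    # With the raw NamedTuple annotation, mypy allows named-tuple
    # methods such as _asdict() on the argument.
    return t._asdict()


as_dict = describe(Point(1, 2))   # accepted by mypy
# describe((1, 2))                # rejected by mypy: plain tuple, not a NamedTuple
```

This is the behaviour the code in this report relied on before the stubs changed the return type of itertuples.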

Dr-Irv commented 6 months ago

So I'm not sure we'll be able to address this, for two reasons:
1) mypy's documentation for the raw NamedTuple “pseudo-class” feature carries the note: "Note that this behavior is highly experimental, non-standard, and may not be supported by other type checkers and IDEs."
2) The Python typing system doesn't really support having a subclass of NamedTuple with undefined fields, which is what we would need to return in this case.

From a static typing perspective, we know we are returning a kind of "generic" named tuple, but we don't know what the fields are, since it has to work for any DataFrame.
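One possible workaround on the caller's side (a sketch, not something proposed in this thread) is to declare the expected fields in a concrete NamedTuple and convert each row before passing it on. Row, to_rows and process_row are hypothetical names, and the columns a and b are assumed. To keep the example runnable without pandas, the rows come from a collections.namedtuple stand-in that mimics what itertuples yields at runtime. In real code the input would be df.itertuples().

```python
from collections import namedtuple
from typing import Any, Iterable, NamedTuple


class Row(NamedTuple):
    """Hypothetical row type; its fields must match the DataFrame's columns."""
    Index: int
    a: int
    b: str


def process_row(row: Row) -> str:
    # Field access is now statically typed: row.a is int, row.b is str.
    return f"{row.Index}: {row.a} {row.b}"


def to_rows(raw_rows: Iterable[Iterable[Any]]) -> list[Row]:
    # Row._make accepts any iterable, so this does not depend on how the
    # stubs annotate the tuples coming out of itertuples().
    return [Row._make(raw) for raw in raw_rows]


# Stand-in for df.itertuples() (assumption: columns "a" and "b").
Pandas = namedtuple("Pandas", ["Index", "a", "b"])
raw = [Pandas(0, 1, "x"), Pandas(1, 2, "y")]

rows = to_rows(raw)
results = [process_row(r) for r in rows]
```

The trade-off is that the field list is written out by hand, so it has to be kept in sync with the DataFrame's columns. A mismatch in the number of fields only shows up at runtime, when Row._make raises a TypeError.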