pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

BUG: invalid type annotation for equal operator of Series #40762

Open ikokostya opened 3 years ago

ikokostya commented 3 years ago

Code Sample, a copy-pastable example

import pandas as pd

df = pd.DataFrame(
    {'kind' : ['a', 'b', 'a'], 
     'value' : [1, 2, 3]})

res: pd.Series = df['kind'] == 'a'
print(type(res))

Problem description

mypy returns the following error:

$ mypy test.py 
test.py:7: error: Incompatible types in assignment (expression has type "bool", variable has type "Series")
    res: pd.Series = df['kind'] == 'a'
                     ^
Found 1 error in 1 file (checked 1 source file)
$ mypy --version
mypy 0.812

But actual type of the result at runtime is Series:

$ python test.py 
<class 'pandas.core.series.Series'>
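One plausible reading of the mismatch (an assumption, not something confirmed in this report) is that the comparison method is attached to the class dynamically, so a static checker never sees it and falls back to `object.__eq__`, which is declared to return `bool`. The sketch below reproduces that shape with a hypothetical `ToySeries` class and a hypothetical `_make_comparison` factory; neither is pandas code.

```python
import operator
from typing import Any, Callable, List


class ToySeries:
    """Hypothetical stand-in for a Series; not pandas code."""

    def __init__(self, values: List[Any]) -> None:
        self.values = values


def _make_comparison(op: Callable[[Any, Any], bool]):
    # Unannotated factory: a static checker learns nothing from it.
    def wrapper(self, other):
        return ToySeries([op(v, other) for v in self.values])

    return wrapper


# Attached after class creation, so the class body never declares __eq__.
# A static checker is then left with object.__eq__, declared to return bool.
setattr(ToySeries, "__eq__", _make_comparison(operator.eq))

kinds = ToySeries(["a", "b", "a"])
res = kinds == "a"

# At runtime the result is the element-wise object, not a bool.
print(type(res).__name__)  # prints "ToySeries"
```

Here the runtime type and the statically visible type disagree in the same way as in the report: the program prints the class name, while a checker that only reads declarations would treat `res` as `bool`.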

Expected Output

No type-checking errors from mypy.

Output of pd.show_versions()

```
>>> pd.show_versions()
INSTALLED VERSIONS
------------------
commit           : f2c8480af2f25efdbd803218b9d87980f416563e
python           : 3.8.5.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.4.0-70-generic
Version          : #78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8
pandas           : 1.2.3
numpy            : 1.20.2
pytz             : 2021.1
dateutil         : 2.8.1
pip              : 20.0.2
setuptools       : 44.0.0
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : None
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : None
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pyxlsb           : None
s3fs             : None
scipy            : None
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : None
```

jreback commented 3 years ago

you almost certainly need to run mypy against master; we don't have many type annotations in 1.2.x

ikokostya commented 3 years ago

I have the same problem with master branch.

mzeitlin11 commented 3 years ago

Thanks for looking into this @ikokostya, investigation and a fix are welcome! (If I run this on master, I see an inferred type of Any, so no error, but regardless, typing Series.__eq__ to return a Series would be great.)
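As a rough illustration of what "typing `__eq__` to return a Series" could look like, here is a minimal sketch on a hypothetical `TypedSeries` class. It is not pandas code and makes no claim about how pandas defines the method. Because `object.__eq__` is declared to return `bool`, mypy treats a different return type as an incompatible override, so the sketch assumes the override is acknowledged with an ignore comment.

```python
from typing import Any, List


class TypedSeries:
    """Hypothetical stand-in for a Series; not pandas code."""

    def __init__(self, values: List[Any]) -> None:
        self.values = values

    # Declared in the class body with an explicit return type, so a checker
    # can read it. The ignore is needed because object.__eq__ returns bool.
    def __eq__(self, other: object) -> "TypedSeries":  # type: ignore[override]
        return TypedSeries([v == other for v in self.values])


# With the annotation in place, this assignment matches the declared type.
mask: TypedSeries = TypedSeries(["a", "b", "a"]) == "a"
print(mask.values)  # prints [True, False, True]
```

The trade-off in this design is that the class deliberately departs from the `object.__eq__` contract, which is why the ignore comment is scoped to the `override` error code only.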

jbrockmendel commented 2 years ago

not sure how well mypy does with dynamically generated methods, but the place these methods are defined is in core.ops.__init__, so I think the fix is to annotate flex_method_SERIES's flex_wrapper
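A sketch of the general idea of annotating a wrapper produced by a factory follows. The names `make_flex_method` and `ToySeries` are hypothetical, and the body is not the actual `flex_method_SERIES` implementation; `flex_wrapper` is borrowed from the comment above only as a label. How well a given checker follows the annotation through to the generated method is something to verify rather than assume.

```python
import operator
from typing import Any, Callable, List


def make_flex_method(
    op: Callable[[Any, Any], Any],
) -> Callable[["ToySeries", Any], "ToySeries"]:
    """Hypothetical factory; only the annotations are the point here."""

    # Annotating the inner wrapper records the element-wise return type.
    def flex_wrapper(self: "ToySeries", other: Any) -> "ToySeries":
        return ToySeries([op(v, other) for v in self.values])

    return flex_wrapper


class ToySeries:
    """Hypothetical stand-in for a Series; not pandas code."""

    def __init__(self, values: List[Any]) -> None:
        self.values = values

    # Assigned in the class body so the attributes are statically visible,
    # unlike methods attached later with setattr.
    eq = make_flex_method(operator.eq)
    add = make_flex_method(operator.add)


s = ToySeries([1, 2, 3])
print(s.add(10).values)  # prints [11, 12, 13]
print(s.eq(2).values)    # prints [False, True, False]
```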