Dr-Irv opened this issue 3 years ago
For type-checking tests: if we add the return type `-> None`, mypy will type-check them. I think all that would remain is to type-hint pytest fixtures and parameters. Also, adding `# type: ignore` is an additional test; our CI will fail if a type-ignore is not necessary.
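A minimal sketch (not taken from the pandas test suite) of what such a typed test could look like; mypy's `warn_unused_ignores` setting is the mechanism that makes CI fail when an ignore is no longer needed:

```python
import pandas as pd


def test_frame_shape() -> None:
    # The "-> None" return annotation is what makes mypy type-check the body.
    df = pd.DataFrame({"a": [1, 2, 3]})
    assert df.shape == (3, 1)


def test_deliberate_type_error() -> None:
    df = pd.DataFrame({"a": [1, 2, 3]})
    # Intentionally wrong annotation: df.shape[0] is an int, not a str.
    # With warn_unused_ignores enabled, CI fails if this ignore ever
    # becomes unnecessary.
    n_rows: str = df.shape[0]  # type: ignore
    assert n_rows == 3
```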
pyright is a bit daft IMO. It complains about things like `self.some_int = int(val)`, which can only be an int.
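Assuming the complaint here is that the attribute's type is not explicitly declared (an assumption about what pyright flags in this situation), the usual way to quiet it is an explicit annotation, sketched below:

```python
class Example:
    # Declaring the attribute explicitly satisfies the checker, even though
    # the assignment in __init__ can only ever produce an int.
    some_int: int

    def __init__(self, val: float) -> None:
        self.some_int = int(val)
```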
3. Most likely, the best way to test if we have all the overloads correct is by fully typing our `tests` code, and adding `# type: ignore` comments when we are specifically testing for incorrect types.
see also #40202 for a POC of a more explicit and comprehensive way of testing overloads
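As a rough sketch of how an overload test could look (this is not the approach from #40202, just an illustration using `typing_extensions.assert_type`, and it assumes type declarations that include the `fillna` overloads shown later in this issue):

```python
import pandas as pd
from typing_extensions import assert_type


def test_fillna_overload_resolution() -> None:
    df = pd.DataFrame({"a": [1.0, None]})

    # With inplace omitted, the checker should pick the overload
    # that returns a DataFrame...
    assert_type(df.fillna(0.0), pd.DataFrame)

    # ...and with inplace=True, the overload that returns None.
    assert_type(df.fillna(0.0, inplace=True), None)
```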
@Dr-Irv IIUC we're doing this now. is this issue still active?
I created this issue as a reference so that we could identify which parts of the pandas source are missing type declarations.
So it is still valid, unless we feel that all of the pandas source now has type declarations (which I don't think is true).
I did edit the description to refer to pandas-stubs.
This describes a procedure for using the command line tool pyright (https://github.com/microsoft/pyright/blob/master/docs/command-line.md) to identify places in the pandas code that are missing type declarations. xref #28142

1. Create an empty file `py.typed` in the same folder as `pandas\__init__.py`.
2. `cd` to the folder containing `README.md` from pandas, and type `pyright --verifytypes pandas! > pyright.out`.
3. Alternatively, use the `verifytypes.py` script below, which can be run from the command line as `python verifytypes.py` and will print the top 20 modules that need fixing.

Open issues for adding types:

- Using `pyright` to determine where things are missing will not determine if we are missing appropriate overloads. See example below.
- Most likely, the best way to test if we have all the overloads correct is by fully typing our `tests` code, and adding `# type: ignore` comments when we are specifically testing for incorrect types.

**verifytypes.py utility**
```python
import subprocess
import json

import pandas as pd


def getpyrightout() -> bytes:
    # Run pyright's type-completeness report and capture its JSON output.
    try:
        pyrightout = subprocess.run(
            ["pyright", "--outputjson", "--verifytypes", "pandas!"],
            capture_output=True,
            shell=True,
        )
    except Exception as e:
        raise e
    return pyrightout.stdout


def processjson(jsonstr: bytes) -> None:
    d = json.loads(jsonstr)
    # Each diagnostic message looks like: Type of "pandas.<module>.<symbol>" is ...
    msgsSeries = pd.Series([k["message"] for k in d["diagnostics"]])
    msgsdf = msgsSeries.str.split('"', n=2, expand=True)
    msgsdf.columns = ["primary", "element", "extra"]
    typemsgs = msgsdf[msgsdf.primary.str.startswith("Type")].copy()
    # Strip the class/attribute part so diagnostics can be grouped by module.
    typemsgs["module"] = typemsgs["element"].str.replace(
        r"\.[A-Z][a-z_A-Z\.]*$", "", regex=True
    )
    notest = typemsgs[~typemsgs.module.str.startswith("pandas.tests")]
    print(
        notest.groupby(["module", "primary"])
        .size()
        .sort_values(ascending=False)
        .head(20)
    )


if __name__ == "__main__":
    processjson(getpyrightout())
```

**Example using DataFrame.fillna() where overloads are needed**
This is taken from https://github.com/microsoft/python-type-stubs/blob/main/pandas/core/frame.pyi:

```python
@overload
def fillna(
    self,
    value: Optional[Union[Scalar, Dict, Series, DataFrame]] = ...,
    method: Optional[Literal["backfill", "bfill", "ffill", "pad"]] = ...,
    axis: Optional[AxisType] = ...,
    limit: int = ...,
    downcast: Optional[Dict] = ...,
    *,
    inplace: Literal[True]
) -> None: ...
@overload
def fillna(
    self,
    value: Optional[Union[Scalar, Dict, Series, DataFrame]] = ...,
    method: Optional[Literal["backfill", "bfill", "ffill", "pad"]] = ...,
    axis: Optional[AxisType] = ...,
    limit: int = ...,
    downcast: Optional[Dict] = ...,
    *,
    inplace: Literal[False] = ...
) -> DataFrame: ...
@overload
def fillna(
    self,
    value: Optional[Union[Scalar, Dict, Series, DataFrame]] = ...,
    method: Optional[Union[_str, Literal["backfill", "bfill", "ffill", "pad"]]] = ...,
    axis: Optional[AxisType] = ...,
    *,
    limit: int = ...,
    downcast: Optional[Dict] = ...,
) -> Union[None, DataFrame]: ...
@overload
def fillna(
    self,
    value: Optional[Union[Scalar, Dict, Series, DataFrame]] = ...,
    method: Optional[Union[_str, Literal["backfill", "bfill", "ffill", "pad"]]] = ...,
    axis: Optional[AxisType] = ...,
    inplace: Optional[_bool] = ...,
    limit: int = ...,
    downcast: Optional[Dict] = ...,
) -> Union[None, DataFrame]: ...
```
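To make the effect of these overloads concrete (an added illustration, not part of the stub), a checker using them resolves the return type from the value of `inplace`:

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0]})

# inplace omitted (the Literal[False] default): the second overload applies,
# so the result is typed as DataFrame.
filled = df.fillna(0.0)
print(filled["a"].tolist())

# inplace=True: the first overload applies, so the result is typed as None,
# and a checker will flag any attempt to use it as a DataFrame.
result = df.fillna(0.0, inplace=True)
print(result)  # None
```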