pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
BSD 3-Clause "New" or "Revised" License

TYP: investigate/fix ignored mypy errors #37715

Closed simonjayhawkins closed 10 months ago

simonjayhawkins commented 3 years ago

In #37556, # type: ignore comments were added to silence mypy errors to be fixed 'later'. If an ignore is needed because of an error in the type checker itself, a comment referencing the relevant issue on the mypy GitHub issue tracker should be included.

Further investigation or PRs are welcome, either removing these ignores or adding comments with links to mypy issues where applicable. PRs should include 'xref #37715'.

PRs could address just one error, a small handful of related errors or a complete module.

grep -Ern "type: ?ignore" pandas/ currently gives 207 matches

```
pandas/tests/internals/ assert df._mgr.blocks[0].values is arr # type:ignore[union-attr]
pandas/tests/indexes/object/ (pd.IndexSlice["b":"y":-1], ""), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["b"::-1], "b"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice[:"b":-1], "yxdcb"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice[:"y":-1], "y"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["y"::-1], "yxdcb"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["y"::-4], "yb"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice[:"a":-1], "yxdcb"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice[:"a":-2], "ydb"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["z"::-1], "yxdcb"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["z"::-3], "yc"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["m"::-1], "dcb"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice[:"m":-1], "yx"), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["a":"a":-1], ""), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["z":"z":-1], ""), # type: ignore[misc]
pandas/tests/indexes/object/ (pd.IndexSlice["m":"m":-1], ""), # type: ignore[misc]
pandas/tests/frame/indexing/ + ["datetime64[ns, UTC]", "period[D]"], # type: ignore[list-item]
pandas/tests/io/xml/ filename, names="Col1, Col2, Col3", parser=parser # type: ignore[arg-type]
pandas/ params=[pd.Index, pd.Series], ids=["index", "series"] # type: ignore[list-item]
pandas/_testing/ args = [] # type: ignore[assignment]
pandas/_testing/ errno = getattr(err.reason, "errno", None) # type: ignore[attr-defined]
pandas/_testing/ idx = idx_func(nentries) # type: ignore[operator]
pandas/util/ return wrapper # type: ignore[return-value]
pandas/util/ func.__signature__ = sig # type: ignore[attr-defined]
pandas/util/ docstring._docstring_components # type: ignore[union-attr]
pandas/util/ decorated._docstring_components = ( # type: ignore[attr-defined]
pandas/core/ new_values, *new_axes # type: ignore[arg-type]
pandas/core/ return self.rename(**mapper) # type: ignore[return-value, arg-type]
pandas/core/ return self.rename(**mapper) # type: ignore[return-value, arg-type]
pandas/core/ cls.any = any # type: ignore[assignment]
pandas/core/ cls.all = all # type: ignore[assignment]
pandas/core/ NDFrame.mad.__doc__, # type: ignore[arg-type]
pandas/core/ cls.mad = mad # type: ignore[assignment]
pandas/core/ cls.sem = sem # type: ignore[assignment]
pandas/core/ cls.var = var # type: ignore[assignment]
pandas/core/ cls.std = std # type: ignore[assignment]
pandas/core/ cls.cummin = cummin # type: ignore[assignment]
pandas/core/ cls.cummax = cummax # type: ignore[assignment]
pandas/core/ cls.cumsum = cumsum # type: ignore[assignment]
pandas/core/ cls.cumprod = cumprod # type: ignore[assignment]
pandas/core/ cls.sum = sum # type: ignore[assignment]
pandas/core/ = prod # type: ignore[assignment]
pandas/core/ cls.mean = mean # type: ignore[assignment]
pandas/core/ cls.skew = skew # type: ignore[assignment]
pandas/core/ cls.kurt = kurt # type: ignore[assignment]
pandas/core/ cls.median = median # type: ignore[assignment]
pandas/core/ cls.max = max # type: ignore[assignment]
pandas/core/ cls.min = min # type: ignore[assignment]
pandas/core/ return self._inplace_method(other, type(self).__add__) # type: ignore[operator]
pandas/core/ return self._inplace_method(other, type(self).__sub__) # type: ignore[operator]
pandas/core/ return self._inplace_method(other, type(self).__mul__) # type: ignore[operator]
pandas/core/ other, type(self).__truediv__ # type: ignore[operator]
pandas/core/ other, type(self).__floordiv__ # type: ignore[operator]
pandas/core/ return self._inplace_method(other, type(self).__mod__) # type: ignore[operator]
pandas/core/ return self._inplace_method(other, type(self).__pow__) # type: ignore[operator]
pandas/core/ return self._inplace_method(other, type(self).__and__) # type: ignore[operator]
pandas/core/ return self._inplace_method(other, type(self).__or__) # type: ignore[operator]
pandas/core/ return self._inplace_method(other, type(self).__xor__) # type: ignore[operator]
pandas/core/dtypes/ @classmethod # type: ignore[misc]
pandas/core/dtypes/ return arr.astype("int64", copy=copy, casting="safe") # type: ignore[call-arg]
pandas/core/dtypes/ return arr.astype("uint64", copy=copy, casting="safe") # type: ignore[call-arg]
pandas/core/dtypes/ dtype, fill_value, type(fill_value) # type: ignore[arg-type]
pandas/core/dtypes/ unit, tz = unit.unit, # type: ignore[attr-defined]
pandas/core/arrays/ other = self._scalar_type(other) # type: ignore[call-arg]
pandas/core/arrays/ fill_value = self._scalar_type(fill_value) # type: ignore[call-arg]
pandas/core/arrays/ new_fill = self._scalar_type(fill_value) # type: ignore[call-arg]
pandas/core/arrays/ value = self._scalar_type(value) # type: ignore[call-arg]
pandas/core/arrays/ return self._resolution_obj.attrname # type: ignore[union-attr]
pandas/core/arrays/ uniques = uniques[::-1] # type: ignore[assignment]
pandas/core/arrays/ self._dtype = StringDtype() # type: ignore[assignment]
pandas/core/arrays/ @property # type: ignore[misc]
pandas/core/arrays/ def type(self) -> Type: # type: ignore[override]
pandas/core/arrays/ @property # type: ignore[misc]
pandas/core/arrays/ left = self._combined.view("complex128") # type:ignore[attr-defined]
pandas/core/arrays/ [t.numpy_dtype for t in dtypes], [] # type: ignore[union-attr]
pandas/core/internals/ def axes(self) -> List[Index]: # type: ignore[override]
pandas/core/internals/ if hasattr(arr, "tz") and is None: # type: ignore[union-attr]
pandas/core/internals/ arr = arr._data # type: ignore[union-attr]
pandas/core/internals/ arr = arr._data # type: ignore[union-attr]
pandas/core/ @Appender(base.IndexOpsMixin.array.__doc__) # type: ignore[misc]
pandas/core/ self._mgr = self._mgr.setitem( # type: ignore[assignment]
pandas/core/ new_func[k] = [v] # type:ignore[list-item]
pandas/core/ result = type(values)._simple_new( # type: ignore[attr-defined]
pandas/core/groupby/ return ci.set_categories(c.categories) # type: ignore[attr-defined]
pandas/core/groupby/ return ci.add_categories(new_cats) # type: ignore[attr-defined]
pandas/core/groupby/ subset = self.obj # type: ignore[attr-defined]
pandas/core/groupby/ groupby = self._groupby[key] # type: ignore[attr-defined]
pandas/core/groupby/ groupby = self._groupby # type: ignore[attr-defined]
pandas/core/groupby/ subset, groupby=groupby, parent=self, **kwargs # type: ignore[call-arg]
pandas/core/groupby/ self.grouper, _, self.obj = get_grouper( # type: ignore[type-var]
pandas/core/groupby/ return self.grouper.groups # type: ignore[union-attr]
pandas/core/indexes/ @staticmethod # type: ignore[misc]
pandas/core/indexes/ DatetimeLikeArrayMixin._hasnans.fget # type: ignore[attr-defined]
pandas/core/indexes/ @property # type:ignore[misc]
pandas/core/indexes/ @property # type:ignore[misc]
pandas/core/indexes/ @property # type:ignore[misc]
pandas/core/indexes/ ("ordered", self.ordered), # type: ignore[attr-defined]
pandas/core/indexes/ name = # type: ignore[union-attr, attr-defined]
pandas/core/indexes/ __setitem__ = __setslice__ = _disabled # type: ignore[assignment]
pandas/core/indexes/ __delitem__ = __delslice__ = _disabled # type: ignore[assignment]
pandas/core/indexes/ pop = append = extend = _disabled # type: ignore[assignment]
pandas/core/indexes/ remove = sort = insert = _disabled # type: ignore[assignment]
pandas/core/indexes/ self._names[lev] = name # type: ignore[has-type]
pandas/core/ return values # type: ignore[return-value]
pandas/core/reshape/ join_names = [] # type: ignore[var-annotated]
pandas/core/ops/ res_values = filler(res_values) # type: ignore[operator]
pandas/core/ mem = self.memory_usage(deep=True) # type: ignore[attr-defined]
pandas/core/ self.obj, ABCSeries # type: ignore[attr-defined]
pandas/core/ return self.obj # type: ignore[attr-defined]
pandas/core/ return self.obj[self._selection] # type: ignore[attr-defined]
pandas/core/ self.obj, ABCDataFrame # type: ignore[attr-defined]
pandas/core/ return self.obj.reindex( # type: ignore[attr-defined]
pandas/core/ if len(self.exclusions) > 0: # type: ignore[attr-defined]
pandas/core/ return self.obj.drop(self.exclusions, axis=1) # type: ignore[attr-defined]
pandas/core/ return self.obj # type: ignore[attr-defined]
pandas/core/ self.obj.columns.intersection(key) # type: ignore[attr-defined]
pandas/core/ set(key).difference(self.obj.columns) # type: ignore[attr-defined]
pandas/core/ if key not in self.obj.columns: # type: ignore[attr-defined]
pandas/core/ if key not in self.obj: # type: ignore[attr-defined]
pandas/core/ return self.array.to_numpy( # type: ignore[call-arg]
pandas/core/ self = cast("Categorical", self) # type: ignore[assignment]
pandas/core/ return # type: ignore[union-attr]
pandas/core/ values = self.astype(object)._values # type: ignore[attr-defined]
pandas/core/ return self.array.memory_usage(deep=deep) # type: ignore[attr-defined]
pandas/core/ return self[~duplicated] # type: ignore[index]
pandas/core/ itertuple = collections.namedtuple( # type: ignore[misc]
pandas/core/ def dot(self, other: Series) -> Series: # type: ignore[misc]
pandas/core/ from import ( # type: ignore[no-redef]
pandas/core/ from import ( # type: ignore[no-redef]
pandas/core/ writer = statawriter( # type: ignore[call-arg]
pandas/core/ unique_dtype.type, tuple(dtypes_set) # type: ignore[arg-type]
pandas/core/ arrays.append(col) # type:ignore[arg-type]
pandas/core/ def reset_index( # type: ignore[misc]
pandas/core/ def sort_values( # type: ignore[override]
pandas/core/ right._mgr, array_op # type: ignore[arg-type]
pandas/core/ self.grouper = None # type: ignore[assignment]
pandas/core/ self.loffset, # type: ignore[has-type]
pandas/core/ result.index = result.index + self.loffset # type: ignore[has-type]
pandas/core/ return self._downsample("std", ddof=ddof) # type: ignore[call-arg]
pandas/core/ return self._downsample("var", ddof=ddof) # type: ignore[call-arg]
pandas/core/ return self._downsample("quantile", q=q, **kwargs) # type: ignore[call-arg]
pandas/core/ super().__init__(None) # type: ignore[call-arg]
pandas/core/ and len(self.grouper.binlabels) > len(ax) # type: ignore[attr-defined]
pandas/core/ ip = get_ipython() # type: ignore[name-defined]
pandas/core/computation/ @property # type: ignore[misc]
pandas/core/computation/ supr_new = super(Term, klass).__new__ # type: ignore[misc]
pandas/core/computation/ operands = [op(env) for op in self.operands] # type: ignore[operator]
pandas/core/computation/ f"Invalid function call {}" # type: ignore[attr-defined]
pandas/core/computation/ "keyword error in function call " # type: ignore[attr-defined]
pandas/core/computation/ return _evaluate(op, op_str, a, b) # type: ignore[misc]
pandas/core/computation/ self.scope = self.scope.new_child( # type: ignore[assignment]
pandas/core/computation/ self.scope = self.scope.new_child( # type: ignore[assignment]
pandas/core/computation/ resolvers += tuple(local_dict.resolvers.maps) # type: ignore[has-type]
pandas/core/computation/ mapping[new_key] = new_value # type: ignore[index]
pandas/core/computation/ self.scope = self.scope.new_child(d) # type: ignore[assignment]
pandas/core/computation/ + self.resolvers.maps # type: ignore[operator]
pandas/core/computation/ + self.scope.maps # type: ignore[operator]
pandas/compat/ def __new__(cls) -> Series: # type: ignore[misc]
pandas/compat/ def __new__(cls) -> DataFrame: # type: ignore[misc]
pandas/compat/ Unpickler(pkl._Unpickler): # type: ignore[name-defined]
pandas/plotting/_matplotlib/ self.legend_handles = reversed( # type: ignore[assignment]
pandas/plotting/_matplotlib/ self.legend_labels = reversed( # type: ignore[assignment]
pandas/plotting/_matplotlib/ args = (x, y) # type: ignore[assignment]
pandas/plotting/_matplotlib/ plotf = self._plot # type: ignore[assignment]
pandas/plotting/_matplotlib/ blabels = None # type: ignore[assignment]
pandas/plotting/_matplotlib/ series.index = series.index.asfreq( # type: ignore[attr-defined]
pandas/plotting/_matplotlib/ weekdays = np.unique(index.dayofweek) # type: ignore[attr-defined]
pandas/plotting/ self.__init__() # type: ignore[misc]
pandas/io/formats/ gen2 = ( # type: ignore[assignment]
pandas/io/formats/ writer = ExcelWriter( # type: ignore[abstract]
pandas/io/formats/ return head, tail # type: ignore[return-value]
pandas/io/formats/ attrs.append(("dtype", f"'{obj.dtype}'")) # type: ignore[attr-defined]
pandas/io/formats/ attrs.append(("name", default_pprint( # type: ignore[attr-defined]
pandas/io/formats/ obj.names # type: ignore[attr-defined]
pandas/io/formats/ attrs.append(("names", default_pprint(obj.names))) # type: ignore[attr-defined]
pandas/io/formats/ return __IPYTHON__ or check_main() # type: ignore[name-defined]
pandas/io/formats/ ip = get_ipython() # type: ignore[name-defined]
pandas/io/formats/ class MyStyler(cls): # type:ignore[valid-type,misc]
pandas/io/formats/ handles.handle, # type: ignore[arg-type]
pandas/io/formats/ handles.handle.write(xml_doc) # type: ignore[arg-type]
pandas/io/formats/ float_format(value=v) # type: ignore[operator,call-arg]
pandas/io/ itemsize = dtype.itemsize # type: ignore[attr-defined]
pandas/io/ return dict(d1 + d2 + d3) # type: ignore[operator]
pandas/io/ blocks = list(mgr.blocks) # type: ignore[union-attr]
pandas/io/ blocks.extend(mgr.blocks) # type: ignore[union-attr]
pandas/io/ pickle.dumps(obj, protocol=protocol) # type: ignore[arg-type]
pandas/io/ obj, handles.handle, protocol=protocol # type: ignore[arg-type]
pandas/io/ return pickle.load(handles.handle) # type: ignore[arg-type]
pandas/io/excel/ for extension in cls.supported_extensions # type: ignore[attr-defined]
pandas/io/excel/ zf = zipfile.ZipFile(stream) # type: ignore[arg-type]
pandas/io/ self.TYPE_MAP = list(range(251)) + list("bhlfd") # type: ignore[arg-type]
pandas/io/ self.path_or_buf = BytesIO( # type: ignore[arg-type]
pandas/io/ to_write.encode(self._encoding) # type: ignore[arg-type]
pandas/io/ self.handles.handle.write(value) # type: ignore[arg-type]
pandas/io/ self.handles.handle.write(bio.getvalue()) # type: ignore[arg-type]
pandas/io/json/ ) # type:ignore[misc]
pandas/io/ ) # type: ignore[operator]
pandas/io/ fileobj=handle, # type: ignore[arg-type]
pandas/io/ handle, # type: ignore[arg-type]
pandas/io/ handle, # type: ignore[arg-type]
pandas/io/ _BytesZipFile(zipfile.ZipFile, BytesIO): # type: ignore[misc]
pandas/io/ super().__init__(file, mode, **kwargs_zip) # type: ignore[arg-type]
pandas/io/ wrapped = cast(mmap.mmap, _MMapWrapper(handle)) # type: ignore[arg-type]
pandas/io/parsers/ self.handles.handle = self.handles.handle.mmap # type: ignore[union-attr]
pandas/io/parsers/ return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
pandas/io/parsers/ = reader # type: ignore[assignment]
pandas/io/parsers/ size = self.chunksize # type: ignore[attr-defined]
```
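
To decide which ignores to tackle first, it can help to see how they break down by error code. The following is a minimal sketch, not part of the pandas tooling: it reuses the grep pattern above and tallies the bracketed codes. The helper names `count_ignore_codes` and `count_in_tree` are hypothetical, and because it counts codes in `.py` files rather than matching lines in all files, its totals may differ from the grep count.

```python
import re
from collections import Counter
from pathlib import Path

# Same pattern as the grep command, plus an optional capture of the error codes.
IGNORE_RE = re.compile(r"type: ?ignore(?:\[([^\]]*)\])?")


def count_ignore_codes(lines):
    """Tally error codes found in an iterable of source lines."""
    counts: Counter[str] = Counter()
    for line in lines:
        for match in IGNORE_RE.finditer(line):
            codes = match.group(1)
            if codes is None:
                # A bare "# type: ignore" with no error code.
                counts["<no code>"] += 1
                continue
            # One ignore may list several codes, e.g. [return-value, arg-type].
            for code in codes.split(","):
                counts[code.strip()] += 1
    return counts


def count_in_tree(root):
    """Tally ignores in every .py file under root (hypothetical helper)."""
    total: Counter[str] = Counter()
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8")
        total.update(count_ignore_codes(text.splitlines()))
    return total


# Three lines taken from the listing above.
sample = [
    "cls.any = any  # type: ignore[assignment]",
    "new_func[k] = [v]  # type:ignore[list-item]",
    "return self.rename(**mapper)  # type: ignore[return-value, arg-type]",
]
sample_counts = count_ignore_codes(sample)
print(sample_counts.most_common())
```

Running `count_in_tree("pandas")` from a checkout of the repository would give the per-code totals for the whole package.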
Praveenk8051 commented 3 years ago

Hello, I'm contributing to OSS for the first time. Can I try this one?

simonjayhawkins commented 3 years ago

@Praveenk8051 Some fixes will be easier than others. Anyone is welcome to take a look at addressing any of these mypy errors.

Praveenk8051 commented 3 years ago

Thank you. I will go through the development documentation and then jump into the issue

sidram05 commented 3 years ago

Hi @simonjayhawkins / @Praveenk8051, do you need more help? This is one of my first OSS projects as well, and I'm willing to help!

simonjayhawkins commented 3 years ago

@sidram05 sure.

Some fixes will be easier than others. Anyone is welcome to take a look at addressing any of these mypy errors.

Praveenk8051 commented 3 years ago

@sidram05 I'm not working on this. I'm working on a different issue.

Praveenk8051 commented 3 years ago

@sidram05 Are you working on this?

phofl commented 3 years ago

pandas/io/ or i - len(self.index_col) # type: ignore[operator]
pandas/io/ if i in self._col_indices # type: ignore[operator]

Would be fixed with #38334

jotasi commented 3 years ago

pandas/io/ was already adapted in #37639. That fixed / removed the following:

(Most (i.e. all [arg-type]s related to next(...)) are fixed by asserting that the Optional that next is called on is not None.)
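
The assert-based fix described above can be sketched as follows. This is an illustrative toy, not the pandas parser code: the `Reader` class and its `data` attribute are hypothetical stand-ins for an object that holds an Optional iterator.

```python
from typing import Iterator, Optional


class Reader:
    """Toy stand-in for a parser holding an optional iterator (hypothetical)."""

    def __init__(self, data: Optional[Iterator[str]] = None) -> None:
        self.data = data

    def next_line(self) -> str:
        # Without this assert, mypy rejects next(self.data) because the
        # argument may be None (the thread reports these as [arg-type]).
        # The assert narrows Optional[Iterator[str]] to Iterator[str].
        assert self.data is not None
        return next(self.data)


reader = Reader(iter(["a,b", "1,2"]))
first = reader.next_line()
print(first)
```

The assert also fails loudly at runtime if the attribute really is None, which a bare ignore comment would not do.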

In turn, it introduced:

I'll try to put together a PR fixing some of the remaining in there.

phofl commented 3 years ago

is now fixed

jreback commented 3 years ago

updated checkboxes for,,

deepakdinesh1123 commented 3 years ago

Hi, this is my first time contributing. I looked at the mypy errors being shown, and most of them said that stubs for the particular module do not exist. How do I proceed in contributing to this issue?

phofl commented 3 years ago

pandas/io/ for argname, default in _fwf_defaults.items(): # type: ignore[assignment]
pandas/io/ counts = defaultdict(int) # type: ignore[var-annotated]
pandas/io/ index = index.set_names(indexnamerow[:coffset]) # type: ignore[union-attr]
pandas/io/ col_name = self.index_names[i] # type: ignore[index]
pandas/io/ self.index_names # type: ignore[arg-type]
pandas/io/ usecols = None # type: ignore[assignment]
pandas/io/ columns = list(self.orig_names) # type: ignore[arg-type]
pandas/io/ and col not in self.orig_names # type: ignore[operator]
pandas/io/ col = self.orig_names[col] # type: ignore[index]
pandas/io/ and col not in self.orig_names # type: ignore[operator]
pandas/io/ col = self.orig_names[col] # type: ignore[index]
pandas/io/ unnamed_cols = set() # type: ignore[var-annotated]
pandas/io/ columns = [] # type: ignore[var-annotated]
pandas/io/ counts = defaultdict(int) # type: ignore[var-annotated]
pandas/io/ this_columns = [None] * lc # type: ignore[list-item]
pandas/io/ columns.append(this_columns) # type: ignore[arg-type]
pandas/io/ f"{self.prefix}{i}" # type: ignore[misc]
pandas/io/ columns = [list(range(ncols))] # type: ignore[arg-type]
pandas/io/ na_fvalues = set() # type: ignore[var-annotated]
pandas/io/ na_fvalues = { # type: ignore[assignment]
pandas/io/ index_names[i] = None # type: ignore[call-overload]
pandas/io/ result.append(v) # type: ignore[arg-type]
pandas/io/ result.append(int(x)) # type: ignore[arg-type]

These were solved (e.g. #39342) cc @simonjayhawkins
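
For readers wondering what a fix for the [var-annotated] entries in that list can look like: mypy raises this error when it cannot infer the element types of an empty container, and an explicit annotation usually resolves it. The sketch below illustrates the general technique only; it is not taken from #39342, and `count_names` is a hypothetical function.

```python
from collections import defaultdict
from typing import DefaultDict, List


def count_names(names: List[str]) -> DefaultDict[str, int]:
    # A bare "counts = defaultdict(int)" leaves the key type unknown, so mypy
    # asks for an annotation. Spelling out the types removes the need for
    # "# type: ignore[var-annotated]".
    counts: DefaultDict[str, int] = defaultdict(int)
    for name in names:
        counts[name] += 1
    return counts


name_counts = count_names(["a", "b", "a"])
print(dict(name_counts))
```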

Varun270 commented 3 years ago

Hello, this is my first open-source contribution. Can anyone tell me whether this issue is still open?

nickleus27 commented 2 years ago

I would like to contribute to this issue. Should I let people know what file and line number I am working on? Is there any that are easier than others? I will start reading contribution docs.

simonjayhawkins commented 2 years ago

Thanks @nickleus27. Adding an xref back to this issue when opening a PR should be enough to let people looking to contribute know the status.

simonjayhawkins commented 2 years ago

grep -Ern "type: ?ignore" pandas/ currently gives 207 matches

as of today, this is now 553 matches.

simonjayhawkins commented 2 years ago

Is there any that are easier than others?

You could perhaps start with grep -Ern "type: ?ignore\[assignment\]" pandas/. We have allow_redefinition = false in our mypy config, so fixing these is sometimes just a matter of using a different variable name.
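
To illustrate the renaming fix, here is a minimal sketch with hypothetical function names. With allow_redefinition disabled, mypy fixes a variable's type at its first assignment, so rebinding the name to a value of another type needs an ignore; a second name avoids the error entirely.

```python
def show_column_count_ignored(header: str) -> None:
    columns = header.split(",")  # mypy infers a list of str here
    # Rebinding "columns" to an int conflicts with the inferred type,
    # which is why the ignore comment is needed.
    columns = len(columns)  # type: ignore[assignment]
    print(columns)


def column_count(header: str) -> int:
    columns = header.split(",")
    # A new name for the new type: no ignore comment required.
    n_columns = len(columns)
    return n_columns


show_column_count_ignored("a,b,c")
print(column_count("a,b,c"))
```

Both functions behave the same at runtime; only the second passes mypy without a suppression.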

nickleus27 commented 2 years ago

@simonjayhawkins great thank you

phofl commented 2 years ago

You can simply add an xref when opening your PR.

seanjedi commented 1 year ago

Is this issue resolved? Can I also work on this issue?

Snehaaa18 commented 1 year ago

Is this issue open? I would like to work on it

ggold7046 commented 1 year ago

Hi, I want to work on this issue. Can anybody guide me on what to do to fix the mypy errors?

MarcoGorelli commented 1 year ago

Shall we close this? The most obvious ones have been done already, it's generally not clear to people how to get started, and fixing the remaining type ignores requires a fair bit of knowledge.

mroeschke commented 10 months ago

Agreed. Closing