pandas-dev / pandas-stubs

Public type stubs for pandas
BSD 3-Clause "New" or "Revised" License

BUG: wrong type hinting - DataFrame apply method parameter func in result_type='expand' mode #486

Closed ofiryaish closed 1 year ago

ofiryaish commented 1 year ago

Pandas version checks

Reproducible Example

>>> import pandas as pd
>>> df = pd.DataFrame({"x": [1, 2, 3]})
>>> df.apply(lambda x: [1, 2], axis=1, result_type='expand')
   0  1
0  1  2
1  1  2
2  1  2

Issue Description

Hello, I noticed that the type hint for apply is wrong in result_type='expand' mode. In this mode, func may return list-like results, but the type hint only allows a scalar return value. This is annoying because Pylance flags correct usage as an error.

Expected Behavior

This is not a real bug, just something annoying that should be fixed.

Installed Versions

1.5.2

twoertwein commented 1 year ago

Moved to pandas-stubs: pandas has no return annotation for apply https://github.com/pandas-dev/pandas/blob/797f23efcf9c5eeeed06749493aa0a5c5baf5514/pandas/core/frame.py#L9169

ofiryaish commented 1 year ago

@twoertwein

Yes, I saw it, but I am talking about the parameter func, which should have a return value of Scalar in result_type='expand' mode.

[Image: Pylance screenshot]

gandhis1 commented 1 year ago

DataFrame.apply() has numerous overloads; in your screenshot, Pylance appears to be showing only a subset of them. It's not clear to me that there is an issue with pandas-stubs here. Perhaps this is an issue with Pylance and how it displays functions with many overloads.

That said, I still cannot verify what you are seeing. This is what I see in VS Code:

(method)
apply(f: (...) -> (ListLikeExceptSeriesAndStr@apply | Series[Unknown]), axis: AxisTypeIndex = ..., raw: _bool = ..., result_type: None = ..., args: ... = ..., **kwargs: Unknown) -> DataFrame

apply(f: (...) -> S1@apply, axis: AxisTypeIndex = ..., raw: _bool = ..., result_type: None = ..., args: ... = ..., **kwargs: Unknown) -> Series[S1@apply]

apply(f: (...) -> Mapping[Unknown, Unknown], axis: AxisTypeIndex = ..., raw: _bool = ..., result_type: None = ..., args: ... = ..., **kwargs: Unknown) -> Series[Unknown]

apply(f: (...) -> S1@apply, axis: AxisType = ..., raw: _bool = ..., args: ... = ..., *, result_type: Literal['expand', 'reduce'], **kwargs: Unknown) -> Series[S1@apply]

apply(f: (...) -> (ListLikeExceptSeriesAndStr@apply | Series[Unknown] | Mapping[Unknown, Unknown]), axis: AxisType = ..., raw: _bool = ..., args: ... = ..., *, result_type: Literal['expand'], **kwargs: Unknown) -> DataFrame

apply(f: (...) -> (ListLikeExceptSeriesAndStr@apply | Mapping[Unknown, Unknown]), axis: AxisType = ..., raw: _bool = ..., args: ... = ..., *, result_type: Literal['reduce'], **kwargs: Unknown) -> Series[Unknown]

apply(f: (...) -> (Scalar | ListLikeExceptSeriesAndStr@apply | Series[Unknown] | Mapping[Unknown, Unknown]), axis: AxisType = ..., raw: _bool = ..., args: ... = ..., *, result_type: Literal['broadcast'], **kwargs: Unknown) -> DataFrame

apply(f: (...) -> Series[Unknown], axis: AxisTypeIndex = ..., raw: _bool = ..., args: ... = ..., *, result_type: Literal['reduce'], **kwargs: Unknown) -> Series[Unknown]

apply(f: (...) -> S1@apply, raw: _bool = ..., result_type: None = ..., args: ... = ..., *, axis: AxisTypeColumn, **kwargs: Unknown) -> Series[S1@apply]

apply(f: (...) -> (ListLikeExceptSeriesAndStr@apply | Mapping[Unknown, Unknown]), raw: _bool = ..., result_type: None = ..., args: ... = ..., *, axis: AxisTypeColumn, **kwargs: Unknown) -> Series[Unknown]

apply(f: (...) -> Series[Unknown], raw: _bool = ..., result_type: None = ..., args: ... = ..., *, axis: AxisTypeColumn, **kwargs: Unknown) -> DataFrame

apply(f: (...) -> Series[Unknown], raw: _bool = ..., args: ... = ..., *, axis: AxisTypeColumn, result_type: Literal['reduce'], **kwargs: Unknown) -> DataFrame

It's rather incomprehensible, but nothing here appears to be wrong, per se.
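To make the listing above a little easier to read, here is a heavily simplified sketch of how overloads keyed on a Literal result_type let a type checker pick a different return type per mode. The class MiniFrame and both of its overloads are hypothetical and only illustrate the mechanism; they are not the actual pandas-stubs definitions.

```python
from typing import Any, Callable, Literal, overload


class MiniFrame:
    """Hypothetical stand-in for DataFrame; not the real pandas-stubs code."""

    def __init__(self, rows: list[Any]) -> None:
        self.rows = rows

    # Overload 1: with result_type="expand", f is expected to return a list,
    # and the result is annotated as a list of rows.
    @overload
    def apply(
        self, f: Callable[[Any], list[Any]], *, result_type: Literal["expand"]
    ) -> list[list[Any]]: ...

    # Overload 2: with result_type left as None, f may return anything,
    # and the result is annotated as a flat list.
    @overload
    def apply(
        self, f: Callable[[Any], Any], *, result_type: None = ...
    ) -> list[Any]: ...

    # The single runtime implementation; the overloads above exist only
    # for the type checker.
    def apply(self, f, *, result_type=None):
        results = [f(row) for row in self.rows]
        if result_type == "expand":
            return [list(r) for r in results]
        return results


frame = MiniFrame([1, 2, 3])
expanded = frame.apply(lambda x: [x, x * 2], result_type="expand")
plain = frame.apply(lambda x: x + 1)
```

A checker matches the call against each overload in order, so an editor that displays only some of the overloads can make a valid call look like a mismatch.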

> Yes, I saw it, but I am talking about the parameter func, which should have a return value of Scalar in result_type='expand' mode.

I'm not following here. If you are using a result_type of "expand" then you are likely returning a list-like (or a dict), not a scalar.
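As a quick runtime illustration of that point, the sketch below applies a list-returning and a dict-returning function with result_type='expand'. The DataFrame contents and the variable names are made up for the example.

```python
import pandas as pd

# Small illustrative frame; the data is arbitrary.
df = pd.DataFrame({"x": [1, 2, 3]})

# A list-like return value is expanded into positional columns.
from_lists = df.apply(
    lambda row: [row["x"], row["x"] * 2], axis=1, result_type="expand"
)

# A dict return value is expanded into columns named after its keys.
from_dicts = df.apply(
    lambda row: {"a": row["x"], "b": row["x"] * 2}, axis=1, result_type="expand"
)

print(from_lists)
print(from_dicts)
```

In both cases the result is a DataFrame with one row per input row.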

ofiryaish commented 1 year ago

I'm using result_type='expand' and my func returns a list-like, not a scalar. However, Pylance reports a problem because it expects func to be a function that returns Scalar.

It might indeed be a Pylance bug.

Again, at runtime, I don't get any errors.

Dr-Irv commented 1 year ago

@ofiryaish This might be a version problem. Can you do the following:

  1. Indicate the version of pylance you are using within VS Code.
  2. Create a new python program, and do the following:
    from pandas._version import _stub_version
    reveal_type(_stub_version)

    In the "Problems" tab, you will see the value of _stub_version

I should note that I ran the following and got no complaints in VS Code with Pylance v2022.12.21, which bundles stubs version 1.5.2.221124:

import pandas as pd
df = pd.DataFrame({"x":[1,2,3]})
result = df.apply(lambda x: [1, 2], axis=1, result_type='expand')

Note that Pylance bundles pandas-stubs, but a given Pylance release may not include the latest version of pandas-stubs.
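A complementary check, sketched below with only the standard library, is to see which versions of pandas and pandas-stubs are installed in the active Python environment. The helper name installed_version is made up for this example. It reports only what is installed in the environment, so a copy of the stubs bundled with Pylance would not show up here.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


for name in ("pandas", "pandas-stubs"):
    found = installed_version(name)
    print(f"{name}: {found or 'not installed in this environment'}")
```

If pandas-stubs is reported as not installed, the editor is presumably relying on whatever stubs it ships with, which is why the reveal_type check above is still needed to see that version.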

Dr-Irv commented 1 year ago

@ofiryaish Please respond to the request above, or we will close this issue as "cannot reproduce". Deadline of 1/18/2023

ofiryaish commented 1 year ago

from pandas._version import _stub_version
reveal_type(_stub_version)

Sorry for the late response. I did not notice your message.

I do not get the warning anymore. Maybe I updated the version of pylance and it fixed it.

Thank You