Open warwickmm opened 3 months ago
I would like to work on this
Thanks for the report - it seems to me comparing None to e.g. integers should raise. My guess is that x > y succeeding is a result of assuming None is an NA value and hence behaves like np.nan (always false for comparisons). Further investigations are welcome!
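For reference, a minimal plain-Python sketch of the two scalar behaviors being contrasted here. It assumes nothing about pandas internals; `float("nan")` stands in for np.nan, and the helper name `none_vs_int` is purely illustrative. Note that `!=` is the one comparison for which NaN yields True.

```python
import operator

nan = float("nan")  # same IEEE 754 NaN value that np.nan holds

comparisons = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}

# NaN compared to an integer: every operator returns False except "!="
nan_vs_int = {symbol: op(nan, 1) for symbol, op in comparisons.items()}
print(nan_vs_int)


def none_vs_int(op):
    """Compare None to an integer, reporting a TypeError instead of raising."""
    try:
        return op(None, 1)
    except TypeError:
        return "TypeError"


# None compared to an integer: ordering operators raise, equality does not
none_results = {symbol: none_vs_int(op) for symbol, op in comparisons.items()}
print(none_results)
```

So treating None as np.nan turns an ordering comparison that plain Python rejects into one that quietly returns False.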
take
@rhshadrach - Any ideas for a fix? Should we raise an error when "<" is used between Series that contain None?
That seems like the correct behavior to me - yes.
Should DataFrame.gt raise an error as well?
Also, should one expect the behavior to be consistent across all values for which pd.isna returns True (e.g., None, np.nan, pd.NA, etc.)? Or does one need to be cognizant of how missing values are represented in each instance?
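To make the question concrete, here is a small sketch of how the same missing entry can be stored depending on dtype. The sample values and the variable names (`inferred`, `as_object`, `as_nullable`) are assumptions chosen for illustration; only pd.isna, pd.NA and the dtypes shown are existing pandas features.

```python
import numpy as np
import pandas as pd

# The same input list, stored three ways
inferred = pd.Series([1, None])                    # dtype inferred as float64
as_object = pd.Series([1, None], dtype=object)     # keeps the Python None
as_nullable = pd.Series([1, None], dtype="Int64")  # nullable integer dtype

for label, series in [
    ("inferred", inferred),
    ("object", as_object),
    ("Int64", as_nullable),
]:
    missing = series.iloc[1]
    # pd.isna reports True in every case, but the stored marker differs
    print(label, series.dtype, repr(missing), pd.isna(missing))
```

pd.isna agrees across all three, while the value actually held in the Series is NaN, None, or pd.NA respectively. That difference in representation is what the question is asking about.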
My above comments are only regarding Python's None when stored in an object-dtype column or Series.
Thanks. I'll just note that the below also currently runs without error. Not sure if that's a situation that needs to be considered as well.
>>> import pandas as pd
>>> x = pd.Series([None], dtype=object)
>>> x.gt(0)
0    False
dtype: bool
Hi @warwickmm! Are you working on this? If not, I would like to take this up.
I am not.
take
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
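A minimal sketch of the kind of comparison described under Issue Description below. This is not the reporter's original snippet: the values in `x` and `y` and the helper `probe` are assumptions made for illustration. The outcome of the first three cases may vary with the installed pandas version, so the sketch records whether each call succeeds or raises rather than assuming a result.

```python
import pandas as pd


def probe(label, func):
    """Call func and record whether it succeeded or which exception it raised."""
    try:
        func()
        return label, "ok"
    except Exception as exc:  # broad on purpose: this is only a probe
        return label, type(exc).__name__


# Assumed data: an object-dtype Series containing None, compared to another Series
x = pd.Series([None, 1], dtype=object)
y = pd.Series([0, 0], dtype=object)

outcomes = [
    probe("x > y", lambda: x > y),
    probe("x.gt(y)", lambda: x.gt(y)),
    probe("DataFrame.gt", lambda: x.to_frame().gt(y.to_frame())),
    probe("float dtype", lambda: x.astype(float).gt(y.astype(float))),
]

for label, status in outcomes:
    print(f"{label}: {status}")
```

Running this on different pandas versions gives a quick way to see whether the operator, the Series method, and the DataFrame method agree.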
Issue Description
When a Series has dtype=object, comparison methods (e.g., .gt) can raise a TypeError: '>' not supported error. No error is encountered when using the > operator, or when calling DataFrame.gt, or when the Series has dtype=float.
Expected Behavior
When the Series has dtype=object, the behavior of Series.gt should be consistent with the > operator and with the DataFrame.gt method.
Installed Versions