pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

BUG: .mode(dropna=False) doesn't work with nullable integers #58926

Open theemathas opened 4 months ago

theemathas commented 4 months ago

Reproducible Example

import pandas as pd
series = pd.Series([1, 1, 2, 3]).astype('Int64')
print(series.mode(dropna=False))

Issue Description

This code causes the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/REDACTED/lib/python3.12/site-packages/pandas/core/series.py", line 2333, in mode
    res_values = values._mode(dropna=dropna)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/REDACTED/lib/python3.12/site-packages/pandas/core/arrays/masked.py", line 1112, in _mode
    result, res_mask = mode(self._data, dropna=dropna, mask=self._mask)
    ^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 2)

Expected Behavior

The code should print the mode of the input series, which is 1, instead of raising a ValueError.
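
For reference, the expected result can be approximated without calling .mode(dropna=False). This is a minimal sketch of a possible workaround, not the library's implementation: it assumes that value_counts(dropna=False) on a nullable integer series is unaffected by this bug, and mode_keep_na is a hypothetical helper name introduced only for illustration.

```python
import pandas as pd


def mode_keep_na(s: pd.Series) -> pd.Series:
    """Hypothetical helper: most frequent value(s) of s, counting missing values too."""
    # Count every distinct value, keeping <NA> as its own bucket.
    counts = s.value_counts(dropna=False)
    # Keep only the values whose count equals the highest count (ties allowed).
    top = counts[counts == counts.max()]
    # Rebuild a series with the original dtype, sorted, with <NA> placed last.
    result = pd.Series(top.index.array, dtype=s.dtype)
    return result.sort_values(na_position="last").reset_index(drop=True)


series = pd.Series([1, 1, 2, 3]).astype("Int64")
print(mode_keep_na(series))

# A series where the missing value is the most frequent entry.
with_na = pd.Series([1, None, None], dtype="Int64")
print(mode_keep_na(with_na))
```

The helper goes through value_counts rather than the masked array's _mode method, so it does not touch the code path shown in the traceback. It is not written to handle an empty series.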

Installed Versions

INSTALLED VERSIONS
------------------
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.12.3.final.0
python-bits : 64
OS : Darwin
OS-release : 23.4.0
Version : Darwin Kernel Version 23.4.0: Fri Mar 15 00:10:42 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : None
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.4
IPython : 8.25.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.3
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

Aloqeely commented 4 months ago

Thanks for the report! I can reproduce this on main, PRs to fix are welcome!