Open Hari-Shankar-Karthik opened 3 days ago
Thanks for the suggestion! It appears there was an effort to allow string bins in pd.cut in #23567, but that PR went stale.
PRs are welcome to add string bins support, dispatching the string to np.histogram_bin_edges.
Hey, would like to work on this.
Take
Feature Type
[ ] Adding new functionality to pandas
[X] Changing existing functionality in pandas
[ ] Removing existing functionality in pandas
Problem Description
While converting a quantitative variable into a qualitative one, pd.cut() comes in clutch. However, it requires the user to specify bins as either an integer or a list of bin edges. I wish it were possible to specify bins='auto', similar to how np.histogram allows it. It internally leverages np.histogram_bin_edges to compute these. Thank you.
Expectation
Instead of coding
pd.cut(df['x1'], bins=np.histogram_bin_edges(df['x1'], bins='auto'))
allow for coding
pd.cut(df['x1'], bins='auto')
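For reference, here is a minimal runnable sketch of the current workaround. The data is made up for illustration; only the column name x1 comes from the text above. One detail to watch: the first edge returned by np.histogram_bin_edges is the minimum of the data, and pd.cut bins are right-closed by default, so the minimum is left unbinned unless include_lowest=True is passed.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption); the column name x1 follows the issue text.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=200)})

# Let NumPy pick the edges, then hand them to pd.cut.
edges = np.histogram_bin_edges(df["x1"], bins="auto")

# Plain call: the minimum sits exactly on the first edge and becomes NaN.
plain = pd.cut(df["x1"], bins=edges)

# include_lowest=True closes the first interval on the left as well.
binned = pd.cut(df["x1"], bins=edges, include_lowest=True)

print(plain.isna().sum(), binned.isna().sum())
```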
Additional Context
Calculation of bin edges is already done via np.histogram_bin_edges. Reference: https://numpy.org/doc/stable/reference/generated/numpy.histogram_bin_edges.html#numpy-histogram-bin-edges
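A rough sketch of what the requested dispatch could look like from user code. The helper name cut_with_string_bins is hypothetical and not part of pandas; dropping NaN before estimating edges and defaulting include_lowest to True are assumptions of this sketch, not decisions made in the issue.

```python
import numpy as np
import pandas as pd


def cut_with_string_bins(x, bins, **kwargs):
    """Hypothetical wrapper around pd.cut that also accepts a string for bins.

    A string is forwarded to np.histogram_bin_edges; anything else is
    passed to pd.cut unchanged.
    """
    if isinstance(bins, str):
        values = np.asarray(x, dtype=float)
        # Assumption: ignore NaN when estimating edges, since NumPy
        # cannot derive a finite range from data containing NaN.
        values = values[~np.isnan(values)]
        bins = np.histogram_bin_edges(values, bins=bins)
        # Assumption: keep the minimum value inside the first bin.
        kwargs.setdefault("include_lowest", True)
    return pd.cut(x, bins=bins, **kwargs)


s = pd.Series([1.0, 2.5, 2.7, 3.1, 4.8, np.nan, 9.0])
by_name = cut_with_string_bins(s, "sturges")  # edges estimated by NumPy
by_count = cut_with_string_bins(s, 3)         # falls through to pd.cut
```

An unrecognised string is rejected by NumPy with a ValueError, so the wrapper does not need its own list of estimator names.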