pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

ENH: value_counts to produce both count and normalized #60385

Open Keramatfar opened 1 day ago

Keramatfar commented 1 day ago

Feature Type

Problem Description

I would like pandas to offer a way to see both the counts and the relative counts of a Series at once.

Feature Description

Currently, I can get the counts with value_counts and the relative counts by passing normalize=True to the same function. Sometimes it is more useful to see both at once, possibly through a new parameter to this function, or by extending the normalize parameter to handle three states: raw, relative, or both.

Alternative Solutions

Two consecutive calls to value_counts with different values for normalize could provide the functionality.
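
A minimal sketch of this two-call alternative, for illustration only. The sample data and the variable names (counts, proportions, both) are assumptions, not part of the request; the column labels are set explicitly through keys so they do not depend on the pandas version.

```python
import pandas as pd

ser = pd.Series([1, 1, 2, 1, 3, 3, 1])

# Two separate passes over the data: one for raw counts, one for proportions.
counts = ser.value_counts()
proportions = ser.value_counts(normalize=True)

# Align both results on the shared index of unique values and
# label the resulting columns explicitly.
both = pd.concat([counts, proportions], axis=1, keys=["count", "proportion"])
print(both)
```

This works with the existing API, but it computes the counts twice.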

Additional Context

No response

rhshadrach commented 12 hours ago

Thanks for the request. I think two calls to value_counts would be unnecessary, and less performant than:

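import pandas as pd

ser = pd.Series([1, 1, 2, 1, 3, 3, 1])

# Count once, then derive the proportions from those counts.
# The column is named "count" in pandas 2.0 and later.
result = ser.value_counts().to_frame()
result["normalized"] = result["count"] / result["count"].sum()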
print(result)
#    count  normalized
# 1      4    0.571429
# 3      2    0.285714
# 2      1    0.142857

This seems readily doable via the existing API, and therefore I am negative on adding it.
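
For anyone who needs this often, the approach above could be wrapped in a small user-side helper. This is only a sketch: value_counts_with_share is a hypothetical name, not a pandas function, and the column name "normalized" simply mirrors the example above.

```python
import pandas as pd


def value_counts_with_share(ser: pd.Series) -> pd.DataFrame:
    """Return counts and their share of the total in one frame.

    Hypothetical helper, not part of pandas.
    """
    # Name the column explicitly so the result does not depend on
    # the default naming of value_counts in a given pandas version.
    result = ser.value_counts().to_frame(name="count")
    # value_counts drops missing values by default, so the shares are
    # relative to the non-missing total, matching normalize=True.
    result["normalized"] = result["count"] / result["count"].sum()
    return result


print(value_counts_with_share(pd.Series([1, 1, 2, 1, 3, 3, 1])))
```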