Open attack68 opened 2 years ago
It seems that in this circumstance (a list passed as the argument to pandas.DataFrame.aggregate), pandas first tries to apply the aggregating function to each data point and, once that fails, falls back to the correct behaviour (calling the function with the Series to be aggregated).
The solution is to force Series arguments:
def mean2(s: Series):
    if not isinstance(s, Series):
        raise ValueError('need Series argument')
    try:
        ret = s.mean()
    except Exception:
        ret = pd.NA
    return ret

df.agg([mean2, "sum"])

            A       B     C
mean2  1500.0  1500.0  <NA>
sum    3000.0  3000.0    ab
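As an alternative that does not depend on how list-form agg dispatches the function, each op can be applied to each column explicitly, recording pd.NA wherever it fails. This is only a minimal sketch, not pandas' own mechanism: the helper name safe_agg and the sample frame are assumptions, with the frame chosen merely to be consistent with the output above.

```python
import pandas as pd


def safe_agg(df, ops):
    """Apply each named Series method to each column; store pd.NA on failure.

    Hypothetical helper for illustration, not part of pandas.
    """
    records = []
    for op in ops:
        row = {}
        for col in df.columns:
            try:
                # Look up the reduction by name on the column (a Series).
                row[col] = getattr(df[col], op)()
            except Exception:
                row[col] = pd.NA
        records.append(row)
    # One row per op, one column per original column.
    return pd.DataFrame(records, index=list(ops), columns=df.columns)


# Assumed sample data: two numeric columns and one object column of strings.
df = pd.DataFrame(
    {
        "A": [1000.0, 2000.0],
        "B": [1000.0, 2000.0],
        "C": pd.Series(["a", "b"], dtype=object),
    }
)
result = safe_agg(df, ["mean", "sum"])
print(result)
```

The trade-off is that the ops must be given as method names, and any exception, not only a type mismatch, is silently turned into a missing value.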
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
If the following agg is performed, it currently works but gives a warning:
print(df.agg(["mean", "sum"]))
FutureWarning: ['C'] did not aggregate successfully. If any error is raised this will raise in a future version of pandas. Drop these columns/ops to avoid this warning.
However, I do not want to drop:
- 'C', because there is at least one op, sum, which produces a valid result for that column.
- 'mean', because there are columns, 'A' and 'B', which produce valid results for that op.
I tried to design a function which would error trap this:
Oddly, this works with apply, which is what the agg docs give guidance on:
So what's the solution here?
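For illustration, the apply route mentioned above can be sketched as follows. DataFrame.apply with its default settings calls the function once per column, passing a Series, so an error-trapping wrapper always sees a whole column. The sample frame and the name mean_or_na are assumptions and are not the original poster's code.

```python
import pandas as pd


def mean_or_na(s):
    """Return the mean of a Series, or pd.NA if the reduction fails."""
    try:
        return s.mean()
    except Exception:
        return pd.NA


# Assumed sample data: two numeric columns and one object column of strings.
df = pd.DataFrame(
    {
        "A": [1000.0, 2000.0],
        "B": [1000.0, 2000.0],
        "C": pd.Series(["a", "b"], dtype=object),
    }
)

# apply passes each column to mean_or_na as a Series (axis=0, raw=False).
means = df.apply(mean_or_na)
print(means)
```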
Expected Behavior
.
Installed Versions
.