Open erfannariman opened 4 years ago
Yes please. I would also be interested in this feature. I posted a feature request a few months ago but good to see I am not alone. #28190
If I am not mistaken, it seems it may be easier to implement now with the named aggregates functionality too.
take
Is it out now?
@SpectrumWings are you still working on this? Else I would like to give it a go.
take
Hi all. Just wanted to say I would love to see this feature developed. It's a routine very commonly needed in scientific data analysis. dplyr &c support it; would be fantastic to see in pandas.
Hi there, is there any update on when we can expect this feature?
@SanderLam pandas is all volunteer
features happen when the community does pull requests - you are welcome to do that
core can provide review
I'm very interested in this feature as well
Looking forward to this one
Very interested in this, too. I keep getting bummed out that pandas isn't quite as elegant as R when it comes to groupby > aggregate logic, but this would be a great addition!
Interestingly, Polars supports this natively! So if this is super needed, you can convert the DataFrame to Polars and do it there. I genuinely believe that pandas should adopt that as well.
take
@samukweku it would be something like:

    import polars as pl

    df.group_by("col0").agg(
        sum_all_under_200=pl.col('col1').filter(pl.col('col2') > 200).sum()
    )
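For comparison, a rough pandas approximation of that conditional sum is sketched below. It masks `col1` before grouping so that named aggregation only ever reads one column. The sample data and the names `col1_kept` and `sum_where_col2_over_200` are made up for illustration; only `col0`, `col1`, `col2` and the `> 200` condition come from the Polars snippet.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the thread.
df = pd.DataFrame({
    "col0": ["a", "a", "b", "b"],
    "col1": [10, 20, 30, 40],
    "col2": [100, 300, 250, 50],
})

# Keep col1 only where col2 > 200 (other rows become NaN), then use
# ordinary single-column named aggregation; "sum" skips NaN by default.
result = (
    df.assign(col1_kept=df["col1"].where(df["col2"] > 200))
      .groupby("col0")
      .agg(sum_where_col2_over_200=("col1_kept", "sum"))
)
print(result)
```

The extra `assign` step is exactly the boilerplate this issue asks to avoid, since the condition on `col2` cannot be expressed inside the aggregation itself.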
@tawfikharoun can you share an example of how polars does this? Thanks
Since pandas 0.25.0 we have named aggregations, which work fine if you do aggregations on single columns. But what if you want to apply aggregations over multiple columns?
example:
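A minimal sketch of single-column named aggregation, for reference. The frame and the column names `group`, `a` and `b` are hypothetical and chosen only to match the `a.max() - b.max()` question in this issue.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({
    "group": ["x", "x", "y", "y"],
    "a": [1, 5, 3, 2],
    "b": [4, 2, 6, 1],
})

# Named aggregation: output_name=(input_column, aggfunc).
# Each output column is computed from exactly one input column.
result = df.groupby("group").agg(
    a_max=("a", "max"),
    b_min=("b", "min"),
)
print(result)
```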
But what if we want to calculate a.max() - b.max() while aggregating? That does not seem to work. For example, something like this would make sense:

So is it possible to do named aggregations on multiple columns? If not, is this in the pipeline for future releases?
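In the meantime, the `a.max() - b.max()` case can be approximated with existing pandas calls. The sketch below shows two possible workarounds, not the requested feature: aggregate each column by name and combine afterwards, or use `groupby(...).apply` with a function that sees the whole sub-frame. The data, the helper `summarize` and the output name `max_diff` are hypothetical.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({
    "group": ["x", "x", "y", "y"],
    "a": [1, 5, 3, 2],
    "b": [4, 2, 6, 1],
})

# Workaround 1: named aggregation per column, then combine the results.
two_step = (
    df.groupby("group")
      .agg(a_max=("a", "max"), b_max=("b", "max"))
      .assign(max_diff=lambda d: d["a_max"] - d["b_max"])
)

# Workaround 2: apply passes each group's sub-frame, so several columns
# can be combined inside one function (usually slower than agg).
def summarize(g: pd.DataFrame) -> pd.Series:
    return pd.Series({"max_diff": g["a"].max() - g["b"].max()})

via_apply = df.groupby("group")[["a", "b"]].apply(summarize)

print(two_step)
print(via_apply)
```

The first form stays on the fast built-in aggregation paths but leaves intermediate columns behind; the second is closer in spirit to what the issue asks for, at the cost of calling a Python function once per group.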