IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

Compute quantiles with groupby #757

Open kvanderwijst opened 1 year ago

kvanderwijst commented 1 year ago

It might be nice to be able to compute quantiles after a groupby command, maybe in a syntax similar to this:

(
    df
    .filter(variable="Emissions|CO2")
    .compute.groupby("category")
    .quantiles([0.05, 0.95])
    .plot(color="category", fill_between=True)
)

danielhuppmann commented 1 year ago

I like the idea! The main question is how to name the model/scenario/variable/region values of the computed data so that the returned object can again be an IamDataFrame.

znicholls commented 1 year ago

This is how we solved this in scmdata (although ignore the type hints, they're broken): https://github.com/openscm/scmdata/blob/30b8ce9037af634551c9199f411fe5743b8e2e63/src/scmdata/run.py#L1628

We let the user specify the outputs for e.g. model/scenario with the op_col variable, and say whether they want to try casting back to an ScmRun object with the as_run variable. You don't have to do it that way, of course: you could also always try to cast back to an IamDataFrame and use something like op_col to let users pass in the extra data needed to make that conversion valid.
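
For illustration, the idea discussed above could be sketched in plain pandas roughly as follows. This is only a sketch under stated assumptions, not pyam's or scmdata's implementation: the helper name groupby_quantiles, the placeholder model name "Quantiles", and the "<group> (q=...)" scenario naming are all hypothetical, and the data is assumed to already be in long format with the grouping column ("category") merged in.

```python
import pandas as pd

# Columns that identify a timeseries in the IAMC long format.
IAMC_COLS = ["model", "scenario", "region", "variable", "unit"]


def groupby_quantiles(data, by, quantiles, op_cols=None):
    """Quantiles of `value` across all timeseries sharing a `by` label.

    Hypothetical helper, not part of pyam or scmdata.
    """
    dims = [by, "region", "variable", "unit", "year"]
    out = (
        data.groupby(dims)["value"]
        .quantile(quantiles)
        .rename_axis(dims + ["quantile"])  # name the new quantile index level
        .reset_index()
    )
    # Assumed naming convention: one synthetic scenario per (group, quantile),
    # so that the IAMC index columns stay unique.
    out["scenario"] = out[by].astype(str) + " (q=" + out["quantile"].astype(str) + ")"
    # Constant columns that the aggregation dropped; callers can override
    # the placeholder values through `op_cols`.
    for col, value in {"model": "Quantiles", **(op_cols or {})}.items():
        out[col] = value
    return out[IAMC_COLS + ["year", "value", by, "quantile"]]


# Made-up example data: six scenarios in two categories, one year.
data = pd.DataFrame(
    {
        "model": ["m"] * 6,
        "scenario": ["s1", "s2", "s3", "s4", "s5", "s6"],
        "region": "World",
        "variable": "Emissions|CO2",
        "unit": "Mt CO2/yr",
        "year": 2030,
        "value": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
        "category": ["C1"] * 3 + ["C2"] * 3,
    }
)
result = groupby_quantiles(data, by="category", quantiles=[0.05, 0.95])
print(result)
```

Passing something like op_cols={"model": "My ensemble"} replaces the placeholder model name, which mirrors the role the op_col argument plays in the comment above. Overriding scenario with a single constant would break uniqueness, so a real implementation would probably need to validate that.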