Open willu47 opened 1 year ago
If I add the recursive=True argument to the aggregate() function, I get this:
     model  scenario  region                    variable   unit  year  value
0  model_a    scen_a   World       Primary Energy|Fossil  EJ/yr  2010   85.0
1  model_a    scen_a   World  Primary Energy|Fossil|Coal  EJ/yr  2010   40.0
from pyam import IamDataFrame

# PRICE_NESTED_DF is defined elsewhere (not shown in this thread)
(
    IamDataFrame(PRICE_NESTED_DF)
    .aggregate(variable='Primary Energy|Fossil', recursive=True)
    .aggregate(variable='Primary Energy|Fossil')
)
returns
     model  scenario  region               variable   unit  year  value
0  model_a    scen_a   World  Primary Energy|Fossil  EJ/yr  2010   40.0
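One way to read the two outputs is sketched below in plain Python. This is not pyam's implementation: the function name aggregate_recursive, the dict-based data layout, and the values (including a Gas entry) are assumptions chosen for illustration only, and the real fixture may differ. The sketch sums bottom-up and returns only the computed aggregates, so a second, non-recursive pass over that result sees Coal as the only variable directly below Fossil.

```python
from collections import defaultdict

SEP = "|"

# Hypothetical values for one model/scenario/region/year (NOT the real fixture).
data = {
    "Primary Energy|Fossil|Gas": 45.0,
    "Primary Energy|Fossil|Coal|Lignite": 15.0,
    "Primary Energy|Fossil|Coal|Brown": 25.0,
}


def aggregate_recursive(data, variable):
    """Sum each level into its parent, deepest level first.

    Returns only the newly computed aggregates, not the input rows.
    Assumption: an existing intermediate value is kept, not overwritten.
    """
    values = dict(data)
    computed = {}
    prefix = variable + SEP
    base = variable.count(SEP)
    depths = (n.count(SEP) for n in values if n.startswith(prefix))
    max_depth = max(depths, default=base)
    for depth in range(max_depth, base, -1):
        sums = defaultdict(float)
        for name, value in values.items():
            if name.startswith(prefix) and name.count(SEP) == depth:
                sums[name.rsplit(SEP, 1)[0]] += value
        for parent, total in sums.items():
            if parent not in values:
                values[parent] = total
                computed[parent] = total
    return computed


variable = "Primary Energy|Fossil"
computed = aggregate_recursive(data, variable)

# Second, non-recursive pass over the *result*: only direct children count.
direct_only = sum(
    value
    for name, value in computed.items()
    if name.startswith(variable + SEP)
    and name.count(SEP) == variable.count(SEP) + 1
)
```

With these made-up numbers, computed holds 40.0 for Coal and 85.0 for Fossil, while direct_only is 40.0, which mirrors the pattern in the two outputs quoted above.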
Thanks @willu47 - indeed, I'd say that this is behaving as expected.
If aggregate() is called without further arguments, it uses all variables that are directly below variable (equivalent to filter(variable=f"{variable}|*", level=0), see this utility method).
Question back to you: which other behavior would you find more intuitive? Or how could we improve the docs?
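The "directly below" rule can be illustrated with a small stand-alone sketch. The helpers level_below and direct_components are hypothetical names, not pyam functions, and the variable names are only examples; the sketch assumes that "level 0" means no further "|" separator after the parent prefix.

```python
def level_below(name, variable, sep="|"):
    """Return how many levels `name` sits below `variable`.

    0 means a direct child; None means `name` is not below `variable`.
    """
    prefix = variable + sep
    if not name.startswith(prefix):
        return None
    return name[len(prefix):].count(sep)


def direct_components(names, variable):
    """Keep only the names exactly one level below `variable`."""
    return [n for n in names if level_below(n, variable) == 0]


# Example names; the intermediate "Primary Energy|Fossil|Coal" is missing.
names = [
    "Primary Energy|Fossil|Gas",
    "Primary Energy|Fossil|Coal|Lignite",
    "Primary Energy|Fossil|Coal|Brown",
]

components = direct_components(names, "Primary Energy|Fossil")
```

Here components contains only the Gas entry. The two Coal sub-categories sit one level deeper, so a non-recursive aggregation would skip them unless the intermediate Coal variable exists.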
Sidenote: df.aggregate("<variable>", append=True) has the same behavior as df.append(df.aggregate("<variable>")), but the first option has better performance.
And FYI: pyam has a testing module with a function pyam.testing.assert_iamframe_equal, see the docs. This is maybe more appropriate for your use case because you don't have to worry about the order of the columns and rows (and it operates on an indexed pd.Series, so it's faster).
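Why an indexed comparison is insensitive to row and column order can be shown with a plain-Python analogy. This is only a sketch of the idea, not how pyam.testing.assert_iamframe_equal is implemented; to_indexed and frames_equal are hypothetical helpers, and the rows are made-up examples.

```python
def to_indexed(rows, value_key="value"):
    """Map each row's non-value fields (as a sorted tuple) to its value."""
    indexed = {}
    for row in rows:
        key = tuple(sorted((k, v) for k, v in row.items() if k != value_key))
        indexed[key] = row[value_key]
    return indexed


def frames_equal(left, right):
    """Compare two lists of rows, ignoring row order and field order."""
    return to_indexed(left) == to_indexed(right)


# Made-up rows: same content, different row order and field order.
left = [
    {"variable": "Primary Energy|Fossil", "year": 2010, "value": 85.0},
    {"variable": "Primary Energy|Fossil|Coal", "year": 2010, "value": 40.0},
]
right = [
    {"year": 2010, "value": 40.0, "variable": "Primary Energy|Fossil|Coal"},
    {"value": 85.0, "year": 2010, "variable": "Primary Energy|Fossil"},
]

same = frames_equal(left, right)
```

Because every value is looked up by its index key rather than by position, same evaluates to True even though the rows and fields are listed in a different order.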
Hi, following a question I asked in the openmod session today, please could you confirm the expected behaviour of the .aggregate function when presented with missing levels in the data hierarchy? For example, the following test fails because the two coal sub-categories Primary Energy|Fossil|Coal|Lignite and Primary Energy|Fossil|Coal|Brown are ignored. Am I missing something?