Closed znicholls closed 1 year ago
Oh my god, this is such a great find!
So... The issue is that when you do a pandas groupby, pandas does not actually remove unused levels from df_scen.
If you continue with the example above:
df_scen.index.levels[3]
> Index(['Primary Energy', 'Primary Energy|Coal'], dtype='object', name='variable')
which is exactly what the pyam-accessors use.
It's not really a bug in pandas, I guess, more of a performance optimization: unused levels are only dropped when necessary.
Also, this explains why #731 introduced this behavior: it removed the (performance drag of) resetting the index twice, and that reset had probably been removing unused levels as a side effect...
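For anyone following along, here is a minimal pandas-only sketch of the behavior described above. The data, the model/scenario names, and the variable `df_scen` are made up for illustration and pyam itself is not involved; it only shows that a group taken from a groupby keeps the levels of the original index until `remove_unused_levels()` is called.

```python
import pandas as pd

# Hypothetical timeseries data with an IAMC-style index (illustration only).
index = pd.MultiIndex.from_tuples(
    [
        ("model_a", "scen_a", "World", "Primary Energy"),
        ("model_a", "scen_a", "World", "Primary Energy|Coal"),
        ("model_a", "scen_b", "World", "Primary Energy"),
    ],
    names=["model", "scenario", "region", "variable"],
)
df = pd.DataFrame({2010: [1.0, 0.5, 2.0]}, index=index)

# Split by scenario; each group is a subset of the original frame.
groups = {key: sub for key, sub in df.groupby(level=["model", "scenario"])}
df_scen = groups[("model_a", "scen_b")]  # scen_b has no coal variable

# The levels still list every variable of the original index ...
stale = list(df_scen.index.levels[3])

# ... while the values actually present in the subset do not.
used = list(df_scen.index.get_level_values("variable").unique())

# Dropping unused levels explicitly brings the two back in line.
cleaned = list(df_scen.index.remove_unused_levels().levels[3])

print(stale)
print(used)
print(cleaned)
```

So anything that reads `index.levels` directly on a subset sees the variables of the full data, whereas `get_level_values(...)` or a prior `remove_unused_levels()` reflects only what is in the subset.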
Very nice explanation, will review #763 now
The test below fails and I can't see why. I think this is the underlying cause of https://github.com/iiasa/climate-assessment/pull/36.
In short: if you subset an IamDataFrame and then create new instances within some loop (probably a bad pattern, but let's ignore that for now), the metadata is still based on the original data rather than the subset. In the example below, this means that it looks like a scenario provides a variable when it actually doesn't.
@phackstock cc @danielhuppmann in case it helps with your search for a cause and making sure the bug doesn't come back.