iiasa / climate-assessment

https://climate-assessment.readthedocs.io/en/latest
MIT License
19 stars 18 forks source link

Bump pyam-iamc to >= 1.9.0 #36

Closed phackstock closed 1 year ago

phackstock commented 1 year ago

closes #27.

phackstock commented 1 year ago

Looks like there's four tests that are breaking. If any of you @znicholls, @lewisjared or @jkikstra could have a quick look that would be great.

znicholls commented 1 year ago

I think I worked it out. Put the code below after line 360 of checks.py and you'll see. I think the original regression is in pandas, but you can work around it by tweaking pyam

        if model == "model7":
            import pdb
            pdb.set_trace()
            df_scen.timeseries()  # No CO2|Energy and Industrial as expected
            df_scen._data.index.get_level_values("variable").unique()  # No CO2|Energy and Industrial as expected

            # Using pyam accessors gives wrong result so `has_co2_energy`
            # evaluates incorrectly above
            df_scen.variable  # Contains CO2|Energy and Industrial ?!

            # What pyam uses internally. I think pyam is correct, but pandas
            # has a crazy regression where this returns incorrect information?!
            list(df_scen._data.index.levels[df_scen._data.index._get_level_number("variable")])  # Contains CO2|Energy and Industrial ?!
            # I assume that accessing levels directly on a MultiIndex is somehow not the intended pattern

            # I'd suggest updating pyam's internal's to use this, which seems to work
            list(df_scen._data.index.get_level_values("variable").unique())  # No CO2|Energy and Industrial as expected
phackstock commented 1 year ago

Thanks @znicholls, that's super weird. I've implemented your suggestion. Is this a pyam bug? Could it be related to a variable being dropped and df.variable not being updated accordingly?

phackstock commented 1 year ago

Option b is you do a bugfix release in pyam (fixing the pandas bug), do a new pyam release then change this MR and you don't have to do any fix at all here.

So you think it's a pandas bug? Not sure what's the best option here, if you can reproduce it, maybe open a pyam issue and we discuss over there? It's not super critical to update pyam for climate-assessment. We could do #34 now then and just increase the pyam version later.

danielhuppmann commented 1 year ago

So I"m trying to get to the bottom of this, but I can't replicate the observation by @znicholls about df.variable returning something else than the correct values (with pyam 1.9 and pandas 1.5.3)... It seems all correct to me.

danielhuppmann commented 1 year ago

So I ran the tests along the entire history of pyam and it seems that https://github.com/IAMconsortium/pyam/pull/731 breaks the test.

znicholls commented 1 year ago

pandas 1.5.3

I think I had pandas 2 maybe (memory fading now)

I ran the tests along the entire history of pyam

Nice detective work, did you use git bisect?

znicholls commented 1 year ago

(I've made a bit of a mess of this, sorry)

danielhuppmann commented 1 year ago

Implement a fix based on @znicholls' observation in https://github.com/IAMconsortium/pyam/issues/762

phackstock commented 1 year ago

After the pyam update, I wanted to update this PR to verify that that fixes the issue. However, as I just saw the tests were already passing yesterday? I'm a bit confused now but I guess it's all good?

danielhuppmann commented 1 year ago

See https://github.com/iiasa/climate-assessment/pull/36/commits/ccf0e7bba61f2526665ff196c62f442b0cb2fd19 - the problem was caused by the very inefficient groupby, so I refactored that piece of the code.

phackstock commented 1 year ago

Since the tests are passing and I'd love to get this out before the weekend, I'll move on with the merge. Hope that's ok @znicholls.

znicholls commented 1 year ago

Hope that's ok @znicholls.

If all passing, all good (particularly given how much we've talked about this already)