IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

aggregate_time() issue when updating to pyam 2.0 #785

Closed adrivinca closed 8 months ago

adrivinca commented 8 months ago

Hi, after updating from pyam 1.7 to 2.0, I noticed that a piece of code I use for sub-annual reporting fails:

rep_iamSE_sub.aggregate_time(vv, column="subannual", value="year", append=True)

This gives the following error, although the input IamDataFrame has values. The code of the aggregate_time function does not seem to have changed, so I wonder what I can do to keep using it. Any suggestions on how to solve this are very welcome, since we instructed people in a workshop to use these functions. Maybe pandas pivot_table has changed behaviour?

for vv in rep_iamSE_sub.variable:
    rep_iamSE_sub.aggregate_time(vv, column="subannual", value="year", append=True)
Traceback (most recent call last):

  Cell In[26], line 2
    rep_iamSE_sub.aggregate_time(vv, column="subannual", value="year", append=True)

  File ~\AppData\Local\anaconda3\envs\message_env\Lib\site-packages\pyam\core.py:1653 in aggregate_time
    _df = _aggregate_time(

  File ~\AppData\Local\anaconda3\envs\message_env\Lib\site-packages\pyam\aggregation.py:186 in _aggregate_time
    .value.rename_axis(None, axis=1)

  File ~\AppData\Local\anaconda3\envs\message_env\Lib\site-packages\pandas\core\generic.py:6202 in __getattr__
    return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'value'

I attach a Spyder screenshot of rep_iamSE_sub.timeseries().head() and an Excel screenshot of the DataFrame after the pivot table, just before running into the error (pyam/aggregation.py, line 185):

![rep_iamSE_sub timeseries() head()](https://github.com/IAMconsortium/pyam/assets/22025365/31927fc7-255c-461a-906f-2ba2ccc63049)
![pyamaggregation py, line 185](https://github.com/IAMconsortium/pyam/assets/22025365/33ec2b39-6a44-4826-a138-3f14d3cce179)

I will try to attach a simple example if I have time, but it would be great to get some insights in the meantime. Thanks!

danielhuppmann commented 8 months ago

So I looked at the test for aggregate_time() and played around a bit, but I did not manage to replicate your error.

Please make sure that you have the latest version of pandas and other dependencies, and try to run the following in a notebook:

import pandas as pd
import pyam

TEST_YEARS = [2005, 2010]

DATA = pd.DataFrame(
    [
        ["World", "Primary Energy", "EJ/yr", 12, 15],
        ["reg_a", "Primary Energy", "EJ/yr", 8, 9],
        ["reg_b", "Primary Energy", "EJ/yr", 4, 6],
    ],
    columns=["region", "variable", "unit"] + TEST_YEARS,
)

_df = DATA.copy()

def add_subannual(_data, name, value):
    _data["subannual"] = name
    _data[TEST_YEARS] = _data[TEST_YEARS] * value
    return _data

mapping = [("year", 1), ("summer", 0.7), ("winter", 0.3)]
lst = [add_subannual(_df.copy(), name, value) for name, value in mapping]

df = pyam.IamDataFrame(model="model_a", scenario="scen_a", data=pd.concat(lst))

df.aggregate_time("Primary Energy")

If this also fails, the problem is with your local installation. If my example works and the larger MESSAGE workflow still fails, please provide a small dataset to illustrate the behavior.
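To rule out a stale local installation, it can help to report the installed package versions alongside the error. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and it assumes pyam was installed under its PyPI distribution name `pyam-iamc`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each distribution name to its installed version.

    Distributions that are not installed map to None.
    (Hypothetical helper for illustration; not part of pyam.)
    """
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# "pyam-iamc" is the distribution name on PyPI; the import name is "pyam".
print(installed_versions(["pyam-iamc", "pandas", "numpy"]))
```

Pasting this output into the issue makes it easier to tell whether the environment really runs the versions it is expected to run.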

adrivinca commented 8 months ago

Thanks @danielhuppmann, this works, and in my more complicated case aggregate_time() also works for each single variable. However, in my case I have a for loop with append=True, and as soon as the values of the first variable get aggregated, the next aggregation step fails (as if the variable input were not considered). In the previous version this used to work. I replicate the issue by expanding your example, which should break:

import pandas as pd
import pyam

TEST_YEARS = [2005, 2010]

DATA = pd.DataFrame(
    [
        ["World", "Primary Energy", "EJ/yr", 12, 15],
        ["reg_a", "Primary Energy", "EJ/yr", 8, 9],
        ["reg_b", "Primary Energy", "EJ/yr", 4, 6],
        ["reg_a", "Secondary Energy", "EJ/yr", 18, 9],
        ["reg_b", "Secondary Energy", "EJ/yr", 14, 6],
    ],
    columns=["region", "variable", "unit"] + TEST_YEARS,
)

_df = DATA.copy()

def add_subannual(_data, name, value):
    _data["subannual"] = name
    _data[TEST_YEARS] = _data[TEST_YEARS] * value
    return _data

mapping = [("year", 1), ("summer", 0.7), ("winter", 0.3)]
lst = [add_subannual(_df.copy(), name, value) for name, value in mapping]
# remove the rows where subannual is "year"; otherwise append=True in aggregate_time raises an error
dt = pd.concat(lst)
dt = dt[dt.subannual != "year"]

df = pyam.IamDataFrame(model="model_a", scenario="scen_a", data=dt)

for vv in df.variable:
    print(vv)
    df.aggregate_time("Primary Energy", column="subannual", value="year", append=True)

Of course, I can adapt my script and simply append the aggregated values separately, but it is good to be aware of the changed behaviour. Thanks!
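The "aggregate separately, then append once" idea can be sketched in plain pandas. This is only an illustration of the pattern, not pyam's implementation: the helper `aggregate_subannual` and the data values are hypothetical, and the long-format column layout is an assumption made for the example.

```python
import pandas as pd

# Hypothetical long-format data with sub-annual slices and no "year" rows yet.
data = pd.DataFrame(
    [
        ["reg_a", "Primary Energy", "EJ/yr", "summer", 2005, 6.0],
        ["reg_a", "Primary Energy", "EJ/yr", "winter", 2005, 2.0],
        ["reg_a", "Secondary Energy", "EJ/yr", "summer", 2005, 12.0],
        ["reg_a", "Secondary Energy", "EJ/yr", "winter", 2005, 6.0],
    ],
    columns=["region", "variable", "unit", "subannual", "year", "value"],
)


def aggregate_subannual(frame, variable, label="year"):
    """Sum one variable over all sub-annual slices and tag the result with `label`."""
    subset = frame[frame["variable"] == variable]
    # Group by every column except the slice identifier and the value itself.
    keys = [c for c in frame.columns if c not in ("subannual", "value")]
    total = subset.groupby(keys, as_index=False)["value"].sum()
    total["subannual"] = label
    return total[list(frame.columns)]


# Aggregate every variable against the original data, then append in one step,
# so no aggregation ever sees rows produced by an earlier aggregation.
parts = [aggregate_subannual(data, v) for v in data["variable"].unique()]
result = pd.concat([data, *parts], ignore_index=True)
print(result)
```

Because each aggregation reads only the untouched input, the order in which the variables are processed does not matter.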

danielhuppmann commented 8 months ago

closed by #790