znicholls closed 10 months ago
@phackstock I got some of the way but then crashed into https://github.com/IAMconsortium/pyam/issues/793, which I think is the reason that the remaining tests are failing (because it causes a regression in the behaviour of reclassify_waste_and_other_co2_ar6).
@jkikstra, if you could take this one that would be great. I'll need to update our Scenario Explorer processing container soon and that will require pandas 2.0 which currently breaks those tests.
Thanks for the initial work and figuring out the issue! Will try to tackle this at the end of this week.
Perfect, thanks a lot @jkikstra!
Started working on it now, but stopping for today.
Starting a list here for changes that I think we need to be making:
reclassify_waste_and_other_co2_ar6() (in checks.py): perhaps it is due to the changes for exclude? See https://github.com/IAMconsortium/pyam/pull/759, meaning df.extra_cols is not the same for the two dataframes on line 843.
@danielhuppmann I'm getting this:
def reclassify_waste_and_other_co2_ar6(df):
    """
    Reclassify waste and other CO2 into the energy and industrial processes category

    Reclassify CO2 emissions reported under Emissions|CO2|Other and
    Emissions|CO2|Waste, instead putting them under
    Emissions|CO2|Energy and Industrial Processes.

    Parameters
    ----------
    df : :class:`pyam.IamDataFrame`
        The original set of reported emissions

    Returns
    -------
    :class:`pyam.IamDataFrame`
        Reclassified set of emissions.
    """
    # filter out the scenarios that do not need changes
    df_nochange = df.copy()
    df_nochange.require_data(
        variable=["Emissions|CO2|Other", "Emissions|CO2|Waste"], exclude_on_fail=True
    )
    df_nochange.filter(exclude=True, inplace=True)
    df_nochange.reset_exclude()
    # select the scenarios that do need changes
    df_change = df.copy()
    df_change.require_data(
        variable=["Emissions|CO2|Other", "Emissions|CO2|Waste"], exclude_on_fail=True
    )
    df_change.filter(exclude=False, inplace=True)
    df_change.reset_exclude()
    if df_change.empty:
        return df_nochange
    # rename old CO2|Energy and Industrial Processes, to be replaced
    df_change.rename(
        variable={
            "Emissions|CO2|Energy and Industrial Processes": "Emissions|CO2|Energy and Industrial Processes|Incomplete"
        },
        inplace=True,
    )
    # use pandas to create new CO2|Energy and Industrial Processes by adding CO2|Other and CO2|Waste
    df_change_pd = df_change.as_pandas()
    varsum = [
        "Emissions|CO2|Waste",
        "Emissions|CO2|Other",
        "Emissions|CO2|Energy and Industrial Processes|Incomplete",
    ]
    df_change_notaffected_pd = df_change_pd[~df_change_pd.variable.isin(varsum)]
    df_change_notaffected_pyam = pyam.IamDataFrame(df_change_notaffected_pd)
    df_change_pd = df_change_pd[df_change_pd.variable.isin(varsum)]
    df_change_pd = df_change_pd.groupby(
        by=["model", "scenario", "year"], as_index=False
    )
    df_change_pd = df_change_pd.sum()
    df_change_pd["variable"] = "Emissions|CO2|Energy and Industrial Processes"
    df_change_pd["unit"] = "Mt CO2/yr"
    df_change_pd["region"] = "World"
    df_change_pyam = pyam.IamDataFrame(df_change_pd)
    df_change = pyam.concat([df_change_pyam, df_change_notaffected_pyam])
    # recombine dataframes
    df_new = pyam.concat([df_change, df_nochange], ignore_meta=True)
    return df_new
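A minimal sketch of the suspected mechanism, written in plain pandas rather than pyam so that it runs on its own. It assumes that as_pandas() joins meta columns such as exclude onto the timeseries data, and that any column outside the standard long-format set is then picked up as an extra data dimension when the frame is passed back to pyam.IamDataFrame. The names IAMC_COLS and extra_cols below are illustrative stand-ins, not pyam API, and the numbers are made up.

```python
import pandas as pd

# Standard long-format columns; anything else counts here as an "extra" dimension.
# (Assumption for illustration only; pyam's own logic may differ in detail.)
IAMC_COLS = ["model", "scenario", "region", "variable", "unit", "year", "value"]


def extra_cols(df: pd.DataFrame) -> list:
    """Return the columns that are not part of the standard long format."""
    return [c for c in df.columns if c not in IAMC_COLS]


# Long-format data as it might look after the `exclude` meta column was joined on.
with_meta = pd.DataFrame(
    {
        "model": ["model12"] * 2,
        "scenario": ["1point5"] * 2,
        "region": ["World"] * 2,
        "variable": ["Emissions|CO2|Waste", "Emissions|CO2|Other"],
        "unit": ["Mt CO2/yr"] * 2,
        "year": [2010, 2010],
        "value": [1.0, 2.0],
        "exclude": [False, False],
    }
)

# Dropping the meta column before rebuilding keeps the dimensions consistent.
without_meta = with_meta.drop(columns=["exclude"])

mismatch_before = extra_cols(with_meta)
mismatch_after = extra_cols(without_meta)
```

If this is indeed the cause, dropping exclude from the pandas frame before rebuilding the IamDataFrame, or requesting the data without meta columns in the first place (as_pandas(meta_cols=False), if the installed pyam version supports it), might keep df.extra_cols identical for both frames passed to pyam.concat.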
test results:
(ca-deps-311) C:\Users\kikstra\Documents\GitHub\climate-assessment>pytest tests/integration/test_units.py
======================================================= test session starts =======================================================
platform win32 -- Python 3.10.2, pytest-7.2.1, pluggy-1.0.0
rootdir: C:\Users\kikstra\Documents\GitHub\climate-assessment, configfile: setup.cfg
plugins: anyio-3.7.1
collected 23 items
tests\integration\test_units.py .....................FF [100%]
============================================================ FAILURES =============================================================
_____________________________________________________ test_reclassify_co2_ar6 _____________________________________________________
def test_reclassify_co2_ar6():
input_emissions_file = os.path.join(TEST_DATA_DIR, "ex2.csv")
processed_input_emissions_file = os.path.join(
TEST_DATA_DIR, "ex2_adjusted-waste-other.csv"
)
# import pdb
# pdb.set_trace()
# pyam.compare(reclassify_waste_and_other_co2_ar6(pyam.IamDataFrame(input_emissions_file)), pyam.IamDataFrame(processed_input_emissions_file), )
> assert reclassify_waste_and_other_co2_ar6(
pyam.IamDataFrame(input_emissions_file)
).equals(pyam.IamDataFrame(processed_input_emissions_file))
tests\integration\test_units.py:194:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src\climate_assessment\checks.py:844: in reclassify_waste_and_other_co2_ar6
df_new = pyam.concat([df_change, df_nochange], ignore_meta=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
objs = [<class 'pyam.core.IamDataFrame'>
Index:
* model : model12 (1)
* scenario : 1point5 (1)
Timeseries data coordinat... year : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
exclude (bool) False (1)]
ignore_meta_conflict = False, kwargs = {'ignore_meta': True}
as_iamdataframe = <function concat.<locals>.as_iamdataframe at 0x000001DB7FD2F1C0>
df = <class 'pyam.core.IamDataFrame'>
Index:
* model : model1, model10, model11, model13, model14, model2, ... model9 (... year : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
exclude (bool) False (1)
_merge_meta = True, index_names = FrozenList(['model', 'scenario']), extra_cols = ['exclude'], time_col = 'year'
consistent_time_domain = True
def concat(objs, ignore_meta_conflict=False, **kwargs):
"""Concatenate a series of IamDataFrame-like objects
Parameters
----------
objs : iterable of IamDataFrames
A list of objects castable to :class:`IamDataFrame`
ignore_meta_conflict : bool, optional
If False, raise an error if any meta columns present in `dfs` are not identical.
If True, values in earlier elements of `dfs` take precedence.
kwargs
Passed to :class:`IamDataFrame(other, **kwargs) <IamDataFrame>`
for any item of `dfs` which isn't already an IamDataFrame.
Returns
-------
IamDataFrame
Raises
------
TypeError
If `dfs` is not a list.
ValueError
If time domain or other timeseries data index dimension don't match.
Notes
-----
The *meta* attributes are merged only for those objects of *objs* that are passed
as :class:`IamDataFrame` instances.
The :attr:`dimensions` and :attr:`index` names of all elements of *dfs* must be
identical. The returned IamDataFrame inherits the dimensions and index names.
"""
if not islistable(objs) or isinstance(objs, pd.DataFrame):
raise TypeError(f"'{objs.__class__.__name__}' object is not iterable")
objs = list(objs)
if len(objs) < 1:
raise ValueError("No objects to concatenate")
def as_iamdataframe(df):
if isinstance(df, IamDataFrame):
return df, True
else:
return IamDataFrame(df, **kwargs), False
# cast first item to IamDataFrame (if necessary)
df, _merge_meta = as_iamdataframe(objs[0])
index_names, extra_cols, time_col = df.index.names, df.extra_cols, df.time_col
consistent_time_domain = True
iam_dfs = [(df, _merge_meta)]
# cast all items to IamDataFrame (if necessary) and check consistency of items
for df in objs[1:]:
df, _merge_meta = as_iamdataframe(df)
if df.index.names != index_names:
raise ValueError("Items have incompatible index dimensions.")
if df.extra_cols != extra_cols:
> raise ValueError("Items have incompatible timeseries data dimensions.")
E ValueError: Items have incompatible timeseries data dimensions.
..\..\..\AppData\Local\Programs\Python\Python310\lib\site-packages\pyam\core.py:2931: ValueError
___________________________________________________ test_reclassify_co2_ar6_sum ___________________________________________________
def test_reclassify_co2_ar6_sum():
input_emissions_file = pyam.IamDataFrame(os.path.join(TEST_DATA_DIR, "ex2.csv"))
> input_emissions_file_processed = reclassify_waste_and_other_co2_ar6(
input_emissions_file
)
tests\integration\test_units.py:202:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src\climate_assessment\checks.py:844: in reclassify_waste_and_other_co2_ar6
df_new = pyam.concat([df_change, df_nochange], ignore_meta=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
objs = [<class 'pyam.core.IamDataFrame'>
Index:
* model : model12 (1)
* scenario : 1point5 (1)
Timeseries data coordinat... year : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
exclude (bool) False (1)]
ignore_meta_conflict = False, kwargs = {'ignore_meta': True}
as_iamdataframe = <function concat.<locals>.as_iamdataframe at 0x000001DB018A48B0>
df = <class 'pyam.core.IamDataFrame'>
Index:
* model : model1, model10, model11, model13, model14, model2, ... model9 (... year : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
exclude (bool) False (1)
_merge_meta = True, index_names = FrozenList(['model', 'scenario']), extra_cols = ['exclude'], time_col = 'year'
consistent_time_domain = True
def concat(objs, ignore_meta_conflict=False, **kwargs):
"""Concatenate a series of IamDataFrame-like objects
Parameters
----------
objs : iterable of IamDataFrames
A list of objects castable to :class:`IamDataFrame`
ignore_meta_conflict : bool, optional
If False, raise an error if any meta columns present in `dfs` are not identical.
If True, values in earlier elements of `dfs` take precedence.
kwargs
Passed to :class:`IamDataFrame(other, **kwargs) <IamDataFrame>`
for any item of `dfs` which isn't already an IamDataFrame.
Returns
-------
IamDataFrame
Raises
------
TypeError
If `dfs` is not a list.
ValueError
If time domain or other timeseries data index dimension don't match.
Notes
-----
The *meta* attributes are merged only for those objects of *objs* that are passed
as :class:`IamDataFrame` instances.
The :attr:`dimensions` and :attr:`index` names of all elements of *dfs* must be
identical. The returned IamDataFrame inherits the dimensions and index names.
"""
if not islistable(objs) or isinstance(objs, pd.DataFrame):
raise TypeError(f"'{objs.__class__.__name__}' object is not iterable")
objs = list(objs)
if len(objs) < 1:
raise ValueError("No objects to concatenate")
def as_iamdataframe(df):
if isinstance(df, IamDataFrame):
return df, True
else:
return IamDataFrame(df, **kwargs), False
# cast first item to IamDataFrame (if necessary)
df, _merge_meta = as_iamdataframe(objs[0])
index_names, extra_cols, time_col = df.index.names, df.extra_cols, df.time_col
consistent_time_domain = True
iam_dfs = [(df, _merge_meta)]
# cast all items to IamDataFrame (if necessary) and check consistency of items
for df in objs[1:]:
df, _merge_meta = as_iamdataframe(df)
if df.index.names != index_names:
raise ValueError("Items have incompatible index dimensions.")
if df.extra_cols != extra_cols:
> raise ValueError("Items have incompatible timeseries data dimensions.")
E ValueError: Items have incompatible timeseries data dimensions.
..\..\..\AppData\Local\Programs\Python\Python310\lib\site-packages\pyam\core.py:2931: ValueError
======================================================== warnings summary =========================================================
..\..\..\AppData\Roaming\Python\Python310\site-packages\jupyter_client\connect.py:20
C:\Users\kikstra\AppData\Roaming\Python\Python310\site-packages\jupyter_client\connect.py:20: DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
given by the platformdirs library. To remove this warning and
see the appropriate new directories, set the environment variable
`JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
The use of platformdirs will be the default in `jupyter_core` v6
from jupyter_core.paths import jupyter_data_dir, jupyter_runtime_dir, secure_write
..\..\..\AppData\Local\Programs\Python\Python310\lib\site-packages\openscm_runner\adapters\ciceroscm_adapter\ciceroscm_wrapper.py:10
C:\Users\kikstra\AppData\Local\Programs\Python\Python310\lib\site-packages\openscm_runner\adapters\ciceroscm_adapter\ciceroscm_wrapper.py:10: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils import dir_util
tests/integration/test_units.py::test_reclassify_co2_ar6
tests/integration/test_units.py::test_reclassify_co2_ar6_sum
c:\users\kikstra\documents\github\climate-assessment\src\climate_assessment\checks.py:836: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_change_pd = df_change_pd.sum()
tests/integration/test_units.py::test_reclassify_co2_ar6
tests/integration/test_units.py::test_reclassify_co2_ar6_sum
C:\Users\kikstra\AppData\Local\Programs\Python\Python310\lib\site-packages\pyam\core.py:2958: FutureWarning: Behavior when concatenating bool-dtype and numeric-dtype arrays is deprecated; in a future version these will cast to object dtype (instead of coercing bools to numeric values). To retain the old behavior, explicitly cast bool-dtype arrays to numeric dtype.
pd.concat(ret_data, verify_integrity=False),
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
===================================================== short test summary info =====================================================
FAILED tests/integration/test_units.py::test_reclassify_co2_ar6 - ValueError: Items have incompatible timeseries data dimensions.
FAILED tests/integration/test_units.py::test_reclassify_co2_ar6_sum - ValueError: Items have incompatible timeseries data dimensions.
============================================ 2 failed, 21 passed, 6 warnings in 6.20s =============================================
(ca-deps-311) C:\Users\kikstra\Documents\GitHub\climate-assessment>
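The FutureWarning about numeric_only in the log above could probably be avoided by summing only the value column, which also keeps string and boolean columns (such as exclude) out of the aggregated frame. The sketch below uses made-up numbers and plain pandas; it is one possible approach, not necessarily the fix adopted in the repository.

```python
import pandas as pd

varsum = [
    "Emissions|CO2|Waste",
    "Emissions|CO2|Other",
    "Emissions|CO2|Energy and Industrial Processes|Incomplete",
]

# Illustrative data only: one model/scenario, one year, plus a boolean meta column.
data = pd.DataFrame(
    {
        "model": ["model12"] * 3,
        "scenario": ["1point5"] * 3,
        "region": ["World"] * 3,
        "variable": varsum,
        "unit": ["Mt CO2/yr"] * 3,
        "year": [2010] * 3,
        "value": [1.0, 2.0, 30.0],
        "exclude": [False] * 3,
    }
)

# Select `value` explicitly so that string and boolean columns are never summed.
summed = (
    data[data["variable"].isin(varsum)]
    .groupby(["model", "scenario", "year"], as_index=False)["value"]
    .sum()
)

# Re-attach the fixed descriptors, as the original function does.
summed["variable"] = "Emissions|CO2|Energy and Industrial Processes"
summed["unit"] = "Mt CO2/yr"
summed["region"] = "World"
```

The resulting frame contains only the group keys, the summed value, and the three descriptor columns, so nothing beyond the standard long-format columns is passed on.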
Closing in favour of #50
Continuation of #46 (don't merge both)
CHANGELOG.rst added (single line such as: (`#XX <https://github.com/iiasa/climate-assessment/pull/XX>`_) Added feature which does something)