iiasa / climate-assessment

https://climate-assessment.readthedocs.io/en/latest
MIT License

Update dependencies #47

Closed · znicholls closed this 10 months ago

znicholls commented 1 year ago

Continuation of #46 (don't merge both)

znicholls commented 1 year ago

@phackstock I got some of the way, but then crashed into https://github.com/IAMconsortium/pyam/issues/793, which I think is why the remaining tests are failing (it causes a regression in the behaviour of reclassify_waste_and_other_co2_ar6).

phackstock commented 1 year ago

@jkikstra, if you could take this one that would be great. I'll need to update our Scenario Explorer processing container soon, and that will require pandas 2.0, which currently breaks these tests.
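
(For context, a minimal sketch of the kind of pandas 2.0 change involved here, assuming the numeric_only FutureWarning that shows up in the test log further down is the relevant one: the default of numeric_only in DataFrameGroupBy.sum changes, so string columns are no longer silently dropped. The data values below are made up.)

import pandas as pd

# Sketch only: illustrates the numeric_only change in GroupBy.sum.
df = pd.DataFrame(
    {
        "model": ["m1", "m1"],
        "variable": ["Emissions|CO2|Waste", "Emissions|CO2|Other"],
        "unit": ["Mt CO2/yr", "Mt CO2/yr"],
        "value": [1.0, 2.0],
    }
)

# pandas < 2.0: non-numeric columns are dropped from the result (with a FutureWarning).
# pandas >= 2.0: numeric_only defaults to False, so string columns are summed
# by concatenation instead of being dropped.
print(df.groupby("model").sum())

# Being explicit gives the same result on both versions.
print(df.groupby("model").sum(numeric_only=True))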

jkikstra commented 1 year ago

Thanks for the initial work and figuring out the issue! Will try to tackle this at the end of this week.

phackstock commented 1 year ago

Perfect, thanks a lot @jkikstra!

jkikstra commented 1 year ago

Started working on it now, but stopping for today.

Starting a list here of the changes I think we need to make:

@danielhuppmann I'm getting this:

import pyam


def reclassify_waste_and_other_co2_ar6(df):
    """
    Reclassify waste and other CO2 into the energy and industrial processes category

    Reclassify CO2 emissions reported under Emissions|CO2|Other and
    Emissions|CO2|Waste, instead putting them under
    Emissions|CO2|Energy and Industrial Processes.

    Parameters
    ----------
    df : :class:`pyam.IamDataFrame`
        The original set of reported emissions

    Returns
    -------
    :class:`pyam.IamDataFrame`
        Reclassified set of emissions.
    """
    # filter out the scenarios that do not need changes
    df_nochange = df.copy()
    df_nochange.require_data(
        variable=["Emissions|CO2|Other", "Emissions|CO2|Waste"], exclude_on_fail=True
    )
    df_nochange.filter(exclude=True, inplace=True)
    df_nochange.reset_exclude()

    # select the scenarios that do need changes
    df_change = df.copy()
    df_change.require_data(
        variable=["Emissions|CO2|Other", "Emissions|CO2|Waste"], exclude_on_fail=True
    )
    df_change.filter(exclude=False, inplace=True)
    df_change.reset_exclude()
    if df_change.empty:
        return df_nochange

    # rename old CO2|Energy and Industrial Processes, to be replaced
    df_change.rename(
        variable={
            "Emissions|CO2|Energy and Industrial Processes": "Emissions|CO2|Energy and Industrial Processes|Incomplete"
        },
        inplace=True,
    )

    # use pandas to create new CO2|Energy and Industrial Processes by adding CO2|Other and CO2|Waste
    df_change_pd = df_change.as_pandas()
    varsum = [
        "Emissions|CO2|Waste",
        "Emissions|CO2|Other",
        "Emissions|CO2|Energy and Industrial Processes|Incomplete",
    ]
    df_change_notaffected_pd = df_change_pd[~df_change_pd.variable.isin(varsum)]
    df_change_notaffected_pyam = pyam.IamDataFrame(df_change_notaffected_pd)
    df_change_pd = df_change_pd[df_change_pd.variable.isin(varsum)]
    df_change_pd = df_change_pd.groupby(
        by=["model", "scenario", "year"], as_index=False
    )
    df_change_pd = df_change_pd.sum()
    df_change_pd["variable"] = "Emissions|CO2|Energy and Industrial Processes"
    df_change_pd["unit"] = "Mt CO2/yr"
    df_change_pd["region"] = "World"
    df_change_pyam = pyam.IamDataFrame(df_change_pd)
    df_change = pyam.concat([df_change_pyam, df_change_notaffected_pyam])

    # recombine dataframes
    df_new = pyam.concat([df_change, df_nochange], ignore_meta=True)

    return df_new
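
(For what it's worth, here is a sketch of one possible adjustment to the pandas round-trip above. This is an untested assumption on my side, based on extra_cols = ['exclude'] showing up in the traceback below: drop the exclude column before rebuilding the IamDataFrame, and make the group-by sum explicit about numeric columns.)

# Untested sketch: keep meta columns out of the reconstructed timeseries data
# and pin down the aggregation behaviour across pandas versions.
df_change_pd = df_change.as_pandas()

# If the 'exclude' column is carried along here, pyam appears to treat it as an
# extra timeseries dimension, and pyam.concat later refuses to combine the two
# halves ("incompatible timeseries data dimensions").
df_change_pd = df_change_pd.drop(columns=["exclude"], errors="ignore")

varsum = [
    "Emissions|CO2|Waste",
    "Emissions|CO2|Other",
    "Emissions|CO2|Energy and Industrial Processes|Incomplete",
]
df_change_notaffected_pyam = pyam.IamDataFrame(
    df_change_pd[~df_change_pd.variable.isin(varsum)]
)

df_change_pd = (
    df_change_pd[df_change_pd.variable.isin(varsum)]
    .groupby(by=["model", "scenario", "year"], as_index=False)
    .sum(numeric_only=True)  # explicit, so pandas 1.x and 2.x behave the same
)
df_change_pd["variable"] = "Emissions|CO2|Energy and Industrial Processes"
df_change_pd["unit"] = "Mt CO2/yr"
df_change_pd["region"] = "World"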

test results:

(ca-deps-311) C:\Users\kikstra\Documents\GitHub\climate-assessment>pytest tests/integration/test_units.py
======================================================= test session starts =======================================================
platform win32 -- Python 3.10.2, pytest-7.2.1, pluggy-1.0.0
rootdir: C:\Users\kikstra\Documents\GitHub\climate-assessment, configfile: setup.cfg
plugins: anyio-3.7.1
collected 23 items

tests\integration\test_units.py .....................FF                                                                      [100%]

============================================================ FAILURES =============================================================
_____________________________________________________ test_reclassify_co2_ar6 _____________________________________________________

    def test_reclassify_co2_ar6():
        input_emissions_file = os.path.join(TEST_DATA_DIR, "ex2.csv")
        processed_input_emissions_file = os.path.join(
            TEST_DATA_DIR, "ex2_adjusted-waste-other.csv"
        )
        # import pdb
        # pdb.set_trace()
        # pyam.compare(reclassify_waste_and_other_co2_ar6(pyam.IamDataFrame(input_emissions_file)), pyam.IamDataFrame(processed_input_emissions_file), )
>       assert reclassify_waste_and_other_co2_ar6(
            pyam.IamDataFrame(input_emissions_file)
        ).equals(pyam.IamDataFrame(processed_input_emissions_file))

tests\integration\test_units.py:194:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src\climate_assessment\checks.py:844: in reclassify_waste_and_other_co2_ar6
    df_new = pyam.concat([df_change, df_nochange], ignore_meta=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

objs = [<class 'pyam.core.IamDataFrame'>
Index:
 * model    : model12 (1)
 * scenario : 1point5 (1)
Timeseries data coordinat... year     : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
   exclude (bool) False (1)]
ignore_meta_conflict = False, kwargs = {'ignore_meta': True}
as_iamdataframe = <function concat.<locals>.as_iamdataframe at 0x000001DB7FD2F1C0>
df = <class 'pyam.core.IamDataFrame'>
Index:
 * model    : model1, model10, model11, model13, model14, model2, ... model9 (...  year     : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
   exclude (bool) False (1)
_merge_meta = True, index_names = FrozenList(['model', 'scenario']), extra_cols = ['exclude'], time_col = 'year'
consistent_time_domain = True

    def concat(objs, ignore_meta_conflict=False, **kwargs):
        """Concatenate a series of IamDataFrame-like objects

        Parameters
        ----------
        objs : iterable of IamDataFrames
            A list of objects castable to :class:`IamDataFrame`
        ignore_meta_conflict : bool, optional
            If False, raise an error if any meta columns present in `dfs` are not identical.
            If True, values in earlier elements of `dfs` take precedence.
        kwargs
            Passed to :class:`IamDataFrame(other, **kwargs) <IamDataFrame>`
            for any item of `dfs` which isn't already an IamDataFrame.

        Returns
        -------
        IamDataFrame

        Raises
        ------
        TypeError
            If `dfs` is not a list.
        ValueError
            If time domain or other timeseries data index dimension don't match.

        Notes
        -----
        The *meta* attributes are merged only for those objects of *objs* that are passed
        as :class:`IamDataFrame` instances.

        The :attr:`dimensions` and :attr:`index` names of all elements of *dfs* must be
        identical. The returned IamDataFrame inherits the dimensions and index names.
        """
        if not islistable(objs) or isinstance(objs, pd.DataFrame):
            raise TypeError(f"'{objs.__class__.__name__}' object is not iterable")

        objs = list(objs)
        if len(objs) < 1:
            raise ValueError("No objects to concatenate")

        def as_iamdataframe(df):
            if isinstance(df, IamDataFrame):
                return df, True
            else:
                return IamDataFrame(df, **kwargs), False

        # cast first item to IamDataFrame (if necessary)
        df, _merge_meta = as_iamdataframe(objs[0])
        index_names, extra_cols, time_col = df.index.names, df.extra_cols, df.time_col

        consistent_time_domain = True
        iam_dfs = [(df, _merge_meta)]

        # cast all items to IamDataFrame (if necessary) and check consistency of items
        for df in objs[1:]:
            df, _merge_meta = as_iamdataframe(df)
            if df.index.names != index_names:
                raise ValueError("Items have incompatible index dimensions.")
            if df.extra_cols != extra_cols:
>               raise ValueError("Items have incompatible timeseries data dimensions.")
E               ValueError: Items have incompatible timeseries data dimensions.

..\..\..\AppData\Local\Programs\Python\Python310\lib\site-packages\pyam\core.py:2931: ValueError
___________________________________________________ test_reclassify_co2_ar6_sum ___________________________________________________

    def test_reclassify_co2_ar6_sum():
        input_emissions_file = pyam.IamDataFrame(os.path.join(TEST_DATA_DIR, "ex2.csv"))

>       input_emissions_file_processed = reclassify_waste_and_other_co2_ar6(
            input_emissions_file
        )

tests\integration\test_units.py:202:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src\climate_assessment\checks.py:844: in reclassify_waste_and_other_co2_ar6
    df_new = pyam.concat([df_change, df_nochange], ignore_meta=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

objs = [<class 'pyam.core.IamDataFrame'>
Index:
 * model    : model12 (1)
 * scenario : 1point5 (1)
Timeseries data coordinat... year     : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
   exclude (bool) False (1)]
ignore_meta_conflict = False, kwargs = {'ignore_meta': True}
as_iamdataframe = <function concat.<locals>.as_iamdataframe at 0x000001DB018A48B0>
df = <class 'pyam.core.IamDataFrame'>
Index:
 * model    : model1, model10, model11, model13, model14, model2, ... model9 (...  year     : 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, ... 2100 (19)
Meta indicators:
   exclude (bool) False (1)
_merge_meta = True, index_names = FrozenList(['model', 'scenario']), extra_cols = ['exclude'], time_col = 'year'
consistent_time_domain = True

    def concat(objs, ignore_meta_conflict=False, **kwargs):
        """Concatenate a series of IamDataFrame-like objects

        Parameters
        ----------
        objs : iterable of IamDataFrames
            A list of objects castable to :class:`IamDataFrame`
        ignore_meta_conflict : bool, optional
            If False, raise an error if any meta columns present in `dfs` are not identical.
            If True, values in earlier elements of `dfs` take precedence.
        kwargs
            Passed to :class:`IamDataFrame(other, **kwargs) <IamDataFrame>`
            for any item of `dfs` which isn't already an IamDataFrame.

        Returns
        -------
        IamDataFrame

        Raises
        ------
        TypeError
            If `dfs` is not a list.
        ValueError
            If time domain or other timeseries data index dimension don't match.

        Notes
        -----
        The *meta* attributes are merged only for those objects of *objs* that are passed
        as :class:`IamDataFrame` instances.

        The :attr:`dimensions` and :attr:`index` names of all elements of *dfs* must be
        identical. The returned IamDataFrame inherits the dimensions and index names.
        """
        if not islistable(objs) or isinstance(objs, pd.DataFrame):
            raise TypeError(f"'{objs.__class__.__name__}' object is not iterable")

        objs = list(objs)
        if len(objs) < 1:
            raise ValueError("No objects to concatenate")

        def as_iamdataframe(df):
            if isinstance(df, IamDataFrame):
                return df, True
            else:
                return IamDataFrame(df, **kwargs), False

        # cast first item to IamDataFrame (if necessary)
        df, _merge_meta = as_iamdataframe(objs[0])
        index_names, extra_cols, time_col = df.index.names, df.extra_cols, df.time_col

        consistent_time_domain = True
        iam_dfs = [(df, _merge_meta)]

        # cast all items to IamDataFrame (if necessary) and check consistency of items
        for df in objs[1:]:
            df, _merge_meta = as_iamdataframe(df)
            if df.index.names != index_names:
                raise ValueError("Items have incompatible index dimensions.")
            if df.extra_cols != extra_cols:
>               raise ValueError("Items have incompatible timeseries data dimensions.")
E               ValueError: Items have incompatible timeseries data dimensions.

..\..\..\AppData\Local\Programs\Python\Python310\lib\site-packages\pyam\core.py:2931: ValueError
======================================================== warnings summary =========================================================
..\..\..\AppData\Roaming\Python\Python310\site-packages\jupyter_client\connect.py:20
  C:\Users\kikstra\AppData\Roaming\Python\Python310\site-packages\jupyter_client\connect.py:20: DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
  given by the platformdirs library.  To remove this warning and
  see the appropriate new directories, set the environment variable
  `JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
  The use of platformdirs will be the default in `jupyter_core` v6
    from jupyter_core.paths import jupyter_data_dir, jupyter_runtime_dir, secure_write

..\..\..\AppData\Local\Programs\Python\Python310\lib\site-packages\openscm_runner\adapters\ciceroscm_adapter\ciceroscm_wrapper.py:10
  C:\Users\kikstra\AppData\Local\Programs\Python\Python310\lib\site-packages\openscm_runner\adapters\ciceroscm_adapter\ciceroscm_wrapper.py:10: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
    from distutils import dir_util

tests/integration/test_units.py::test_reclassify_co2_ar6
tests/integration/test_units.py::test_reclassify_co2_ar6_sum
  c:\users\kikstra\documents\github\climate-assessment\src\climate_assessment\checks.py:836: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
    df_change_pd = df_change_pd.sum()

tests/integration/test_units.py::test_reclassify_co2_ar6
tests/integration/test_units.py::test_reclassify_co2_ar6_sum
  C:\Users\kikstra\AppData\Local\Programs\Python\Python310\lib\site-packages\pyam\core.py:2958: FutureWarning: Behavior when concatenating bool-dtype and numeric-dtype arrays is deprecated; in a future version these will cast to object dtype (instead of coercing bools to numeric values). To retain the old behavior, explicitly cast bool-dtype arrays to numeric dtype.
    pd.concat(ret_data, verify_integrity=False),

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
===================================================== short test summary info =====================================================
FAILED tests/integration/test_units.py::test_reclassify_co2_ar6 - ValueError: Items have incompatible timeseries data dimensions.
FAILED tests/integration/test_units.py::test_reclassify_co2_ar6_sum - ValueError: Items have incompatible timeseries data dimensions.
============================================ 2 failed, 21 passed, 6 warnings in 6.20s =============================================

(ca-deps-311) C:\Users\kikstra\Documents\GitHub\climate-assessment>
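
(Side note: once the concat error is out of the way, the commented-out pyam.compare line in the test above is probably the quickest way to see any remaining numerical differences. A rough sketch; the TEST_DATA_DIR value below is a placeholder, not the real path from the test module.)

import os

import pyam

from climate_assessment.checks import reclassify_waste_and_other_co2_ar6

# Placeholder; in the tests this comes from the test module's TEST_DATA_DIR.
TEST_DATA_DIR = "path/to/test-data"

observed = reclassify_waste_and_other_co2_ar6(
    pyam.IamDataFrame(os.path.join(TEST_DATA_DIR, "ex2.csv"))
)
expected = pyam.IamDataFrame(
    os.path.join(TEST_DATA_DIR, "ex2_adjusted-waste-other.csv")
)

# pyam.compare returns a dataframe of the values that differ between the two objects.
print(pyam.compare(observed, expected))
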
znicholls commented 10 months ago

Closing in favour of #50