iiasa / climate-assessment

MIT License

Unnecessary error when input data has no non-co2 variables #19

Open jkikstra opened 1 year ago

jkikstra commented 1 year ago

Desired behaviour: The code should allow a scenario to be run with only Emissions|CO2|Energy and Industrial Processes, or even with only Emissions|CO2.

Issue: When running from an emissions file in which every scenario has only these variables (only "Emissions|CO2*"), we get the following error. I produced it with run-example-fair.ipynb, changing the input data to EMISSIONS_INPUT_FILE = "ar6_minimum_emissions.csv":


2022-12-12 11:55:49 climate_assessment.cli MainThread - INFO:  Outputs will be saved in: ..\data\output-fair-example-notebook
2022-12-12 11:55:49 climate_assessment.cli MainThread - INFO:  Outputs will be saved with the ID: ar6_minimum_emissions
2022-12-12 11:55:49 climate_assessment.cli MainThread - INFO:  Loading ..\tests\test-data\ar6_minimum_emissions.csv
2022-12-12 11:55:49 pyam.core MainThread - INFO:  Reading file ..\tests\test-data\ar6_minimum_emissions.csv
2022-12-12 11:55:49 climate_assessment.cli MainThread - INFO:  Converting to basic columns i.e. removing any extra columns
2022-12-12 11:55:49 climate_assessment.cli MainThread - INFO:  Performing input data checks
2022-12-12 11:55:49 climate_assessment.checks MainThread - INFO:  CHECK: if no non-co2 negatives are reported.
2022-12-12 11:55:49 pyam.core MainThread - WARNING:  Filtered IamDataFrame is empty!


c:\users\kikstra\documents\github\climate-assessment\src\climate_assessment\checks.py in perform_input_checks(df, output_csv_files, output_filename, lead_variable_check, historical_check, reporting_completeness_check, outdir)
    876     LOGGER.info("CHECK: if no non-co2 negatives are reported.")
--> 877     df = check_negatives(df, output_filename, outdir=outdir)
    879     LOGGER.info("CHECK: report emissions for all minimally required years.")

c:\users\kikstra\documents\github\climate-assessment\src\climate_assessment\checks.py in check_negatives(df, filename, negativethreshold, outdir, prefix)
    558     # set small non-negative non-CO2 values to zero
    559     df_co2 = df.filter(variable=f"{prefix}Emissions|CO2*").timeseries()
--> 560     df_nonco2 = df.filter(variable=f"{prefix}Emissions|CO2*", keep=False).timeseries()
    561     df_nonco2 = df_nonco2.where(
    562         (df_nonco2 > 0) | (df_nonco2 < negativethreshold) | df_nonco2.isnull(), other=0

~\.conda\envs\ca-testing\lib\site-packages\pyam\core.py in timeseries(self, iamc_index)
    782         """
    783         if self.empty:
--> 784             raise ValueError("This IamDataFrame is empty!")
    786         s = self._data

ValueError: This IamDataFrame is empty!

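Proposed minimum solution: Add if-statement(s) where necessary that check for an empty filter result before calling .timeseries() (the call that raises on an empty IamDataFrame), like: if not df.filter(variable=f"{prefix}Emissions|CO2*", keep=False).empty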
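To illustrate the guard idea, here is a minimal sketch in plain Python. It does not use pyam or the real check_negatives: the row layout (dicts with "variable" and "values"), the function name zero_small_nonco2_negatives, and the threshold default of -0.1 are all assumptions made for illustration. It only shows the shape of the fix: split CO2 from non-CO2, and skip the non-CO2 clean-up when that subset is empty.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the pyam-based logic: each row is a dict with a
# "variable" name and a "values" mapping of year -> emissions (None = missing).
CO2_PATTERN = "Emissions|CO2*"


def zero_small_nonco2_negatives(rows, negativethreshold=-0.1):
    """Set small negative non-CO2 values to zero; leave CO2 rows untouched.

    The threshold default is an assumption for illustration only.
    """
    co2 = [r for r in rows if fnmatchcase(r["variable"], CO2_PATTERN)]
    nonco2 = [r for r in rows if not fnmatchcase(r["variable"], CO2_PATTERN)]

    # The guard: with CO2-only input there is nothing to clean, so return
    # early instead of processing an empty subset.
    if not nonco2:
        return co2

    cleaned = []
    for row in nonco2:
        # Keep positive values, values below the threshold, and missing
        # values; everything else (small negatives and zeros) becomes 0.
        values = {
            year: v if (v is None or v > 0 or v < negativethreshold) else 0
            for year, v in row["values"].items()
        }
        cleaned.append({"variable": row["variable"], "values": values})
    return co2 + cleaned
```

In the real code the equivalent guard would presumably sit around the df_nonco2 lines shown in the traceback, so that .timeseries() is never called on an empty filter result.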

Proposed ideal solution: Add a test that takes a minimum emissions file like ar6_minimum_emissions.csv and checks either (i) that all checks pass, or (ii) that a complete infilled emissions set is provided based on this input.

znicholls commented 1 year ago

Nicely described @jkikstra. I think the fix proposed sounds good, but wanted to ask a clarifying question to help make communication easier moving forward.

Should we call the test file ar6_minimum_emissions.csv? I thought that for AR6 the minimum emissions also included CH4 and N2O, and that the checks would fail if CO2 was all that was there. Should we instead call the test file minimum_emissions.csv and, when we do the next release, bump the major version to make clear that scenarios which would have failed AR6 will now pass?