IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0
234 stars 120 forks source link

Overly ambitious `groupby().sum()´ in `rename()` #182

Closed danielhuppmann closed 5 years ago

danielhuppmann commented 5 years ago

Description of the problem

We identified a silly mistake in the IAMC 1.5°C Scenario Data due to the over-ambitious use of groupby().sum() in the rename() function implementation. See https://github.com/iiasa/ipcc_sr15_scenario_analysis/issues/2 for details.

There were a number of variables with a typo, and when using rename() to correct that, I failed to check whether any variable without the typo also existed. This was the case for two variables, so the values are now double compared to what they should be.

Proposed solution

I would like to add a kwarg check_duplicate (default True) to the function, which raises an error message if any rename mapping leads to a conflict with the existing model-scenario-region-variable-unit-year index in the data. A user would need to consciously override the check to merge renamed and existing values as part of the rename. Idea: fail with a bang rather than return unexpected results.

The merging of data renamed to the same variable name should obviously always work.

@gidden, @OFR-IIASA, any comments?

gidden commented 5 years ago

Sounds reasonable to me. Probably easier to look at the implementation, as I don't 100% grok the issue.