IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

Allow IamDataFrame instance with mixed 'year' and 'datetime' domain #597

Closed by danielhuppmann 2 years ago

danielhuppmann commented 2 years ago


Description of PR

This PR is a first step towards implementing option 3 described in #596: allowing an IamDataFrame to have a "time" domain that contains both "datetime" values and "year" values (as integers). The scope is deliberately limited to initializing an IamDataFrame instance with such a mixed time domain. Filtering and returning the data in wide timeseries format are not tackled here, and both fail for any instance that has a mixed index. This keeps the PR reviewable (if anyone is interested).

The PR targets a feature branch that will collect further work.
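To illustrate what a mixed time domain means in practice, here is a minimal sketch using only the Python standard library. It is not pyam's implementation: `cast_time_value` and `time_domain` are hypothetical helpers, and `datetime.datetime` stands in for `pd.Timestamp`. The assumption is that each time value is either year-like (cast to `int`) or an ISO-formatted date (cast to a datetime), and that anything else is rejected.

```python
from datetime import datetime


def cast_time_value(value):
    """Cast one time value to an int year or a datetime.

    Hypothetical helper for illustration only; not part of pyam.
    """
    # bool is a subclass of int, so reject it before the int check
    if isinstance(value, bool):
        raise ValueError(f"Invalid time value: {value!r}")
    if isinstance(value, (int, datetime)):
        return value
    # Accept floats such as 2010.0 as years
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, str):
        # Try a year first, then an ISO-formatted date or datetime
        for parser in (int, datetime.fromisoformat):
            try:
                return parser(value)
            except ValueError:
                continue
    raise ValueError(f"Invalid time value: {value!r}")


def time_domain(values):
    """Classify a collection of time values as 'year', 'datetime' or 'mixed'."""
    cast = [cast_time_value(v) for v in values]
    has_year = any(isinstance(v, int) for v in cast)
    has_datetime = any(isinstance(v, datetime) for v in cast)
    if has_year and has_datetime:
        return "mixed"
    return "datetime" if has_datetime else "year"
```

Under these assumptions, `time_domain([2005, "2010-07-21"])` classifies the values as `"mixed"`, which is the case this PR makes possible at initialization.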

codecov[bot] commented 2 years ago

Codecov Report

Merging #597 (852e374) into feature/year-and-datetime (9f84bcd) will decrease coverage by 0.0%. The diff coverage is 96.3%.


@@                     Coverage Diff                     @@
##           feature/year-and-datetime    #597     +/-   ##
===========================================================
- Coverage                       94.2%   94.1%   -0.1%     
===========================================================
  Files                             50      50             
  Lines                           5331    5358     +27     
===========================================================
+ Hits                            5022    5047     +25     
- Misses                           309     311      +2     
Impacted Files         Coverage Δ
pyam/utils.py          91.4% <92.8%> (-0.4%)
pyam/core.py           94.3% <100.0%> (ø)
pyam/index.py          98.2% <100.0%> (+0.1%)
tests/test_index.py    100.0% <100.0%> (ø)
tests/test_utils.py    100.0% <100.0%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 9f84bcd...852e374.

danielhuppmann commented 2 years ago

I compared the performance of this branch against the current main using two large year-based datasets: this branch is slightly faster (~10%) but uses more memory. The reason is that the consistency check and the casting to int/pd.Timestamp now operate on the index instead of the time column, which requires creating a full copy of the index rather than a copy of only the time column.
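The two places where the cast can happen can be sketched with plain pandas. This is an illustration under stated assumptions, not the code in this PR: the toy data, the `cast_year` helper and the variable names are all hypothetical, and the sketch does not measure runtime or memory.

```python
import pandas as pd


def cast_year(value):
    """Cast a year-like value such as "2005" to int (hypothetical helper)."""
    return int(value)


# Toy long-format data whose time values arrive as strings
index = pd.MultiIndex.from_tuples(
    [
        ("model_a", "scen_a", "2005"),
        ("model_a", "scen_a", "2010"),
        ("model_a", "scen_b", "2005"),
    ],
    names=["model", "scenario", "year"],
)
series = pd.Series([1.0, 2.0, 3.0], index=index)

# Option A: cast the time values as a column of a long-format frame
df = series.rename("value").reset_index()
df["year"] = df["year"].map(cast_year)

# Option B: cast only the unique labels of the time level;
# set_levels returns a new MultiIndex and leaves `index` unchanged
new_levels = index.levels[2].map(cast_year)
new_index = index.set_levels(new_levels, level="year")
series_b = series.copy()
series_b.index = new_index
```

In option B the cast function runs once per unique time label rather than once per row, which is one plausible source of a speed-up for large datasets with few distinct years. The price is a second index object next to the original one.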