IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios

https://pyam-iamc.readthedocs.io/

Apache License 2.0


Improve performance of IamDataFrame initialization #579

Closed danielhuppmann closed 3 years ago

danielhuppmann commented 3 years ago

Please confirm that this PR has done the following:

~Tests Added~ -> no functional changes, existing tests should cover all cases where something could go wrong
[x] Name of contributors Added to AUTHORS.rst
[x] Description in RELEASE_NOTES.md Added

Description of PR

This PR implements faster and less memory-intensive initialization of an IamDataFrame by sorting and validating the indexed pd.Series instead of the DataFrame. It is based on memory-profiling work by @phackstock.
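For readers unfamiliar with the approach, here is a minimal sketch of the general idea, not the actual pyam code: index the long-format data first, then run the duplicate check and the sort on the resulting pd.Series. The column list `INDEX_COLS` and the helper `to_validated_series` are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical index columns for illustration; pyam's internal list may differ.
INDEX_COLS = ["model", "scenario", "region", "variable", "unit", "year"]


def to_validated_series(df: pd.DataFrame) -> pd.Series:
    """Index the long-format data first, then validate and sort the Series."""
    series = df.set_index(INDEX_COLS)["value"]

    # Check for duplicates on the index only, rather than calling
    # DataFrame.duplicated() across all columns.
    duplicated = series.index.duplicated()
    if duplicated.any():
        raise ValueError(f"Duplicate rows: {list(series.index[duplicated])}")

    # Drop missing values and sort by the index levels.
    return series.dropna().sort_index()


data = pd.DataFrame(
    [
        ["model_a", "scen_b", "World", "Primary Energy", "EJ/yr", 2010, 6.0],
        ["model_a", "scen_a", "World", "Primary Energy", "EJ/yr", 2010, 1.0],
        ["model_a", "scen_a", "World", "Primary Energy", "EJ/yr", 2005, 0.5],
    ],
    columns=INDEX_COLS + ["value"],
)

series = to_validated_series(data)
```

Whether this is faster or lighter on memory than the DataFrame-based route depends on the data and the pandas version, so it is worth profiling on a realistic file before drawing conclusions.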

codecov[bot] commented 3 years ago

Codecov Report

Merging #579 (7e6b8df) into main (5f10925) will increase coverage by 0.0%. The diff coverage is 100.0%.

@@          Coverage Diff          @@
##            main    #579   +/-   ##
=====================================
  Coverage   93.7%   93.7%           
=====================================
  Files         50      50           
  Lines       5336    5339    +3     
=====================================
+ Hits        5001    5004    +3     
  Misses       335     335

| Impacted Files | Coverage Δ | |
|---|---|---|
| pyam/utils.py | `91.7% <100.0%> (+<0.1%)` | :arrow_up: |
| tests/test_core.py | `100.0% <100.0%> (ø)` | |


danielhuppmann commented 3 years ago

Thanks to @sandrinecharousset who reported an "out-of-memory" issue when loading large files as IamDataFrame - which motivated @phackstock and me to dig a bit into the performance...

danielhuppmann commented 3 years ago

@gidden @znicholls @Rlamboll @coroa, any thoughts? If there are no objections, I'll merge next week.

Rlamboll commented 3 years ago

Looks good to me and it's always good to speed things up a little!

znicholls commented 3 years ago

Right I've now read the whole thing so have answered my own question, all good

danielhuppmann commented 3 years ago

Thanks @phackstock for the effort, thanks @znicholls & @Rlamboll for taking a look at the implementation!