IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

Improve performance of IamDataFrame initialization #579

Closed danielhuppmann closed 3 years ago

danielhuppmann commented 3 years ago

Description of PR

This PR implements faster and less memory-intensive initialization of an IamDataFrame by doing the sorting and validation on the indexed pd.Series instead of the DataFrame. Based on memory-profiling work by @phackstock.
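
For readers unfamiliar with the idea, here is a minimal sketch of what "sorting and validation on the indexed pd.Series" could look like. It is not the code changed in this PR: the column list `IDX_COLS`, the helper name `to_sorted_series`, and the specific checks (duplicate rows, missing values) are assumptions made for illustration only.

```python
import pandas as pd

# Assumed long-format index columns; the real pyam dimensions may differ.
IDX_COLS = ["model", "scenario", "region", "variable", "unit", "year"]


def to_sorted_series(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: index first, then validate and sort the Series."""
    # Move the dimension columns into a MultiIndex and keep only the values,
    # so later steps work on one Series instead of a wide DataFrame.
    series = df.set_index(IDX_COLS)["value"]

    # Validate on the index instead of comparing DataFrame columns.
    if series.index.duplicated().any():
        raise ValueError("Duplicate rows in data")

    # Drop missing values and sort by the index.
    return series.dropna().sort_index()


data = pd.DataFrame(
    [
        ["model_a", "scen_a", "World", "Primary Energy", "EJ/yr", 2010, 2.0],
        ["model_a", "scen_a", "World", "Primary Energy", "EJ/yr", 2005, 1.0],
    ],
    columns=IDX_COLS + ["value"],
)
series = to_sorted_series(data)
```

In this sketch, the duplicate check and the sort both run on a single `pd.Series` with a `MultiIndex`. This is the general pattern the description refers to. The actual memory savings were measured on pyam itself and are not reproduced here.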

codecov[bot] commented 3 years ago

Codecov Report

Merging #579 (7e6b8df) into main (5f10925) will increase coverage by 0.0%. The diff coverage is 100.0%.

@@          Coverage Diff          @@
##            main    #579   +/-   ##
=====================================
  Coverage   93.7%   93.7%           
=====================================
  Files         50      50           
  Lines       5336    5339    +3     
=====================================
+ Hits        5001    5004    +3     
  Misses       335     335           
Impacted Files        Coverage Δ
pyam/utils.py         91.7% <100.0%> (+<0.1%)
tests/test_core.py    100.0% <100.0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 5f10925...7e6b8df.

danielhuppmann commented 3 years ago

Thanks to @sandrinecharousset, who reported an "out-of-memory" issue when loading large files as an IamDataFrame, which motivated @phackstock and me to dig a bit into the performance...

danielhuppmann commented 3 years ago

@gidden @znicholls @Rlamboll @coroa, any thoughts? If there are no objections, I'll merge next week.

Rlamboll commented 3 years ago

Looks good to me and it's always good to speed things up a little!

znicholls commented 3 years ago

Right, I've now read the whole thing and so have answered my own question. All good.

danielhuppmann commented 3 years ago

Thanks @phackstock for the effort, thanks @znicholls & @Rlamboll for taking a look at the implementation!