Closed danielhuppmann closed 1 year ago
Merging #580 (179e813) into main (151e330) will increase coverage by 0.0%. The diff coverage is 94.4%.
@@          Coverage Diff          @@
##            main    #580   +/-   ##
=====================================
  Coverage   93.7%   93.7%
=====================================
  Files         50      50
  Lines       5339    5348    +9
=====================================
+ Hits        5004    5013    +9
  Misses       335     335
Impacted Files | Coverage Δ | |
---|---|---|
pyam/plotting.py | 92.9% <50.0%> (+<0.1%) | :arrow_up: |
pyam/utils.py | 91.8% <95.4%> (+<0.1%) | :arrow_up: |
pyam/core.py | 94.3% <100.0%> (ø) | |
pyam/index.py | 98.0% <100.0%> (ø) | |
pyam/logging.py | 64.8% <100.0%> (+5.4%) | :arrow_up: |
pyam/time.py | 96.0% <100.0%> (ø) | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Summarizing bilateral discussions with (and manual benchmarking done by) @phackstock: this PR again shows some improvements in memory usage. One interesting observation is that the initial commit here, https://github.com/danielhuppmann/pyam/commit/d652297fc39c9176be661bb508d97d42d02b795b, which uses `df.set_index(.., append=True)`, performs much worse than either the previous implementation or the "manual" adding-to-index using the `pyam.index` module...
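For readers unfamiliar with the two approaches compared above, here is a minimal sketch in plain pandas. It is not the code from `pyam.index`; the helper `append_level` and the toy data are hypothetical and only illustrate the difference between letting pandas move a column into the index and assembling the new `MultiIndex` by hand from existing levels and codes.

```python
import pandas as pd

# Toy data with a two-level index (level names are illustrative only)
index = pd.MultiIndex.from_tuples(
    [("model_a", "scen_1"), ("model_a", "scen_2")],
    names=["model", "scenario"],
)
df = pd.DataFrame({"region": ["World", "World"], "value": [1.0, 2.0]}, index=index)

# Option 1: let pandas move the column into the index
via_set_index = df.set_index("region", append=True)


# Option 2: build the new MultiIndex manually, reusing the existing
# levels and codes and adding one factorized level for the new values
def append_level(idx, values, name):
    """Hypothetical helper: return `idx` with one extra level appended."""
    codes, uniques = pd.factorize(values)
    return pd.MultiIndex(
        levels=list(idx.levels) + [uniques],
        codes=list(idx.codes) + [codes],
        names=list(idx.names) + [name],
    )


manual = df.drop(columns="region")
manual.index = append_level(df.index, df["region"], "region")
```

Both options should yield the same index entries; which one is cheaper in time and memory depends on the data and the pandas version, so it is worth measuring rather than assuming.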
Running benchmarking with pytest-monitor and memory-profiler on the IAMC 1.5°C scenario ensemble data for all regions (~80MB, xlsx) shows that this PR increases time use by ~20%, but reduces memory use by 30%... Not quite sure if that is a worthwhile trade-off, or if it can be improved...
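The benchmarks above were run with pytest-monitor and memory-profiler. As a rough, dependency-free way to look at the same time-versus-memory trade-off, the sketch below uses the standard-library `tracemalloc` and `time.perf_counter` instead. The `measure` helper and the two toy functions are hypothetical stand-ins, not part of pyam or of the benchmarking setup described here.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Return (result, elapsed seconds, peak traced bytes) for one call."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def build_list(n):
    # Materializes all values at once: higher peak memory
    return [i * 2 for i in range(n)]


def build_sum(n):
    # Streams values through a generator: lower peak memory
    return sum(i * 2 for i in range(n))


_, t_list, mem_list = measure(build_list, 100_000)
_, t_sum, mem_sum = measure(build_sum, 100_000)
print(f"list: {t_list:.4f}s, peak {mem_list} bytes")
print(f"sum:  {t_sum:.4f}s, peak {mem_sum} bytes")
```

Note that `tracemalloc` only sees allocations made through Python's memory allocators and adds overhead of its own, so the timings are indicative at best; dedicated tools such as those named above are better suited for real benchmarks.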
Regarding your question, @danielhuppmann: my vote is in favor of saving memory, even if the price for that is a longer execution time. My reasoning is that a longer execution time only means more waiting for the user, while memory savings can decide whether a user is able to open a data set at all. Ideally, if you're working in a Jupyter notebook, you read the data only once and keep it in memory anyway, so I don't think an increase in execution time is that big of a deal.
closing in favor of #729 and #730
Description of PR
This PR is a follow-up to #579, implementing further performance improvements.