Closed khaeru closed 2 years ago
Merging #441 (cdd8aa9) into main (0c1a143) will increase coverage by 0.0%. The diff coverage is 100.0%.
@@ Coverage Diff @@
## main #441 +/- ##
=====================================
Coverage 98.5% 98.5%
=====================================
Files 41 42 +1
Lines 4380 4382 +2
=====================================
+ Hits 4318 4320 +2
Misses 62 62
Impacted Files | Coverage Δ | |
---|---|---|
ixmp/backend/io.py | 98.5% <ø> (-0.1%) | :arrow_down: |
ixmp/backend/jdbc.py | 95.1% <100.0%> (-0.1%) | :arrow_down: |
ixmp/core/scenario.py | 97.9% <100.0%> (-0.1%) | :arrow_down: |
ixmp/tests/test_perf.py | 100.0% <100.0%> (ø) | |
I have been able to test the PR, but not via the command suggested above, `pytest -k "transport and build_bare"`. Instead, I ran the GLOBIOM emulator. On `main`, running the emulator script for the global model, in which most of the time is taken up by populating the table `land_output`, takes about 7 minutes. On this branch, the process time is almost identical.
The main change still to be made is to adjust the logging level; currently this seems to be set to "DEBUG".
> Instead I ran the GLOBIOM emulator. On `main` running the emulator script for the global model, of which most time is taken up by populating the table `land_output`, takes about 7 minutes. On this branch, the process time is almost identical.
Thanks for checking.
Some more information on investigating performance:
Above I mentioned using pytest-profiling. Specifically, I run one or a few tests with `pytest -k "this or that" --profile-svg` and then open the file `prof/combined.svg` that is created. An example is attached below. Here, along the central (green) chain is a box `scenario:429:add_par`. The info says it is run 28 times ("28×"), which together take up 48.32% of the total time of this pytest run.
Using this, my process is roughly:
If (4) is successful, the % of total time for the slow code will decrease, and the % for other (next-slowest, unaltered) code will correspondingly increase.
The example reflects a particular MESSAGEix-Transport test in which add_par() was the slowest piece of code. But we have a lot of code in message_data written using loops, etc.; for these, it may not be add_par() which is the slowest step, and thus improving performance of add_par() will not have a large or even noticeable impact on total runtime.
It is necessary to do these kinds of investigations (again, I am not an expert) to identify whether ixmp or downstream code is responsible for poor performance. Also, when ixmp is responsible, sometimes this is because of the Java code in JDBCBackend or JPype used to interact with it, and so there we cannot make any improvements either. In short, the scope for these optimizations is quite narrow.
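The "share of total time" idea above can also be illustrated with only the standard library. The following is a minimal sketch, not the pytest-profiling workflow itself: `slow_step()`, `fast_step()`, `workload()` and `share_of_total()` are hypothetical names invented for this example, with `slow_step()` standing in for a dominant call such as `add_par()`.

```python
import cProfile
import pstats


def slow_step(n):
    """Hypothetical stand-in for a slow call such as add_par()."""
    return sum(i * i for i in range(n))


def fast_step(n):
    """Hypothetical stand-in for the remaining, cheaper code."""
    return n + 1


def workload():
    # Call the slow step repeatedly, as a test might call add_par() many times.
    for _ in range(28):
        slow_step(20_000)
        fast_step(1)


def share_of_total(stats, func_name):
    """Fraction of total profiled time spent in (and below) func_name."""
    total = stats.total_tt
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    cumulative = sum(
        entry[3] for key, entry in stats.stats.items() if key[2] == func_name
    )
    return cumulative / total if total else 0.0


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

stats = pstats.Stats(profiler)
slow_share = share_of_total(stats, "slow_step")
fast_share = share_of_total(stats, "fast_step")
print(f"slow_step: {slow_share:.1%} of total, fast_step: {fast_share:.1%} of total")
```

If `slow_step()` were optimized, its share would shrink and the shares of the unaltered functions would grow correspondingly, which is the signal described above. When the dominant share belongs to downstream code rather than to ixmp, changes in ixmp will not noticeably reduce total runtime.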
Thanks for checking, Oliver! I requested you as a reviewer and removed myself.
`pytest -k "transport and build_bare"` on `main` took 8.72 s, while on this branch it took 7.87 s.
The only change before merging is to set the logging level back, so that DEBUG messages are not shown.
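For reference, the relative change implied by these two timings can be worked out as below. This is a minimal sketch: the variable names are illustrative, and the inputs are the single-run wall-clock values reported in this comment, so run-to-run noise is not accounted for.

```python
# Wall-clock times reported for: pytest -k "transport and build_bare"
time_main = 8.72    # seconds, on main
time_branch = 7.87  # seconds, on this branch

# Fraction of the original runtime that was saved
reduction = (time_main - time_branch) / time_main
# Ratio of old to new runtime
speedup = time_main / time_branch

print(f"runtime reduction: {reduction:.1%}")
print(f"speed-up factor: {speedup:.2f}x")
```

This works out to a reduction of a little under 10% for this test, which is smaller than the roughly 50% reported for cases where `add_par()` dominates the runtime.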
@OFR-IIASA with https://github.com/iiasa/ixmp/pull/441#discussion_r840491256 addressed, this should be ready for ✅.
This applies some simple optimizations in `Scenario.add_par()` and underlying backend code by using pandas' (fast) internals rather than (slower) utilities/code in `ixmp.utils`.
Anecdotally (using pytest-profiling) there is about a 50% speed-up in `message_data` test cases where the runtime is dominated by `add_par()`.
The branch also adds `.tests.test_perf.test_add_par()` using pytest-benchmark; these are invoked with `pytest -k perf --benchmark-only`. While I am not experienced with performance profiling, these results seem to show consistently better performance.
How to review
`pytest -k "transport and build_bare"`.
`main` or the last release.
PR checklist