iiasa / message_ix

The integrated assessment and energy systems model MESSAGEix
https://docs.messageix.org
Apache License 2.0

Java heap space issues #455

Open OFR-IIASA opened 3 years ago

OFR-IIASA commented 3 years ago

Java heap space runs out when scenarios are loaded sequentially. For each scenario, the scenario is loaded, data is retrieved from it (no checkout/commit is performed), and the data is then processed (details below). At most 12 scenarios can be processed before memory issues occur. Each additional scenario that is loaded requires more memory: the memory used for one scenario does not appear to be freed when its routine completes, so it is not available for the next scenario.


Code sample or context

I have made a small program (module_x) that uses some functions from the "old" reporting in message_data. Module_x (Python) does the following:

  1. imports message_ix/ixmp
  2. loads ixmp.Platform() -> assigned to variable "mp"
  3. loads a scenario (without cache) -> assigned to variable "scen"
  4. retrieves 3 dataframes using "old" reporting functions.
  5. loads 1 variable stored as a timeseries with the scenario object.
  6. performs a simple data manipulation.
  7. closes mp
  8. sets variable "scen" = None
  9. sets variable "mp" = None
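
For reference, the steps above could be sketched roughly as follows. This is not the actual module_x: the function name process_scenario, the open_platform/load_scenario parameters, and the use of scen.par("demand") in place of the "old" reporting functions are all assumptions made for illustration. How the cache is disabled is left out because it is not shown above.

```python
import gc


def process_scenario(model, scenario, open_platform=None, load_scenario=None):
    """Hypothetical sketch of the module_x steps for one scenario."""
    # Step 1: import ixmp/message_ix (deferred so stand-ins can be passed in).
    if open_platform is None:
        import ixmp
        open_platform = ixmp.Platform
    if load_scenario is None:
        import message_ix
        load_scenario = message_ix.Scenario

    mp = open_platform()  # step 2
    scen = None
    try:
        scen = load_scenario(mp, model, scenario)  # step 3 (cache setting omitted)
        # Steps 4-5: stand-ins for the "old" reporting calls and the timeseries.
        demand = scen.par("demand")  # assumption: any parameter would do here
        ts = scen.timeseries()
        # Step 6: a trivial manipulation in place of the real one.
        result = {"n_par": len(demand), "n_ts": len(ts)}
    finally:
        mp.close_db()  # step 7
        scen = None    # step 8
        mp = None      # step 9
        # Frees Python-side objects only; whether this also releases memory
        # on the Java heap is exactly what this issue is about.
        gc.collect()
    return result
```

The try/finally is an addition over the steps listed above, so that the platform is closed even if a step fails.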

Module_x is called from a Jupyter notebook (.ipynb) that processes more than 160 scenarios. The notebook does nothing except loop over the list of scenario names and, for each scenario:

  1. import module_x
  2. call module_x passing some variables (all strings)
  3. del module_x

Problem description

Memory issues occur after approx. 10-12 scenarios.

Versions

ixmp:        3.2.1.dev80+g40fc589
     40fc589 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'master' of https://github.com/iiasa/ixmp
message_ix:  3.2.1.dev67+ga20ffb0
     a20ffb0 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'master' of https://github.com/iiasa/message_ix
message_data: installed
     3bfb76b (HEAD -> RES_add_5_year_timesteps2, origin/RES_add_5_year_timesteps2) added configuration files for ENGAGE submission 20210331 (ENGAGE 4.1.7)

click:       7.1.2
dask:        2020.12.0
graphviz:    0.13.2
jpype:       1.2.1
… JVM path:  C:\Program Files\Java\jre1.8.0_231\bin\server\jvm.dll
openpyxl:    3.0.5
pandas:      1.1.3
pint:        0.11
xarray:      0.15.1
yaml:        5.3.1

iam_units:   installed
jupyter:     installed
matplotlib:  3.3.2
plotnine:    0.7.0
pyam:        0.7.0+4.gc1ed1f8
     c1ed1f8 (HEAD -> master, origin/master, origin/HEAD) Add a tutorial how to read data from GAMS gdx to pyam (#424)

GAMS:        33.1.0

python:      3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS:          Windows
OS-release:  10
machine:     AMD64
processor:   Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder:   little
LC_ALL:      None
LANG:        None
LOCALE:      None.None
khaeru commented 3 years ago