metno / pyaerocom

Python tools for climate and air quality model evaluation
https://pyaerocom.readthedocs.io/
GNU General Public License v3.0

Memory reduction for emep-reporting #1213

Closed heikoklein closed 3 months ago

heikoklein commented 3 months ago

Change Summary

This PR fixes some memory issues discovered during work with emep-reporting.

This PR also adds a new requirement: psutil
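How psutil is used inside this PR is not shown in the thread. As a minimal sketch, assuming the goal is to log the process's resident memory at interesting points of a run, something like the following could be used. `memory_usage_bytes` is a hypothetical helper name, and the `resource` fallback is an assumption added so the sketch also runs where psutil is not installed.

```python
import os
import sys


def memory_usage_bytes():
    """Return (kind, bytes) describing this process's memory usage."""
    try:
        import psutil  # the new requirement added by this PR
    except ImportError:
        # Fallback (assumption, Unix only): peak RSS from the standard library.
        import resource

        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
        return "peak_rss", peak if sys.platform == "darwin" else peak * 1024
    # Current resident set size of this process.
    return "rss", psutil.Process(os.getpid()).memory_info().rss


kind, used = memory_usage_bytes()
print(f"{kind}: {used / 1024**2:.1f} MiB")
```

Calling the helper before and after a read step gives a rough picture of how much memory that step holds on to.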

Related issue number

The remaining issue is reading EEA data: the reader keeps all existing data per variable in memory, leading to ~55 GB peak memory usage, although only 1-5 GB of data are needed during processing. This will be addressed in https://github.com/metno/pyaro-readers/issues/43


codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 73.33333% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.30%. Comparing base (9f4b8dc) to head (4bf3e26). Report is 486 commits behind head on main-dev.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pyaerocom/io/ebas_file_index.py | 62.50% | 3 Missing :warning: |
| pyaerocom/io/cams2_83/reader.py | 0.00% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##           main-dev    #1213      +/-   ##
============================================
- Coverage     79.31%   79.30%   -0.01%
============================================
  Files           131      131
  Lines         20231    20236       +5
============================================
+ Hits          16046    16048       +2
- Misses         4185     4188       +3
```

| [Flag](https://app.codecov.io/gh/metno/pyaerocom/pull/1213/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=metno) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/metno/pyaerocom/pull/1213/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=metno) | `79.30% <73.33%> (-0.01%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=metno#carryforward-flags-in-the-pull-request-comment) to find out more.


heikoklein commented 3 months ago

@charlienegri: I have added the time chunking here again, with chunks={"time": 24}. We tried that with CAMS2_83 using time=1, and while it clearly reduced memory consumption, it took slightly longer. With time=24, I couldn't see any performance degradation, and the memory reduction from handling one day of hourly data at a time instead of a full year of hourly data is still huge. So you might want to try that approach again?
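To make the trade-off between time=1, time=24 and no chunking concrete, here is a back-of-the-envelope sketch. The grid size (400 x 700) and the float32 item size are assumptions chosen for illustration only, not values from the CAMS2_83 or EMEP files, and `chunk_stats` is a hypothetical helper. The `chunks={"time": 24}` argument itself is presumably the one passed to an xarray open call, which splits the data lazily along the time dimension.

```python
def chunk_stats(n_time, ny, nx, chunk_time, itemsize=4):
    """Return (n_chunks, bytes_per_chunk, bytes_unchunked) for one variable."""
    n_chunks = -(-n_time // chunk_time)  # ceiling division
    bytes_per_chunk = min(chunk_time, n_time) * ny * nx * itemsize
    bytes_unchunked = n_time * ny * nx * itemsize
    return n_chunks, bytes_per_chunk, bytes_unchunked


# Hypothetical grid: 400 x 700 cells, float32, one non-leap year of hourly data.
HOURS_PER_YEAR = 365 * 24
for chunk_time in (1, 24, HOURS_PER_YEAR):
    n, per_chunk, total = chunk_stats(HOURS_PER_YEAR, 400, 700, chunk_time)
    print(f"time={chunk_time:>4}: {n:>4} chunks, {per_chunk / 1e6:9.1f} MB per chunk")
```

Under these assumptions a daily chunk is 365 times smaller than the full year, while time=1 produces 8760 very small chunks, which may be one reason for the slightly longer run time seen with time=1.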

charlienegri commented 3 months ago

I have installed the branch in the test module and will try a long run with it.
If it gets merged, it will be deployed to production in July anyway, as part of a new version of the module.

charlienegri commented 3 months ago

The test run crashed immediately with:

Loading cams2_83-evaluation/test
  Loading requirement: proj/9.1.0
Traceback (most recent call last):
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/bin/cams2_83", line 5, in <module>
    from pyaerocom.scripts.cams2_83.cli import app
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/pyaerocom/__init__.py", line 9, in <module>
    from .config import Config
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/pyaerocom/config.py", line 19, in <module>
    from pyaerocom.grid_io import GridIO
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/pyaerocom/grid_io.py", line 2, in <module>
    from pyaerocom.time_config import TS_TYPES
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/pyaerocom/time_config.py", line 7, in <module>
    from iris import coord_categorisation
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/iris/coord_categorisation.py", line 23, in <module>
    import iris.coords
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/iris/coords.py", line 23, in <module>
    from iris.common import (
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/iris/common/__init__.py", line 9, in <module>
    from .mixin import *
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/iris/common/mixin.py", line 10, in <module>
    import cf_units
  File "/modules/rhel8/user-apps/fou-modules/cams2_83-evaluation/test/venv/lib/python3.10/site-packages/cf_units/__init__.py", line 23, in <module>
    from cf_units import _udunits2 as _ud
  File "cf_units/_udunits2.pyx", line 1, in init cf_units._udunits2
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

I will try a fresh venv
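A `numpy.dtype size changed` error of this kind typically appears when a compiled extension (here `cf_units._udunits2`) was built against a different NumPy major version than the one installed in the environment. As a small diagnostic sketch, the installed versions can be listed with the standard library; `installed_versions` is a hypothetical helper, and the distribution names `cf-units` and `scitools-iris` are the PyPI names assumed for the packages in the traceback.

```python
from importlib import metadata


def installed_versions(packages=("numpy", "cf-units", "scitools-iris")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running this inside the affected venv shows at a glance which NumPy version the environment resolved to.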

charlienegri commented 3 months ago

same result in a fresh venv

heikoklein commented 3 months ago

The numpy/environment error might be related to #1206

Rather than asking you to run this branch (which improves emep-reporting mem-usage), I wanted to ask you about testing the same idea in the cams2_83 reader, i.e. add chunks={"time": 24} to https://github.com/metno/pyaerocom/blob/26e8cdb79e7e0a99c13e370da51336cbbea44431/pyaerocom/io/cams2_83/reader.py#L193

charlienegri commented 3 months ago

I see, I will try that instead.
The issue I had is the same one you mentioned.

charlienegri commented 3 months ago

The test with chunks={"time": 24} in the cams2_83 reader's read_dataset used memory comparable to the production code.
The running time was significantly shorter, although at this stage I am not sure we can attribute that to this change alone. Or maybe it's the perfect fit.
Anyway, I think it can be safely implemented.

heikoklein commented 3 months ago

@charlienegri Thanks, I have now added the "chunks" line to cams2_83/reader.py, too.