PyPSA / pypsa-eur

PyPSA-Eur: A Sector-Coupled Open Optimisation Model of the European Energy System
https://pypsa-eur.readthedocs.io/

Memory spikes x10 if shapes are in a network #1224

Open Irieo opened 1 month ago

Irieo commented 1 month ago

Describe the Bug

PR https://github.com/PyPSA/pypsa-eur/pull/1013 introduced a feature whereby shapes are stored inside the network files (in addition to the .geojson files stored in resources/). This is convenient for plotting; however, for large networks, reading a network now causes a massive memory spike compared to previous versions without n.shapes.

For example, take a workflow for the 50-node electricity-only network with an up-to-date pypsa-eur, and pick the build_powerplants rule from build_electricity.smk with its default 7 GB memory allocation: https://github.com/PyPSA/pypsa-eur/blob/885a881e7824f40b109faedfbf88b46dff9f462b/rules/build_electricity.smk#L31-L50

The script build_powerplants.py requires ~10.6 GB of memory for the 50-node network, whereas profiling the same script without the line that reads the base network shows that everything else requires only ~2.2 GB. The legacy 7 GB memory setting is thus no longer sufficient, and the workflow breaks with default settings.
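As a stopgap, the per-rule allocation can be raised from the command line without editing the rule. This sketch assumes Snakemake's standard `--set-resources` override and that the rule's resource is named `mem_mb` as in the rule linked above; the 12000 MB value is illustrative, not a recommendation:

```shell
# Raise the memory allocation for this one rule (value in MB, illustrative)
snakemake --set-resources build_powerplants:mem_mb=12000 <usual targets and flags>
```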

What's causing the memory spike?

Profiling the following test script shows that most of the ~10 GB memory spike occurs in PyPSA/pypsa/io.py, at the xarray call self.ds = xr.open_dataset(path):

import pypsa
n = pypsa.Network("resources/test-50/networks/base.nc")

[memory profiler screenshot]
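A minimal stdlib-only sketch of this kind of before/after measurement, assuming a Unix system (on Linux, ru_maxrss is reported in KiB); the large list is a stand-in allocation, and the commented lines use the path from the report above:

```python
import resource

def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB (Linux semantics)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

print(f"baseline peak RSS:  {peak_rss_mib():.0f} MiB")
big = list(range(2_000_000))  # large stand-in allocation for the network read
print(f"after allocation:   {peak_rss_mib():.0f} MiB")

# To reproduce the actual numbers from the report:
# import pypsa
# n = pypsa.Network("resources/test-50/networks/base.nc")
# print(f"after network read: {peak_rss_mib():.0f} MiB")
```

Peak RSS only ever grows, so comparing the printed values before and after a single call isolates that call's contribution.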

Now, if we drop n.shapes, write to netCDF, and read again, the same line requires 80x less memory (~120 MB):

n.mremove("Shape", n.shapes.index)
n.export_to_netcdf("resources/test-50/networks/base_noshapes.nc")
n = pypsa.Network("resources/test-50/networks/base_noshapes.nc")

[memory profiler screenshot]

What can be done?

- increase memory requirements within PyPSA-Eur and PyPSA-x (not ideal given the size of the spikes)
- make n.shapes optional in the config (a trade-off between convenience and sanity)
- find a workaround for xr.open_dataset(..)?
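On the last point, one hedged workaround sketch: xarray's open_dataset accepts a drop_variables argument, so shape-related variables could be skipped at read time. The "shapes_" prefix here is an assumption about how PyPSA names these variables in the .nc file (inspect the real file with ncdump -h first), and the demo file is a tiny stand-in for base.nc with a numeric column where the real file stores WKT geometry strings:

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Tiny stand-in for resources/test-50/networks/base.nc
path = os.path.join(tempfile.mkdtemp(), "demo.nc")
xr.Dataset({
    "buses_v_nom": ("buses_i", np.ones(3)),
    # real files store WKT geometry strings; numeric stand-in here
    "shapes_geometry": ("shapes_i", np.zeros(3)),
}).to_netcdf(path)

# open_dataset is lazy, so listing variable names first is cheap
with xr.open_dataset(path) as probe:
    shape_vars = [v for v in probe.variables if str(v).startswith("shapes_")]

# reopen the file with the shape variables dropped entirely
ds = xr.open_dataset(path, drop_variables=shape_vars)
print(sorted(ds.data_vars))
```

Whether this avoids the spike in practice depends on where xarray actually materialises the geometry strings, so it would need to be verified against a real 50-node file.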

fneum commented 3 weeks ago

xref #1238