PyPSA / pypsa-eur

PyPSA-Eur: A Sector-Coupled Open Optimisation Model of the European Energy System
https://pypsa-eur.readthedocs.io/
314 stars 219 forks source link

`build_renewable_profile_offwind-dc` broken for manually built `europe-2013-era5` cutout #434

Open zoltanmaric opened 1 year ago

zoltanmaric commented 1 year ago

Checklist

Describe the Bug

Attempting to build renewable profile for off-shore wind DC with build_cutout: true in a freshly cloned pypsa-eur repo fails with ValueError: operands could not be broadcast together with shapes (247,221) (57,194)

snakemake -j 8 resources/profile_offwind-dc.nc

My configuration changes are shown below - I removed the sarah cutout, replaced its usages with the era5 cutout, and set build_cutout: true.

Configuration Changes

```diff diff --git a/config.default.yaml b/config.default.yaml index 0ec5e9f..2ac3bc0 100755 --- a/config.default.yaml +++ b/config.default.yaml @@ -31,8 +31,8 @@ enable: prepare_links_p_nom: false retrieve_databundle: true retrieve_cost_data: true - build_cutout: false - retrieve_cutout: true + build_cutout: true + retrieve_cutout: false build_natura_raster: false retrieve_natura_raster: true custom_busmap: false @@ -97,16 +97,6 @@ atlite: dx: 0.3 dy: 0.3 time: ['2013', '2013'] - europe-2013-sarah: - module: [sarah, era5] # in priority order - x: [-12., 45.] - y: [33., 65] - dx: 0.2 - dy: 0.2 - time: ['2013', '2013'] - sarah_interpolate: false - sarah_dir: - features: [influx, temperature] renewable: @@ -170,7 +160,7 @@ renewable: potential: simple # or conservative clip_p_max_pu: 1.e-2 solar: - cutout: europe-2013-sarah + cutout: europe-2013-era5 resource: method: pv panel: CSi ```

Error Message

The error occurs on the line exclusions = exclusions | masked_ in gis.py in atlite.

As far as I understand, the error message indicates that I'm attempting a matrix operation on matrices with incompatible sizes (presumably exclusions and masked have different sizes), but I don't understand why that happens - and why it works with a retrieved europe-2013-era5 cutout.

Full Error Message

```python [Mon Oct 24 13:31:21 2022] rule build_renewable_profiles: input: networks/base.nc, data/bundle/corine/g250_clc06_V18_5.tif, resources/natura.tiff, data/bundle/GEBCO_2014_2D.nc, resources/shipdensity_raster.nc, resources/country_shapes.geojson, resources/offshore_shapes.geojson, resources/regions_offshore.geojson, cutouts/europe-2013-era5.nc output: resources/profile_offwind-dc.nc log: logs/build_renewable_profile_offwind-dc.log jobid: 0 benchmark: benchmarks/build_renewable_profiles_offwind-dc reason: Missing output files: resources/profile_offwind-dc.nc; Input files updated by another job: data/bundle/corine/g250_clc06_V18_5.tif, cutouts/europe-2013-era5.nc, resources/natura.tiff, resources/regions_offshore.geojson, data/bundle/GEBCO_2014_2D.nc, resources/shipdensity_raster.nc, resources/country_shapes.geojson, resources/offshore_shapes.geojson, networks/base.nc wildcards: technology=offwind-dc threads: 4 resources: tmpdir=/var/folders/d2/_d71yg_j05gc_0p6ldvyf9580000gn/T, mem_mb=20000 INFO:__main__:correction_factor is set as 0.8855 INFO:__main__:Calculate landuse availabilities... multiprocessing.pool.RemoteTraceback: """ Traceback (most recent call last): File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/multiprocessing/pool.py", line 125, in worker result = (True, func(*args, **kwds)) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar return list(map(*args)) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/site-packages/atlite/gis.py", line 534, in _process_func return shape_availability_reprojected(shapes.loc[[i]], *args)[0] File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/site-packages/atlite/gis.py", line 509, in shape_availability_reprojected masked, transform = shape_availability(geometry, excluder) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/site-packages/atlite/gis.py", line 463, in shape_availability exclusions = exclusions | masked_ ValueError: operands could not be broadcast together with shapes (247,221) (57,194) """ The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/Users/zoltan/github/pypsa-bro/.snakemake/scripts/tmpyoo6zp0l.build_renewable_profiles.py", line 286, in availability = cutout.availabilitymatrix(regions, excluder, **kwargs) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/site-packages/atlite/gis.py", line 610, in compute_availabilitymatrix availability = list(pool.map(_process_func, shapes.index)) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/multiprocessing/pool.py", line 364, in map return self._map_async(func, iterable, mapstar, chunksize).get() File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/multiprocessing/pool.py", line 771, in get raise self._value File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/multiprocessing/pool.py", line 125, in worker result = (True, func(*args, **kwds)) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar return list(map(*args)) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/site-packages/atlite/gis.py", line 534, in _process_func return shape_availability_reprojected(shapes.loc[[i]], *args)[0] File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/site-packages/atlite/gis.py", line 509, in shape_availability_reprojected masked, transform = shape_availability(geometry, excluder) File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/site-packages/atlite/gis.py", line 463, in shape_availability exclusions = exclusions | masked_ ValueError: operands could not be broadcast together with shapes (247,221) (57,194) [Mon Oct 24 13:31:49 2022] Error in rule build_renewable_profiles: jobid: 0 output: resources/profile_offwind-dc.nc log: logs/build_renewable_profile_offwind-dc.log (check log file(s) for error message) RuleException: CalledProcessErrorin line 377 of /Users/zoltan/github/pypsa-bro/Snakefile: Command 'set -euo pipefail; /Users/zoltan/mambaforge/envs/pypsa-eur/bin/python3.9 /Users/zoltan/github/pypsa-bro/.snakemake/scripts/tmpyoo6zp0l.build_renewable_profiles.py' returned non-zero exit status 1. File "/Users/zoltan/github/pypsa-bro/Snakefile", line 377, in __rule_build_renewable_profiles File "/Users/zoltan/mambaforge/envs/pypsa-eur/lib/python3.9/concurrent/futures/thread.py", line 58, in run Shutting down, this might take some time. Exiting because a job execution failed. Look above for error message Complete log: .snakemake/log/2022-10-24T132312.559320.snakemake.log ```

euronion commented 1 year ago

Does the issue only occur with resources/profile_offwind-dc.nc or does it also occur with the other technologies, e.g. resources/profile_offwind-ac.nc or resources/profile_solar.nc?

What is the spatial extent of the cutout you obtain using build_cutout?

Potentially related note: In #321 we build the cutouts using the config.yaml file from the PR (different from master) and updated the cutout data in zenodo.

euronion commented 1 year ago

And: You mention the repository was freshly cloned. Just double checking that the repository was *completly fresh?

You did not run any parts of the workflow prior in this folder? If you have outputs from build_ship_raster or build_natura_raster from prior workflow executions (where retrieve_cutout was used) then that can also problems.

That's a very common problem...

zoltanmaric commented 1 year ago

Thanks for following up @euronion.

Does the issue only occur with resources/profile_offwind-dc.nc or does it also occur with the other technologies, e.g. resources/profile_offwind-ac.nc or resources/profile_solar.nc?

It doesn't work for offwind-dc or offwind-ac, but it does work for onwind and solar!

And: You mention the repository was freshly cloned. Just double checking that the repository was *completly fresh?

You did not run any parts of the workflow prior in this folder? If you have outputs from build_ship_raster or build_natura_raster from prior workflow executions (where retrieve_cutout was used) then that can also problems.

Yes, I cloned into a completely new directory called pypsa-bro using the command

git clone git@github.com:pypsa/pypsa-eur.git pypsa-bro

But I did not rebuild a new conda environment, I used my existing conda environment from my "main" `pypsa-eur" clone.

zoltanmaric commented 1 year ago

What is the spatial extent of the cutout you obtain using build_cutout?

❯ python
Python 3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:00:33)
[Clang 13.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import atlite
>>> cutout = atlite.Cutout(path='cutouts/europe-2013-era5.nc')
>>> cutout.bounds
array([-12.15,  32.85,  34.95,  72.15])
zoltanmaric commented 1 year ago

@euronion I think you're right on the money with that question. The retrieved and built cutouts have different extents:

>>> cutout_built = atlite.Cutout(path='cutouts/europe-2013-era5.nc')
>>> cutout_retrieved = atlite.Cutout(path='../pypsa-eur/cutouts/europe-2013-era5.nc')
>>> cutout_retrieved
<Cutout "europe-2013-era5">
 x = -16.50 ⟷ 40.50, dx = 0.30
 y = 32.70 ⟷ 75.00, dy = 0.30
 time = 2013-01-01 ⟷ 2013-12-31, dt = H
 module = era5
 prepared_features = ['height', 'wind', 'influx', 'temperature', 'runoff']
>>> cutout_built
<Cutout "europe-2013-era5">
 x = -12.00 ⟷ 34.80, dx = 0.30
 y = 33.00 ⟷ 72.00, dy = 0.30
 time = 2013-01-01 ⟷ 2013-12-31, dt = H
 module = era5
 prepared_features = ['height', 'wind', 'influx', 'temperature', 'runoff']

Do you know why that might be, and what causes offwind profiles to expect a different size?

euronion commented 1 year ago

offwind-dc and offwind-ac use additional datasets for determining the eligible areas for build-up compared to onwind and solar:

In Snakefile

https://github.com/PyPSA/pypsa-eur/blob/6d13b40ab76e63dbccdfd1212f487e13a392b3af/Snakefile#L344-L353

My guess would be that one of them has the incorrect shape leading to the error when atlite tries to evaluate area availability.

Could you try and remove ship_threshold from the config for offwind-dc and try it again: (Setting it to False is insufficient, the key needs to be completely removed from the config)

https://github.com/PyPSA/pypsa-eur/blob/6d13b40ab76e63dbccdfd1212f487e13a392b3af/config.default.yaml#L166-L167

If that still raises the error, try to also remove max_depth from the config.

zoltanmaric commented 1 year ago

I noticed that #321 removed the bounds parameters, which results in build_cutout inferring different bounds:

<Cutout "europe-2013-era5">
 x = -16.50 ⟷ 36.90, dx = 0.30
 y = 32.70 ⟷ 75.00, dy = 0.30
 time = 2013-01-01 ⟷ 2013-12-31, dt = H
 module = era5
 prepared_features = ['height', 'wind', 'influx', 'temperature', 'runoff']

And more importantly - build_renewable_profiles_offwind_ac and all others now work! Would it make sense to just remove the bounds configs from master? Is there an advantage to having them?

zoltanmaric commented 1 year ago

For reference, here's a map of the 3 different bounds sets.

image

euronion commented 1 year ago

I noticed that #321 removed the bounds parameters, which results in build_cutout inferring different bounds:

<Cutout "europe-2013-era5">
 x = -16.50 ⟷ 36.90, dx = 0.30
 y = 32.70 ⟷ 75.00, dy = 0.30
 time = 2013-01-01 ⟷ 2013-12-31, dt = H
 module = era5
 prepared_features = ['height', 'wind', 'influx', 'temperature', 'runoff']

Would it make sense to just remove the bounds configs from master? Is there an advantage to having them?

The #321 branch was a quick-and-dirty project where we also tried new things like removing the bounds. We didn't adapt all the changes back to the master branch. I don't remember the missing bounds creating any problems and it should be save to adapt. @FabianHofmann do you remember any issues from removing the bounds?

And more importantly - build_renewable_profiles_offwind_ac and all others now work!

You mean after removing the bounds from the config or after removing max_depth and ship_threshold?

zoltanmaric commented 1 year ago

You mean after removing the bounds from the config or after removing max_depth and ship_threshold?

After removing the bounds. I didn't try removing max_depth and ship_threshold yet because removing the bounds sounded like a more robust fix.

euronion commented 1 year ago

Interesting. I still don't understand what is the root of the issue.

Since you have two pypsa-eur folders at hand, could you compare the potentials for both offwind from profiles_offwind-<ac|dc>.nc of the retrieved cutouts with the potentials from the version with the build cutouts (pypsa-bro nice folder name btw)?

They should not be different.

zoltanmaric commented 1 year ago

Update: removing the ship_threshold config alone also "solves" the issue - ie. profile_offwind-ac is successfully built.

FabianHofmann commented 1 year ago

Hey you two, this seems really hard to track down. Removing the bounds in the config works without any known problems, so we could remove these. But it would not be a satisfying fix. It seems it has to do with the most nothern areas and the ship_treshold. @p-glaum any idea?

p-glaum commented 1 year ago

I guess it has something to do with the fact that the raster data from the shipping density and the actual cutout raster do not overlap. The shipping density raster is only from x: -16.8 to 40.6 and y: 75.3 to 32.4 and therefore a bit smaller along the x-axis. Normally, Atlite should be able to fill the missing area if the argument allow_no_overlap=True is given. But I got a similar error before with the Corine data. This could mean that the function to fill the missing part is not working 100%

euronion commented 1 year ago

@p-glaum for me the shipping raster is larger than the cutout. (both downloaded from zenodo). Do we use the same data?

> cutout.bounds
array([-16.65,  32.55,  40.65,  75.15])
> # raster bounds
array([-16.798,  32.4  ,  40.797,  75.295])

The error is still raised.

p-glaum commented 1 year ago

@euronion I was using my own cutout. So in your case, you were using the cutout and shipping raster from zenodo and it wasn't working?

euronion commented 1 year ago

No, I was comparing the cutout bounds with the shipping raster bounds (both from zenodo and build locally). In both cases the cutout is within the shipping raster.

Turns out the issue is not created by the rasters, but actually be the regions. A number of region shapes are outside the cutout bounds if the cutout is build locally with the bounds as specified by the config.default.yaml: grafik

grafik

If I remove these region shapes, then all scripts run without an issue. It also explains why removing the bounds from config.default.yaml works: In that case the workflow determines the cutout bounds based on the regions and makes sure the whole outer bounds of all regions combined are contained within the cutout.

  1. I don't think there is anything to be gained from digging deeper into when and how this problem started.
  2. I suggest we remove the bounds from config.default.yaml. Alternatively we should update them to match the bounds from the cutouts on zenodo.
  3. We could introduce a check in atlite to make sure shapes are contained within cutout here: https://github.com/PyPSA/atlite/blob/bfbf638cab9baa5149e244209f4780d4ccd24585/atlite/gis.py#L537-L539

Your thoughts @FabianHofmann @p-glaum ?

elkir commented 1 year ago

Hi, I've encountered this bug recently when updating from a modified v0.4.0 to v0.6.1 and was debugging it alongside too, trying to catch which shapes exactly clash inside the atlite.gis module, glad you got there first!

Just to put my two cents in, I found the dependency of the cutouts on the network base.nc quite problematic and disabled it in my snakemake, so I'm a bit worried if that will be necessary in future version.

Adding in years and other parameters as wildcards to the snakemake workflow, anything which requires to build whole copies of everything from base.nc up, and at the same time having the cutouts depend on the network, means one has to have multiple copies of these big files (biggest in the whole workflow) to keep snakemake happy.

The latest version now has support for new projects with an optional shared cutout folder, almost exactly what I've modified my workflow to, but it sacrifices all the snakemake functionality in the process. It's something that now has to be tracked manually.

I'm not sure what I think is the best way forward, but maybe if the spatial and temporal features of base.nc were decoupled, that would allow the snakemake graph to deal with this better.