PyPSA / pypsa-eur

PyPSA-Eur: A Sector-Coupled Open Optimisation Model of the European Energy System
https://pypsa-eur.readthedocs.io/

Excluding certain countries is no longer possible #1306

Closed by koen-vg 1 month ago

koen-vg commented 1 month ago

Describe the Bug

Running PyPSA-Eur with a subset of the supported countries, such that excluded countries are surrounded by included ones, is no longer possible. You might think that this is not really a problem (why would you exclude those countries in the first place?), but in some cases it becomes necessary: for example, when optimising over weather years for which only synthetic load data is available, Kosovo has to be excluded because there is no synthetic load data for it.

The root cause seems to be that when buses are loaded in base_network.py (in _load_buses), all buses that are located inside the "europe_shape" are included; however, this shape, due to the way it is constructed using the exterior function, can include more countries than those specified in the config (for example, Kosovo).
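
The effect described above can be reproduced with a toy geometry. This is only a sketch using shapely, not the actual construction of "europe_shape" in PyPSA-Eur. The square region, the hole and the variable names are all made up for illustration. The hole stands in for an excluded country that is fully surrounded by included ones.

```python
from shapely.geometry import Point, Polygon

# Toy geometry (assumption): a square "included" region with a square hole
# standing in for an excluded, fully surrounded country.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]
included = Polygon(outer, holes=[hole])

# Keeping only the exterior ring discards the hole, so the enclosed
# area becomes part of the shape again.
filled = Polygon(included.exterior)

bus_in_hole = Point(5, 5)
in_included = included.contains(bus_in_hole)
in_filled = filled.contains(bus_in_hole)

print(in_included, in_filled)  # False True
```

A purely spatial "is the bus inside the shape" test therefore cannot exclude enclosed countries once the shape has been reduced to its exterior.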

Somehow in the transition to OSM networks, the logic for excluding buses that are not in the specified list of countries seems to have been lost. Maybe in _set_countries_and_substations? (Note that buses from the pre-built OSM network already include country annotations, so this function seems less necessary anyway.) I'm not super familiar with the base_network rule, so I haven't quite figured it out.

Error Message

The actual errors arise only in cluster_network, when it turns out that base_s.nc includes buses in countries that are not given in the configuration.

Possible fix

I have fixed this on my end with a little bit of logic for excluding buses outside the configured countries as soon as possible, in _load_buses. The following changes do the trick:

@@ -135,7 +135,7 @@ def _find_closest_links(links, new_links, distance_upper_bound=1.5):
     )

-def _load_buses(buses, europe_shape, config):
+def _load_buses(buses, europe_shape, countries, config):
     buses = (
         pd.read_csv(
             buses,
@@ -161,6 +161,11 @@ def _load_buses(buses, europe_shape, config):
         lambda p: europe_shape_prepped.contains(Point(p)), axis=1
     )

+    if "country" in buses.columns:
+        buses_in_countries = buses.country.isin(countries)
+    else:
+        buses_in_countries = pd.Series(True, buses.index)
+
     v_nom_min = min(config["electricity"]["voltages"])
     v_nom_max = max(config["electricity"]["voltages"])

@@ -173,7 +178,7 @@ def _load_buses(buses, europe_shape, config):
     )

     logger.info(f"Removing buses outside of range AC {v_nom_min} - {v_nom_max} V")
-    return pd.DataFrame(buses.loc[buses_in_europe_b & buses_with_v_nom_to_keep_b])
+    return pd.DataFrame(buses.loc[buses_in_europe_b & buses_in_countries & buses_with_v_nom_to_keep_b])

 def _load_transformers(buses, transformers):
@@ -712,6 +717,7 @@ def base_network(
     europe_shape,
     country_shapes,
     offshore_shapes,
+    countries,
     parameter_corrections,
     config,
 ):
@@ -736,7 +742,7 @@ def base_network(
     )
     logger.info(logger_str)

-    buses = _load_buses(buses, europe_shape, config)
+    buses = _load_buses(buses, europe_shape, countries, config)
     transformers = _load_transformers(buses, transformers)
     lines = _load_lines(buses, lines)

@@ -1006,6 +1012,7 @@ if __name__ == "__main__":
         europe_shape,
         country_shapes,
         offshore_shapes,
+        countries,
         parameter_corrections,
         config,
     )
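
The masking logic from the diff can be tried in isolation on toy data. The sketch below assumes only pandas. The helper name `country_mask` and the example buses are hypothetical and not part of PyPSA-Eur.

```python
import pandas as pd


def country_mask(buses: pd.DataFrame, countries: list[str]) -> pd.Series:
    """Boolean mask of buses located in the configured countries.

    Hypothetical helper mirroring the if/else from the diff: if the bus
    table has no "country" column, every bus is kept.
    """
    if "country" in buses.columns:
        return buses["country"].isin(countries)
    return pd.Series(True, index=buses.index)


# Toy bus table (assumption): "XK" stands for a country left out of the config.
buses = pd.DataFrame(
    {"country": ["DE", "XK", "RS"], "v_nom": [380.0, 220.0, 380.0]},
    index=["bus0", "bus1", "bus2"],
)
countries = ["DE", "RS"]

# Stand-in for the spatial test: all toy buses lie inside the shape.
buses_in_europe_b = pd.Series(True, index=buses.index)

kept = buses.loc[buses_in_europe_b & country_mask(buses, countries)]
print(kept.index.tolist())  # ['bus0', 'bus2']

# Without a "country" column, nothing is filtered out.
kept_without_column = buses.drop(columns="country").loc[
    country_mask(buses.drop(columns="country"), countries)
]
print(len(kept_without_column))  # 3
```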

I'd be happy to submit this as a PR. However, as I said, this is not the side of PyPSA-Eur I've worked with the most, so maybe there are better solutions. @bobbyxng ?

bobbyxng commented 1 month ago

Hi @koen-vg !

Thanks for catching the bug! Your proposal looks good to me. I only have one small suggestion. Instead of:

if "country" in buses.columns:
    buses_in_countries = buses.country.isin(countries)
else:
    buses_in_countries = pd.Series(True, buses.index)

we could also do this - functionally it should be the same:

buses_in_countries = buses.get("country", pd.Series(True, buses.index)).isin(countries)

Will you open a PR for this? Thanks a lot!

Best, Bobby

koen-vg commented 1 month ago

Done! I tried to compact the if-statement a little, but I'm not convinced that your code snippet would work since the .isin(countries) could be applied to a series containing True values (whereas countries, of course, contains two-letter strings).
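
The concern can be checked on a toy bus table that has no "country" column. This is a sketch assuming only pandas. The example data is made up, and the compact alternative at the end is one possible form, not necessarily what the PR uses.

```python
import pandas as pd

countries = ["DE", "FR"]

# Toy bus table WITHOUT a "country" column (assumption for illustration).
buses = pd.DataFrame({"v_nom": [380.0, 220.0]}, index=["bus0", "bus1"])

# The suggested one-liner: the fallback Series of True values is itself
# passed through .isin(countries). True is not one of the country codes,
# so every bus is masked out instead of kept.
one_liner = buses.get("country", pd.Series(True, buses.index)).isin(countries)
print(one_liner.tolist())  # [False, False]

# One possible compact alternative that keeps the original semantics:
# apply .isin only when the column actually exists.
compact = (
    buses["country"].isin(countries)
    if "country" in buses.columns
    else pd.Series(True, index=buses.index)
)
print(compact.tolist())  # [True, True]
```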

bobbyxng commented 1 month ago

Thank you @koen-vg !