pypsa-meets-earth / pypsa-earth

PyPSA-Earth: A flexible Python-based open optimisation model to study energy system futures around the world.
https://pypsa-earth.readthedocs.io/en/latest/

Clustering problems for regions with few buses/lines #554

Closed davide-f closed 1 year ago

davide-f commented 1 year ago

Describe the Bug

When running the workflow for some countries, such as Rwanda with relatively few clusters (5), or Gabon, errors occur in the workflow, such as the one mentioned here: https://github.com/PyPSA/PyPSA/issues/531. The simplification and clustering should be revised.

Error Message

Exception has occurred: ValueError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
cannot insert bus1_s, already exists
  File "/home/davidef/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/pandas/core/frame.py", line 4814, in insert
    raise ValueError(f"cannot insert {column}, already exists")
  File "/home/davidef/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/pandas/core/frame.py", line 6358, in reset_index
    new_obj.insert(
  File "/home/davidef/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/home/davidef/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/pypsa/networkclustering.py", line 351, in get_buses_linemap_and_lines
    lines.reset_index()
  File "/home/davidef/miniconda3/envs/pypsa-earth/lib/python3.10/site-packages/pypsa/networkclustering.py", line 378, in get_clustering_from_busmap
    buses, linemap, linemap_p, linemap_n, lines, lines_t = get_buses_linemap_and_lines(
  File "/data/davidef/git_world/pypsa-earth/scripts/cluster_network.py", line 444, in clustering_for_n_clusters
    clustering = get_clustering_from_busmap(
  File "/data/davidef/git_world/pypsa-earth/scripts/cluster_network.py", line 577, in <module>
    clustering = clustering_for_n_clusters(
  File "/home/davidef/miniconda3/envs/pypsa-earth/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/davidef/miniconda3/envs/pypsa-earth/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
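
For reference, the pandas behaviour behind the last frames of the trace can be shown in isolation. This is a minimal sketch on a hypothetical toy frame, not the PyPSA code path itself: reset_index() refuses to move an index level into the columns when a column with the same name is already present.

```python
import pandas as pd

# Hypothetical toy frame: the index is named "bus1_s" and a column
# with the same name already exists.
lines = pd.DataFrame(
    {"bus1_s": ["b"], "s_nom": [100.0]},
    index=pd.Index(["b"], name="bus1_s"),
)

try:
    # reset_index() tries to insert the index as a new "bus1_s" column
    lines.reset_index()
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```
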
davide-f commented 1 year ago

With the new implementation of #632 this problem becomes considerably more frequent; running NG with the default dropping parameters leaves a single isolated bus disconnected, which triggers this issue. This most likely needs a fix for https://github.com/PyPSA/PyPSA/issues/531

ekatef commented 1 year ago

I'm trying to reproduce an error for Rwanda with 5 clusters, but it works (config corresponds to config.default):

image

However, if to_substations: true is set, a new issue appears in simplify_network, which is connected with some issues in aggregate_to_substations. But that seems to be another story. I'm currently looking into it and will create an issue/PR once I'm able to describe it properly

@davide-f, could you please give some more hints on how to reproduce this exact issue: should it be NG with 5 clusters and p_threshold_drop_isolated: 1, p_threshold_merge_isolated: 10?

davide-f commented 1 year ago

Thanks Katia! This problem arises when there are isolated nodes that are clustered together; small African countries are likely to fall into this case. When I rerun some of them I may have more information on reproducibility. These issues may be triggered by not dropping/merging isolated nodes (i.e. by reducing the thresholds). NG without dropping/merging isolated nodes may also lead to this issue, but that needs to be tested. Probably, a larger number of clusters for NG while not dropping/merging buses will trigger it; you may test 10, 20, 30 nodes while neither dropping nor merging any (setting both p_drop/p_merge parameters to 0)

ekatef commented 1 year ago

@davide-f, super, thanks a lot for the detailed description!

Current reproducibility issues sound intriguing :) I'll experiment with it and hopefully we'll have some material to better capture and fix them

ekatef commented 1 year ago

This issue appears if the clustering algorithm obtains a set of isolated buses. It is caused by the definition of interlines as interlines = lines.loc[lines["bus0_s"] != lines["bus1_s"]] in this code. Physical sense: interlines is non-empty only if there is at least one line connecting nodes mapped to different clusters by the clustering algorithm

The interlines dataframe can be empty only if each group of interconnected nodes is mapped to exactly one cluster. The simplest way to reproduce this issue is to set the number of clusters to 1. Another example is Rwanda with n_clusters 3 and not too strict conditions on dropping/merging isolated nodes (say, p_threshold_drop_isolated: 10, p_threshold_merge_isolated: 15). The simplified network then looks like this:
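
To illustrate when the interlines selection comes out empty, here is a small sketch on made-up data. The toy lines table, the two busmaps and the helper interlines_for are assumptions for illustration only; just the filter expression is taken from the definition quoted above.

```python
import pandas as pd

# Toy network: a main island a-b-c and a separate island d-e
lines = pd.DataFrame(
    {"bus0": ["a", "b", "d"], "bus1": ["b", "c", "e"]},
    index=["l1", "l2", "l3"],
)

def interlines_for(busmap):
    """Apply the interlines filter for a given bus -> cluster mapping."""
    mapped = lines.assign(
        bus0_s=lines["bus0"].map(busmap),
        bus1_s=lines["bus1"].map(busmap),
    )
    return mapped.loc[mapped["bus0_s"] != mapped["bus1_s"]]

# The main island is split into two clusters: line l2 crosses clusters
split = pd.Series({"a": "0", "b": "0", "c": "1", "d": "2", "e": "2"})
# Each island collapses into exactly one cluster: no line crosses clusters
collapsed = pd.Series({"a": "0", "b": "0", "c": "0", "d": "1", "e": "1"})

print(len(interlines_for(split)))      # 1
print(len(interlines_for(collapsed)))  # 0
```
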

image

Apparently, the clustering algorithm with clusters: 3 merges the whole network into a single node to satisfy the requested number of clusters, which leads to the issue.

As for reproducibility, it feels like the somewhat probabilistic output of the clustering makes the results difficult to reproduce

ekatef commented 1 year ago

A solution would probably be to check busmap before calling get_clustering_from_busmap and to implement a simplified algorithm that bypasses this call in case busmap contains only isolated nodes

The case with links may need some additional attention, as links seem to be treated by sub-networks in a somewhat special way
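
The busmap check suggested above could look roughly like the sketch below. The helper name busmap_has_interlines and the toy data are hypothetical; what the bypass branch should actually do is left open here, and links are not considered.

```python
import pandas as pd

def busmap_has_interlines(lines: pd.DataFrame, busmap: pd.Series) -> bool:
    """Return True if at least one line joins buses mapped to different clusters."""
    return bool((lines["bus0"].map(busmap) != lines["bus1"].map(busmap)).any())

# Toy case: two islands, each mapped to its own single cluster
lines = pd.DataFrame({"bus0": ["a", "c"], "bus1": ["b", "d"]}, index=["l1", "l2"])
busmap = pd.Series({"a": "0", "b": "0", "c": "1", "d": "1"})

if busmap_has_interlines(lines, busmap):
    action = "call get_clustering_from_busmap as usual"
else:
    # Assumption: some simplified aggregation would be done here instead
    action = "bypass: only isolated clusters, no inter-cluster lines to aggregate"

print(action)
```
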

davide-f commented 1 year ago

Nice! :D I think it would be nice if you could open a PR in PyPSA to address that issue :) The issue has been reported in https://github.com/PyPSA/PyPSA/issues/531

davide-f commented 1 year ago

This issue seems to need a fix quite urgently, as it is affecting the CI in this PR with a simple fix: https://github.com/pypsa-meets-earth/pypsa-earth/pull/654

ekatef commented 1 year ago

Thanks for notifying 🙂

ekatef commented 1 year ago

I have introduced a quick fix. It is functional, but I'm not sure it is the best approach, as it kind of justifies having isolated clusters. My feeling is that it's worth issuing a warning at least if the network is reduced to some disconnected clusters. But it doesn't feel right to introduce too many changes into the PyPSA source code for that

Probably, it would be better to move this fix into simplify_network
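
A warning of that kind could be sketched as follows. This is only an illustration under stated assumptions: count_components and warn_if_disconnected are hypothetical helpers working on plain cluster names and pairs of connected clusters, not on a PyPSA network.

```python
import warnings

def count_components(nodes, edges):
    """Count connected components with a small union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(n) for n in nodes})

def warn_if_disconnected(clusters, inter_cluster_lines):
    """Warn if the clustered network falls apart into several parts."""
    n_parts = count_components(clusters, inter_cluster_lines)
    if n_parts > 1:
        warnings.warn(
            f"Clustered network is split into {n_parts} disconnected parts"
        )
    return n_parts

# Clusters "0" and "1" are connected, cluster "2" is isolated
n_parts = warn_if_disconnected(["0", "1", "2"], [("0", "1")])
```
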

ekatef commented 1 year ago

Testing on tutorial with adjusted drop and merge thresholds:

p_threshold_drop_isolated: 5
p_threshold_merge_isolated: 20

elec_s.nc

image

That allows the error to be reproduced

ekatef commented 1 year ago

Currently, the fix has resolved the error "cannot insert {column}, already exists", but the output of clustering looks like this:

image

It looks like the main network gets disconnected, which needs some additional checking

ekatef commented 1 year ago

Update: the troubles with network connectivity are not linked to the drafted PyPSA fix. Checked on an environment version which doesn't include the PyPSA fix

It looks like the reduction of the network to some disconnected nodes is a result of the lower thresholds set for dropping and merging nodes. The "cannot insert" issue appears if the number of clusters is set to the minimal possible value of 9 and the PyPSA fix is not applied

Increasing the number of clusters resolves this issue since, along with the isolated buses, there is an additional network:

image

That corresponds to the main assumption used for the fix, namely an empty interlines dataframe

ekatef commented 1 year ago

This should be fixed with https://github.com/PyPSA/PyPSA/pull/599

davide-f commented 1 year ago

Closing this as it is already fixed in PyPSA main and will soon be available in conda.