pypsa-meets-earth / pypsa-earth

PyPSA-Earth: A flexible Python-based open optimisation model to study energy system futures around the world.
https://pypsa-earth.readthedocs.io/en/latest/

Potential bug in network parameter, num_parallel or length #862

Open pz-max opened 12 months ago

pz-max commented 12 months ago

We thought the OSM data was doing quite well in the EU, looking at:

Reported by Tom B: PyPSA-Eur (entsoe) is MUCH closer to official statistics (which here include Turkey, unlike PyPSA-Eur): https://eepublicdownloads.entsoe.eu/clean-documents/Publications/Statistics/Factsheet/entsoe_sfs2021_web.pdf

PyPSA-Eur's 159,000 km for 380 kV is much closer to the official 186,000 km than OSM's 290,000 km. Ditto for 220+300 kV: our 125,000 km is much closer to the official 131,000 km than OSM's 930,000 km.
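To put the quoted figures side by side, here is a minimal sketch that computes each dataset's total length as a ratio of the official value. The numbers are the ones quoted above; the dictionary and function names are illustrative only.

```python
# Total line lengths in km, as quoted in the comment above.
official = {"380 kV": 186_000, "220+300 kV": 131_000}
pypsa_eur = {"380 kV": 159_000, "220+300 kV": 125_000}
osm = {"380 kV": 290_000, "220+300 kV": 930_000}


def ratio_to_official(model):
    """Return model length divided by official length, per voltage class."""
    return {v: model[v] / official[v] for v in official}


eur_ratio = ratio_to_official(pypsa_eur)
osm_ratio = ratio_to_official(osm)

for v in official:
    print(
        f"{v}: PyPSA-Eur/official = {eur_ratio[v]:.2f}, "
        f"OSM/official = {osm_ratio[v]:.2f}"
    )
```

A ratio near 1 means the dataset matches the official statistics; values far above 1 point to overcounted length.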

TODO: Two possible reasons for big OSM error:

ekatef commented 12 months ago

Thanks for sharing @pz-max!

Agree that the result is quite surprising. When doing grid validation for Central and Western Asia, we found some discrepancies, but they have never been as high as a factor of more than eight...

My feeling is that it would be good: 1) to look in more detail into the discrepancies found for elec.nc, to provide additional insight by voltage class and country; 2) to compare OSM-extracted length values after cleaning with ENTSO-extracted lengths.

pz-max commented 12 months ago

One extra point:

  1. Maybe the wrong CRS was set in the PyPSA-Earth config, so elec.nc might be wrong to begin with.
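To illustrate why the CRS matters for lengths, here is a minimal sketch, not taken from the PyPSA-Earth code. It compares a planar length computed directly on longitude/latitude degrees with a great-circle length in km. The example coordinates, the spherical Earth radius and the function names are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geodesic_length_km(coords):
    """Sum of great-circle segment lengths along a (lon, lat) polyline."""
    return sum(haversine_km(*p, *q) for p, q in zip(coords, coords[1:]))


def planar_length(coords):
    """Euclidean length in whatever units the coordinates are expressed in."""
    return sum(math.dist(p, q) for p, q in zip(coords, coords[1:]))


# Hypothetical two-point line, roughly Vienna to Graz, as (lon, lat) degrees.
line = [(16.37, 48.21), (15.44, 47.07)]

length_deg = planar_length(line)      # "length" in degrees, not a distance
length_km = geodesic_length_km(line)  # length in km on a sphere
naive_km = length_deg * 111.32        # degrees-to-km factor valid only at the equator

print(f"planar: {length_deg:.3f} deg, naive: {naive_km:.1f} km, geodesic: {length_km:.1f} km")
```

At this latitude the naive conversion overestimates the length, because a degree of longitude is shorter than a degree of latitude away from the equator.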

GbotemiB commented 11 months ago

Here is a follow-up on @ekatef's suggestions.

Here is a voltage comparison plot between OSM and ENTSOE image

Using log transformation on the y-axis to scale the data image

Here is also a more detailed country stat comparison for Tw/Km image

Using log transformation on the y-axis to scale the data image

One of the things I noticed is that data for the country code LU is missing in the entsoe network.

pz-max commented 11 months ago

Thanks @GbotemiB , regarding units.

ekatef commented 11 months ago

@GbotemiB Amazing plots! 🤩 An interesting catch for Luxembourg. It seems from the ENTSOE Factsheet that data for LU are included in the ENTSO data. Not sure why we don't have data for it in PyPSA-Eur elec.nc. Good to be aware of this.

@pz-max agree that a comparison of line lengths would be highly interesting. My feeling is that it would be great if we could provide Emmanuel with clean_osm_data. Actually, I hope that the inter-comparison results would also look nicer for lengths :D

GbotemiB commented 11 months ago

@pz-max @ekatef Here is the plot considering Kilometer for each voltage level image

Here is the same plot for better understanding image

image

pz-max commented 11 months ago

@GbotemiB can you create a repo with the notebook such that we can review the code easily? (Meaning the data needs to be uploaded somewhere for the notebook too)

ekatef commented 11 months ago

@GbotemiB Ouch... I'd say, the result is quite surprising 😄 Great to have cross-comparison

Agree with @pz-max that your comparison work would be a great contribution to the documentation repo. Would you mind forking it and creating a PR with your notebook?

davide-f commented 11 months ago

I personally believe that we may also want to review and investigate the conversion of the raw OSM data in the cleaning phase. There we do some data filling that should be verified.

AT and MK may be good test cases given the errors. I can try to share the entire resources folder for them, which may support the investigation. Do you agree?

ekatef commented 11 months ago

@davide-f my feeling is that it would be perfect :) There are still some validations for cleaned OSM data, while I'm not sure anybody has looked into the effects of the cleaning procedure itself.

@pz-max @GbotemiB What is your opinion on this?

davide-f commented 11 months ago

Here you can find selected folders of "resources" for the selected countries: https://drive.google.com/drive/folders/18dV790r11hHKIwpbyDBaxMV4XbhBFQde?usp=drive_link

In particular, it contains folders shapes, osm and base_network that should be all that's needed. The config file is also included

ekatef commented 11 months ago

@davide-f Fantastic, thank you very much! 😄

GbotemiB commented 11 months ago

@pz-max @ekatef @davide-f

Here is a comparison between osm-raw, osm-clean and entsoe data for AT

image image image

Here are the corresponding plots for MK image image image

ekatef commented 11 months ago

@GbotemiB amazing result! 🎉 🎉 🎉

My feeling is that the line length for ENTSO is lower than for OSM data due to the coastline paradox: the more finely a winding path is traced, the longer it measures. Not sure yet exactly what implications this has for power flow calculations.
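As a small illustration of that effect, here is a sketch on a synthetic zigzag route (not real OSM or ENTSO data): the same route is measured while keeping fewer and fewer vertices. The route and the helper names are made up for the example.

```python
import math


def path_length(points):
    """Sum of straight-line segment lengths along a polyline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def subsample(points, step):
    """Keep every `step`-th vertex, always keeping the last vertex."""
    kept = points[::step]
    if kept[-1] != points[-1]:
        kept.append(points[-1])
    return kept


# Synthetic route: 100 km west to east, alternating between y = 0 and y = 1 km.
detailed = [(float(x), 1.0 if x % 2 else 0.0) for x in range(101)]

# Measured length in km for progressively coarser tracings of the same route.
lengths = {step: path_length(subsample(detailed, step)) for step in (1, 5, 25, 100)}

for step, km in lengths.items():
    print(f"every {step:>3} vertex: {km:.1f} km")
```

The finest tracing measures about 141 km while the two-point version measures 100 km, even though both describe the same connection.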

Actually, openinframap also gives a complicated picture consistent with OSM map you provided. An eastern part of Austria as an example:

image

GbotemiB commented 11 months ago

After doing a bit of cleaning on the data with @ekatef.

The results: AT image image

MK image image

davide-f commented 11 months ago

The numbers to me do not look bad at all! For these cases, I think that we are definitely in tolerance (orange vs green) for MK and AT is similar.

To test whether the issue is the spatial resolution, the geometries may be simplified using simplify (https://wichita.ogs.ou.edu/OpenLayers-2.12/examples/simplify-linestring.html) with a tolerance similar to that of entso-e, to see if the numbers match better.

Moreover, as a second comparison, it may be good to check the total TW by voltage, calculated as: s_nom line , for lines beyond 10 km. Note: the s_nom of each line already accounts for num_parallel, so there is no need to multiply s_nom * num_parallel; otherwise we double count the parallel conductors.
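A minimal sketch of that second check, reading it as a plain sum of s_nom per voltage level (the formula above is incomplete, so this reading is an assumption). The line records are hypothetical; the field names follow PyPSA line attributes, with s_nom assumed to be in MVA.

```python
from collections import defaultdict

# Hypothetical line records; s_nom is assumed to already include num_parallel.
lines = [
    {"v_nom": 380, "length": 45.0, "s_nom": 3400.0, "num_parallel": 2},
    {"v_nom": 380, "length": 8.0, "s_nom": 1700.0, "num_parallel": 1},
    {"v_nom": 220, "length": 30.0, "s_nom": 980.0, "num_parallel": 2},
    {"v_nom": 220, "length": 12.5, "s_nom": 490.0, "num_parallel": 1},
]


def total_s_nom_by_voltage(lines, min_length_km=10.0):
    """Sum s_nom per voltage level for lines longer than `min_length_km`."""
    totals = defaultdict(float)
    for line in lines:
        if line["length"] > min_length_km:
            # Deliberately NOT multiplied by num_parallel, to avoid double counting.
            totals[line["v_nom"]] += line["s_nom"]
    return dict(totals)


totals_mva = total_s_nom_by_voltage(lines)
totals_tw = {v: s / 1e6 for v, s in totals_mva.items()}  # MVA -> TVA ("TW")

print(totals_mva)
```

The 8 km line is excluded by the length filter, so only three of the four records contribute.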

ekatef commented 11 months ago

Thanks a lot @davide-f!

My feeling is that the hint with simplification may be very helpful 🙂 It looks like Douglas–Peucker is exactly what we need here. @GbotemiB this algorithm is available in geopandas as the simplify method. Agree with Davide that it is a great idea to apply GeoSeries.simplify() to the cleaned OSM line geometries and see how it impacts the comparison result.
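For reference, here is a pure-Python sketch of Douglas–Peucker, showing how simplification shortens a measured line. It is an illustration under assumptions, not the geopandas implementation: the route coordinates and the tolerance are made up. In geopandas the corresponding call is `GeoSeries.simplify(tolerance)`, where the tolerance is in the units of the CRS.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    dists = [_point_segment_distance(p, points[0], points[-1]) for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__)
    if dists[idx] <= tolerance:
        return [points[0], points[-1]]
    split = idx + 1  # index of the farthest vertex in `points`
    left = douglas_peucker(points[: split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right  # avoid repeating the shared vertex


def path_length(points):
    """Sum of straight-line segment lengths along a polyline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# Hypothetical route in km: small wiggles plus one real bend at (40, 0).
route = [(0, 0), (10, 0.2), (20, -0.2), (30, 0.1), (40, 0), (50, 10), (60, 20.3), (70, 30)]
simplified = douglas_peucker(route, tolerance=0.5)

print(simplified)
print(f"{path_length(route):.2f} km -> {path_length(simplified):.2f} km")
```

The wiggles are removed while the real bend is kept, so the length drops only slightly. Running the cleaned OSM geometries through the same kind of simplification would show how much of the gap to ENTSO is explained by geometric detail alone.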