pypsa-meets-earth / earth-osm

Export infrastructure data from OpenStreetMap using Python
https://pypsa-meets-earth.github.io/earth-osm/
MIT License

Consistency between power line datasets #44

Closed simulkade closed 8 months ago

simulkade commented 1 year ago

Is your feature request related to a problem? Please describe.
It is related to the consistency between the data downloaded from OpenStreetMap using earth-osm and the power line dataset in PyPSA-Eur. Here is a visualization of the data for Denmark:

image

Some of the LineStrings differ significantly from one database to the other (the blue line shows the PyPSA-Eur data). I know that the PyPSA-Eur data comes from the ENTSO-E database, which is only an "illustration" of the power lines, but I can still see some major differences between the two (see e.g. the connectivity of the lines in the north-west of Denmark).

Describe the solution you'd like
Since some of these databases are available, it would be useful to have them here for comparison.

Describe alternatives you've considered
Having other databases in the package.

pz-max commented 1 year ago

@simulkade, great to see this! Let me share some thoughts:

More information:

mnm-matin commented 1 year ago

Interesting comparison. Supporting anything other than OSM data is currently outside the scope of this package. Ideally we need something like powerplantmatching for lines. The holy grail for HV lines is probably a remote-sensing approach (? @pz-max).

Based on the graph, it seems that OSM contains many more lines. There is an implementation of line filtering in pypsa-earth. Could it also be that the ENTSO-E lines are simplified? In that case, applying the line simplification function with a suitable threshold should produce a more similar-looking graph.
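To illustrate what simplifying a line with a threshold means, here is a minimal pure-Python sketch of the Douglas-Peucker algorithm. It is not the simplification function from pypsa-earth or earth-osm; the names `simplify_line` and `point_segment_distance` and the sample coordinates are made up for illustration. The tolerance is in the same units as the coordinates, so for raw lon/lat data it would be in degrees.

```python
from math import hypot


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(coords, tolerance):
    """Douglas-Peucker: drop vertices that lie within `tolerance` of the simplified line."""
    if len(coords) < 3:
        return list(coords)
    start, end = coords[0], coords[-1]
    distances = [point_segment_distance(p, start, end) for p in coords[1:-1]]
    max_dist = max(distances)
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the end points.
        return [start, end]
    # Otherwise keep the farthest vertex and simplify both halves recursively.
    split = distances.index(max_dist) + 1
    left = simplify_line(coords[:split + 1], tolerance)
    right = simplify_line(coords[split:], tolerance)
    return left[:-1] + right


# Hypothetical coordinates, not taken from OSM or ENTSO-E.
line = [(0.0, 0.0), (1.0, 0.05), (2.0, -0.04), (3.0, 0.0), (4.0, 2.0)]
simplified = simplify_line(line, tolerance=0.1)
print(simplified)  # [(0.0, 0.0), (3.0, 0.0), (4.0, 2.0)]
```

A larger tolerance removes more vertices, which is why a heavily simplified dataset can look quite different from the detailed OSM geometry even where both describe the same line.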

pz-max commented 1 year ago

Yes @mnm-matin, the ENTSO-E lines are simplified because some TSOs withhold the exact GIS locations, for "no reason" in my view.

simulkade commented 1 year ago

Thank you @pz-max and @mnm-matin for your helpful comments, and apologies for the late follow-up. Just a couple of additions to my first comment above:
I forgot to extract both power lines and cables from the OSM data. Now I have both with the following code:

import earth_osm.eo as eo

eo.save_osm_data(
    primary_name='power',
    region_list=['denmark'],
    feature_list=['substation', 'line', 'cable'],
    update=False,
    mp=True,
    data_dir='./earth_data',
    out_format=['csv'],
    out_aggregate=False,
)

and the extracted figures look like this: image

Thick black lines show cables and thinner lines show overhead lines. I also used osmium to extract the same data. Here's the code after a few iterations with Copilot:

import csv

import osmium

# Map the OSM `power` tag value to the label written to the CSV.
POWER_TYPES = {'line': 'overhead', 'cable': 'underground'}


class PowerLineHandler(osmium.SimpleHandler):
    """Write one CSV row per node of every power line or cable way."""

    def __init__(self, output_file):
        super().__init__()
        self.output_file = output_file
        # Keep the file handle so it can be flushed and closed after parsing.
        self.csv_file = open(output_file, 'w', newline='')
        self.csv_writer = csv.writer(self.csv_file)
        self.csv_writer.writerow(['id', 'version', 'visible', 'timestamp', 'uid', 'user', 'changeset', 'latitude', 'longitude', 'type'])

    def way(self, w):
        power_type = POWER_TYPES.get(w.tags.get('power'))
        if power_type is None:
            return  # not a power line or cable
        for node in w.nodes:
            # Node coordinates are available because apply_file is called with locations=True.
            self.csv_writer.writerow([w.id, w.version, w.visible, w.timestamp, w.uid, w.user, w.changeset, node.lat, node.lon, power_type])

    def finish(self):
        """Close the CSV file so that all buffered rows reach the disk."""
        self.csv_file.close()


pbf_file = 'earth_data/pbf/denmark-latest.osm.pbf'
csv_file = 'dk_power_line.csv'
handler = PowerLineHandler(csv_file)
handler.apply_file(pbf_file, locations=True)
handler.finish()
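Since the handler writes one row per node, the rows have to be grouped by way id before they can be plotted as lines. The following is a minimal sketch of that step, assuming the column names `id`, `latitude` and `longitude` from the header row above; the helper `ways_to_wkt` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def ways_to_wkt(csv_stream):
    """Group per-node rows by way id and return {way_id: WKT LineString}."""
    coords = defaultdict(list)
    for row in csv.DictReader(csv_stream):
        # WKT uses x y order, i.e. longitude before latitude.
        coords[row['id']].append(f"{row['longitude']} {row['latitude']}")
    return {
        way_id: f"LINESTRING ({', '.join(points)})"
        for way_id, points in coords.items()
        if len(points) >= 2  # a line needs at least two vertices
    }


# Made-up sample rows, for illustration only (not real OSM data).
sample = io.StringIO(
    "id,latitude,longitude,type\n"
    "1,55.0,9.0,overhead\n"
    "1,55.1,9.2,overhead\n"
    "2,56.0,10.0,underground\n"
)
lines = ways_to_wkt(sample)
print(lines)  # {'1': 'LINESTRING (9.0 55.0, 9.2 55.1)'}
```

The grouping relies on the rows of each way appearing in node order, which holds here because the handler writes them in the order in which they occur in the way.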

image

The figures look nearly identical, but there are subtle differences between them. Notably, some lines are missing from the data extracted by earth-osm. I want to share these observations with you and ask your opinion about the two extraction methods above, whether they should lead to the same outcome, and, as always, the consistency issues.

pz-max commented 1 year ago

@simulkade, can you report which lines are missing? @mnm-matin mentioned that if lines are dropped in earth-osm, you will get a log message stating which lines were dropped.

Generally, we use the same data source as osmium, so there should be no difference.
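One way to report which lines are missing is to compare the sets of way ids in the two outputs. The sketch below uses only the standard library; the helper names, the sample rows, and in particular the `id` column name assumed for the earth-osm CSV are assumptions that should be adjusted to the actual files.

```python
import csv
import io


def way_ids(csv_stream, id_column):
    """Collect the distinct way ids found in one CSV column."""
    return {row[id_column] for row in csv.DictReader(csv_stream)}


def compare_way_ids(ids_a, ids_b):
    """Return (only_in_a, only_in_b) as sorted lists."""
    return sorted(ids_a - ids_b), sorted(ids_b - ids_a)


# Made-up sample data; replace with open('dk_power_line.csv') and the
# earth-osm output file. The earth-osm column name is an assumption.
osmium_csv = io.StringIO(
    "id,latitude,longitude,type\n"
    "10,55.0,9.0,overhead\n"
    "10,55.1,9.1,overhead\n"
    "11,56.0,10.0,underground\n"
    "12,57.0,10.5,overhead\n"
)
earth_osm_csv = io.StringIO("id,Type\n10,way\n12,way\n")

only_osmium, only_earth_osm = compare_way_ids(
    way_ids(osmium_csv, 'id'),
    way_ids(earth_osm_csv, 'id'),
)
print("Only in osmium output:", only_osmium)        # ['11']
print("Only in earth-osm output:", only_earth_osm)  # []
```

Comparing ids rather than plots makes the difference concrete, and the resulting list can be checked against any log messages about dropped lines.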

mnm-matin commented 8 months ago

I have verified that the results of both osmium and the latest earth-osm version are the same. Thanks for bringing this to attention. The osmium part will be part of a unit test in future releases to ensure consistency.

simulkade commented 8 months ago

Thanks @mnm-matin and @pz-max, and apologies for not coming back to this issue earlier.