Closed simulkade closed 8 months ago
@simulkade, great to see this! Let me share some thoughts:
More information:
Interesting comparison. Supporting anything other than OSM data is currently outside the scope of this package. Ideally we need something like powerplantmatching for lines. The holy grail for HV lines is probably a remote-sensing approach (? @pz-max ).
Based on the graph, it seems that there are many more lines in OSM. There is an implementation of line filtering in pypsa-earth. Could it also be that the ENTSO-E lines are simplified? In that case, applying the line simplification function with a suitable threshold should produce a more similar-looking graph.
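To illustrate what a simplification threshold does, here is a minimal Douglas-Peucker sketch in plain Python. It is not the pypsa-earth function mentioned above; the names `simplify_line` and `tolerance` are hypothetical, and the coordinates are treated as planar, so a tolerance on raw lon/lat degrees is only a rough stand-in for a distance in metres.

```python
import math


def _point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(coords, tolerance):
    """Douglas-Peucker: drop vertices that lie within `tolerance` of the chord."""
    if len(coords) < 3:
        return list(coords)
    # Find the interior vertex farthest from the chord joining the endpoints.
    max_dist, index = 0.0, 0
    for i in range(1, len(coords) - 1):
        d = _point_segment_distance(coords[i], coords[0], coords[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the endpoints.
        return [coords[0], coords[-1]]
    # Otherwise split at the farthest vertex and simplify both halves.
    left = simplify_line(coords[: index + 1], tolerance)
    right = simplify_line(coords[index:], tolerance)
    return left[:-1] + right


# Made-up coordinates, for illustration only.
line = [(0.0, 0.0), (1.0, 0.05), (2.0, -0.04), (3.0, 0.0), (4.0, 2.0)]
simplified = simplify_line(line, 0.1)
print(simplified)
```

A larger tolerance removes more vertices, which is why a detailed OSM line can end up looking like a schematic one once the threshold is high enough.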
Yes @mnm-matin , ENTSO-E lines are simplified because some TSOs withhold the exact GIS locations, for "no reason" from my perspective.
Thank you @pz-max and @mnm-matin for your helpful comments, and apologies for the late follow-up. Just a couple of additions to my first comment above:
I forgot to extract both power lines and cables from the OSM data. Now I have both with the following code:
import earth_osm.eo as eo

eo.save_osm_data(
    primary_name='power',
    region_list=['denmark'],
    feature_list=['substation', 'line', 'cable'],
    update=False,
    mp=True,
    data_dir='./earth_data',
    out_format=['csv'],
    out_aggregate=False,
)
and the extracted figures look like this:
Thick black lines show cables and thinner lines show overhead lines. I also used osmium to extract the same data. Here's the code after a few iterations with Copilot:
import csv

import osmium

# Map the OSM power tag to the label written to the CSV.
POWER_TYPES = {'line': 'overhead', 'cable': 'underground'}


class PowerLineHandler(osmium.SimpleHandler):
    def __init__(self, output_file):
        super().__init__()
        self.output_file = output_file
        # Keep the file handle so it can be closed once processing is done.
        self.csv_file = open(output_file, 'w', newline='')
        self.csv_writer = csv.writer(self.csv_file)
        self.csv_writer.writerow(['id', 'version', 'visible', 'timestamp', 'uid', 'user', 'changeset', 'latitude', 'longitude', 'type'])

    def way(self, w):
        # Skip ways that are neither power lines nor power cables.
        power_type = POWER_TYPES.get(w.tags.get('power'))
        if power_type is None:
            return
        # One row per node of the way; node coordinates need locations=True.
        for node in w.nodes:
            self.csv_writer.writerow([w.id, w.version, w.visible, w.timestamp, w.uid, w.user, w.changeset, node.lat, node.lon, power_type])

    def close_output(self):
        self.csv_file.close()


pbf_file = 'earth_data/pbf/denmark-latest.osm.pbf'
csv_file = 'dk_power_line.csv'
handler = PowerLineHandler(csv_file)
handler.apply_file(pbf_file, locations=True)
handler.close_output()  # flush the remaining rows to disk
The figures look identical but there are subtle differences between them. Notably, some lines are missing in the data extracted by earth-osm. I want to share these observations with you and ask your opinion about the data extraction methods above, whether they should lead to the same outcome, and (as always) the consistency issues.
@simulkade can you report which lines are missing? @mnm-matin mentioned that if lines are dropped in earth-osm, a log message reports which ones were dropped.
Generally, we use the same data source as osmium, so there should be no difference.
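One way to pin down which ways differ is to compare the sets of way ids in the two CSV files. The following is a minimal sketch, not part of earth-osm: the helper names `way_ids` and `missing_ids` are hypothetical, the osmium script above writes the way id to a column named `id`, and the name of the corresponding column in the earth-osm output is an assumption that may need adjusting. The in-memory CSV snippets use made-up ids for illustration only.

```python
import csv
import io


def way_ids(csv_file, id_column="id"):
    """Return the set of distinct way ids (as strings) in an open CSV file."""
    return {row[id_column] for row in csv.DictReader(csv_file)}


def missing_ids(reference_file, candidate_file,
                reference_column="id", candidate_column="id"):
    """Ids present in the reference extraction but absent from the candidate."""
    reference = way_ids(reference_file, reference_column)
    candidate = way_ids(candidate_file, candidate_column)
    return sorted(reference - candidate)


# Tiny in-memory stand-ins for the two CSV files (made-up ids).
# The osmium output has one row per node, so ids repeat; the set removes repeats.
osmium_csv = io.StringIO(
    "id,type\n101,overhead\n101,overhead\n102,underground\n103,overhead\n"
)
earth_osm_csv = io.StringIO("id,type\n101,overhead\n103,overhead\n")

missing = missing_ids(osmium_csv, earth_osm_csv)
print(missing)
```

With real data, the two `io.StringIO` objects would be replaced by files opened with `open(path, newline='')`, and the resulting ids could then be checked against the earth-osm log messages.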
I have verified that the results of both osmium and the latest earth-osm version are the same. Thanks for bringing this to attention. The osmium part will be part of a unit test in future releases to ensure consistency.
Thanks @mnm-matin and @pz-max, and apologies for not coming back to this issue earlier.
Is your feature request related to a problem? Please describe.
It is related to the consistency between the data downloaded from openstreetmap using earth-osm and the dataset of power lines in PyPSA-Eur. Here is a visualization of the data for Denmark: Some of the LineStrings differ significantly from one database to another (the blue line shows the PyPSA-Eur data). I know that the PyPSA-Eur data comes from the ENTSO-E database, which is an "illustration" of the power lines, but I can still see some major differences between the two (see e.g. the connectivity of lines located in the North-West of Denmark).
Describe the solution you'd like
Since some of these databases are available, it is useful to have them here for comparison.
Describe alternatives you've considered
Having other databases in the package.