e2nIEE / pandapower

Convenient Power System Modelling and Analysis based on PYPOWER and pandas
https://www.pandapower.org

Why not using GeoPandas? #709

Closed marhofmann closed 5 months ago

marhofmann commented 4 years ago

I was using GeoPandas for some geospatial analysis and I liked it a lot! I was wondering why pandapower is based on pure pandas and not (also) on GeoPandas? Instead of having extra DataFrames for geocoordinates (like net.bus_geodata), a GeoDataFrame could contain both geocoordinates and technical data!

lthurner commented 4 years ago

We also use geopandas for our work. I believe we wanted to minimize dependencies, which is why we didn't include geopandas: it is not easy to install on Windows. At the time it wasn't even installable through Anaconda, which made it very difficult to get. Now that it is installable through Anaconda, we could think about including it, at least as an option...

bergkvist commented 3 years ago

In case you want to create geopandas.GeoSeries from bus_geodata:

import pygeos
import geopandas as gpd
import numpy as np

def bus_geoseries(net):
    points = pygeos.points(np.array([ net.bus_geodata.x.values, net.bus_geodata.y.values ]).T)
    return gpd.GeoSeries(points, index=net.bus_geodata.index.values).rename('geometry')

From line geodata:

import shapely
import geopandas as gpd

def line_geoseries(net):
    lines = [shapely.geometry.LineString(coords) for coords in net.line_geodata.coords.values]
    return gpd.GeoSeries(lines, index=net.line_geodata.index.values).rename('geometry')

In case you only have bus_geodata, and not line_geodata, you can create linestrings by connecting the points:

import geopandas as gpd
import pandas as pd
import numpy as np
import pygeos

def line_geoseries(net):
    from_point = pd.merge(net.line.from_bus, net.bus_geodata, left_on='from_bus', right_index=True, how='left')[['x', 'y']]
    to_point = pd.merge(net.line.to_bus, net.bus_geodata, left_on='to_bus', right_index=True, how='left')[['x', 'y']]
    x = np.array([ from_point.x.values, to_point.x.values ]).T
    y = np.array([ from_point.y.values, to_point.y.values ]).T
    lines = pygeos.linestrings(x, y)
    return gpd.GeoSeries(lines, index=from_point.index.values).rename('geometry')

SteffenMeinecke commented 2 years ago

@jkisse and I recently talked about the suboptimal fact that several tests in test_auxiliary.py are skipped because geopandas is not included as a dependency -> one more argument for adding geopandas to the requirements.
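
Tests that depend on an optional package are usually guarded so that they are skipped rather than failing when the package is missing. The following is a minimal sketch of that pattern using only the standard library. The class and test names are hypothetical, and this is not the actual content of test_auxiliary.py.

```python
import importlib.util
import unittest

# True only if geopandas can be found in this environment.
GEOPANDAS_INSTALLED = importlib.util.find_spec("geopandas") is not None


class TestGeoHelpers(unittest.TestCase):  # hypothetical test class
    @unittest.skipUnless(GEOPANDAS_INSTALLED, "geopandas is not installed")
    def test_needs_geopandas(self):
        import geopandas as gpd  # imported lazily, only when available

        self.assertTrue(hasattr(gpd, "GeoDataFrame"))

    def test_always_runs(self):
        # Tests without geo dependencies are unaffected by the guard.
        self.assertEqual(1 + 1, 2)
```

With geopandas installed in the CI environment, the guarded test runs instead of being reported as skipped.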

ascheidl commented 2 years ago

Actually, we once had the plan to remove the tables "bus_geodata" and "line_geodata" altogether.

Instead, we would introduce a column "geo" in bus and line (and maybe other elements, too). This geo column would be of type object (string) and contain GeoJSON objects.

If needed, helper functions would translate to and from GeoDataFrames.

Advantages:

The implementation of this already exists here at Fraunhofer (we use this already in some projects - and it works fine).

We should come up with a plan for the transition into pandapower. I think the reason this is not done yet is mainly because it touches a lot of code.
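
The "geo" column idea described above could look roughly like the following. This is a minimal sketch on a toy table, assuming GeoJSON Point strings with x first and y second. The helper names point_to_geojson and geojson_to_xy are hypothetical and are not taken from the Fraunhofer implementation.

```python
import json

import pandas as pd


def point_to_geojson(x, y):
    """Serialize a coordinate pair as a GeoJSON Point string (x first, then y)."""
    return json.dumps({"type": "Point", "coordinates": [x, y]})


def geojson_to_xy(geo):
    """Parse a GeoJSON Point string back into an (x, y) tuple."""
    x, y = json.loads(geo)["coordinates"]
    return x, y


# Toy stand-in for net.bus with a hypothetical "geo" column of GeoJSON strings.
bus = pd.DataFrame({"name": ["bus_a", "bus_b"], "vn_kv": [20.0, 0.4]})
bus["geo"] = [point_to_geojson(8.68, 50.11), point_to_geojson(8.70, 50.12)]

# Round trip: recover plain x/y columns, as the old bus_geodata table held them.
xy = pd.DataFrame(
    bus["geo"].map(geojson_to_xy).tolist(), columns=["x", "y"], index=bus.index
)
```

Because the column holds plain strings, the table needs nothing beyond pandas. Where geopandas is available, each string could be turned into a geometry with shapely.geometry.shape(json.loads(s)).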

jkisse commented 2 years ago

I've added geopandas to the GitHub Actions build (manually) (test logs here). Now, fewer tests are skipped and it works fine, so I will merge it soon. Apparently there are some issues with Python 3.10 which might be connected to geopandas. I'd suggest that we don't make geopandas a "hard" requirement but rather introduce a new extra-requirement (e.g. "all" -> pip install pandapower[all]) which would include the comprehensive dependency list, including geopandas and all the plotting & testing dependencies.
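
An extra requirement of this kind is typically declared through the extras_require argument of setuptools. The following is a sketch only: the group names and package lists are illustrative assumptions, not pandapower's actual setup.py.

```python
# Sketch of optional dependency groups for a setup.py; package lists are illustrative.
extras = {
    "plotting": ["matplotlib", "plotly"],
    "test": ["pytest"],
    "geo": ["geopandas"],
}

# "all" is the union of every group, so `pip install pandapower[all]`
# would pull in each optional dependency exactly once.
extras["all"] = sorted({pkg for group in extras.values() for pkg in group})

# In setup.py this dict would be passed as: setup(..., extras_require=extras)
```

A plain `pip install pandapower` would then stay free of geopandas, while users who want the full stack opt in explicitly.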

ascheidl commented 2 years ago

Like Leon said, we decided to avoid geopandas as a dependency because it can be a pain to install the whole necessary stack.

So please do not include it as a (hard) dependency.

jwiemer112 commented 2 years ago

So as I see it now, it would be best to remove bus_geodata and line_geodata and introduce GeoJSON (first value is longitude, second value is latitude) as the new standard. Convenience functions would allow conversion to GeoDataFrames and should be open source.

Necessary steps:

KS-HTK commented 1 year ago

I know this is a rather old issue, but just in case someone comes across this and would like to know how to convert geodata to GIS:

To answer the first question in a more compact way: there are functions to convert geodata to geopandas GeoDataFrames and back. (Example code, not tested, but it should be correct.)

import pandapower.networks  # needed so that pandapower.networks is defined below
import pandapower.plotting.geo as geo
net = pandapower.networks.mv_oberrhein()
geo.convert_geodata_to_gis(net)
# now net.line_geodata and net.bus_geodata are GeoDataFrames
geo.convert_gis_to_geodata(net)
# now they have been converted back

On the topic of GeoJSON: there is currently a PR ( #1731 ) that supports exporting as GeoJSON. It adds all attributes of the 'bus' and 'line' tables to the properties of the GeoJSON feature. This is useful for visualizing/editing pandapower networks in QGIS; see pandapower-qgis. It may be possible to use this as a basis for a converter to update old networks if pandapower changes the way it handles geodata.
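
To illustrate what "attributes in the properties of the GeoJSON feature" means, here is a minimal standard-library sketch of a FeatureCollection for two buses. The function bus_feature and the toy data are hypothetical. This is not the implementation from PR #1731.

```python
import json


def bus_feature(index, x, y, properties):
    """Build one GeoJSON Feature for a bus; table attributes go into "properties"."""
    return {
        "type": "Feature",
        "id": index,
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": properties,
    }


# Toy rows standing in for net.bus joined with its coordinates.
buses = [
    (0, 8.68, 50.11, {"name": "bus_a", "vn_kv": 20.0}),
    (1, 8.70, 50.12, {"name": "bus_b", "vn_kv": 0.4}),
]
collection = {
    "type": "FeatureCollection",
    "features": [bus_feature(*row) for row in buses],
}
geojson_text = json.dumps(collection, indent=2)
```

A file containing geojson_text can be opened as a vector layer in GIS tools, where the entries under "properties" appear as the attribute table of each feature.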

SteffenMeinecke commented 5 months ago

I am closing this issue because of the helpful comment by @KS-HTK and the merged PR #1731.