oemof / DHNx

District heating system optimisation and simulation models
MIT License

GeoDataFrame instead of DataFrame for ThermalNetwork.components #32

Open joroeder opened 4 years ago

joroeder commented 4 years ago

I am wondering whether we can directly use GeoDataFrames for storing all geo-referenced content of the ThermalNetwork.components.

Then we would not need to handle the coordinate reference system of the lat, lon columns of the components ourselves, and we would get the many import, export and plot functionalities of geopandas. Especially for the import, I think it is very practical: in most cases, you probably have some GIS files in geojson, shp or any other format ...

We could then define the geometry type for each component, e.g.:

- consumers: points
- forks: points
- producers: points
- edges: lines (no multilines)
- ...
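Such a rule per component could be checked with the `geom_type` attribute of a GeoDataFrame. This is only a minimal sketch: the mapping `EXPECTED_GEOM_TYPE` and the helper `check_geometry_types` are hypothetical names for illustration, not part of dhnx.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point

# Assumed mapping: component name -> allowed shapely geometry type.
EXPECTED_GEOM_TYPE = {
    "consumers": "Point",
    "forks": "Point",
    "producers": "Point",
    "edges": "LineString",  # MultiLineString is deliberately not allowed
}


def check_geometry_types(components):
    """Return {component name: [index labels with an unexpected geometry type]}."""
    problems = {}
    for name, gdf in components.items():
        expected = EXPECTED_GEOM_TYPE[name]
        bad = gdf[gdf.geom_type != expected].index.tolist()
        if bad:
            problems[name] = bad
    return problems


# Toy data: the second "edge" is a Point and should be reported.
components = {
    "consumers": gpd.GeoDataFrame(geometry=[Point(0, 0), Point(1, 1)]),
    "edges": gpd.GeoDataFrame(
        geometry=[LineString([(0, 0), (1, 1)]), Point(2, 2)]
    ),
}
problems = check_geometry_types(components)
```

Here `problems` only lists components that contain at least one geometry of the wrong type, so an empty dict means all tables follow the convention.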

What do you think? Is there anything against it?

jnnr commented 4 years ago

Thanks for this issue! Being able to use information about the geographic location of consumers, producers, forks etc. and the geometries of edges was part of the idea of dhnx, so this is a natural thing to ask. In some use cases you might not have this information, but if you have it, you would want to use it.

At the moment, ThermalNetwork only has a CSVImporter, which allows you to define a network consisting of different nodes and their connectivity. It is assumed that the lengths of the edges are known, and (though this is optional) the geocoordinates. In this case, pandas DataFrames are a good choice.

If you have the information (e.g. by downloading it from osm, as shown in the import_osm example), it is more convenient to store it as geopandas.GeoDataFrame. GeoDataFrame inherits from pandas.DataFrame, adding a 'geometry' column which holds data about POINTs and LINESTRINGs. Also, GeoDataFrames have more methods that handle the geographic data.

It makes sense to use geopandas when detailed geo information is present and to use pandas when this is not the case. The class ThermalNetwork should be able to handle both DataFrames and GeoDataFrames polymorphically.
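To illustrate what handling both table types could look like, here is a minimal sketch of a helper that works on either. The function name `edge_lengths` and the `length` column are assumptions for illustration, not existing dhnx API.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString


def edge_lengths(edges):
    """Return edge lengths from a plain DataFrame or a GeoDataFrame.

    Assumption: plain tables carry a 'length' column. A GeoDataFrame may
    omit it, in which case the length is measured from the geometry
    (in the units of its coordinate reference system).
    """
    if "length" in edges.columns:
        return edges["length"].astype(float)
    if isinstance(edges, gpd.GeoDataFrame):
        return edges.geometry.length
    raise ValueError("edges need a 'length' column or a geometry column")


# Table as it could come from a CSV import: lengths are given explicitly.
csv_edges = pd.DataFrame({"from_node": ["a", "b"], "length": [10, 20]})

# Table with geometries only: lengths are derived from the LineStrings.
geo_edges = gpd.GeoDataFrame(geometry=[LineString([(0, 0), (3, 4)])])

csv_lengths = edge_lengths(csv_edges)
geo_lengths = edge_lengths(geo_edges)
```

Because GeoDataFrame inherits from pandas.DataFrame, the first branch also covers a GeoDataFrame that already has a `length` column, so only the geometry fallback needs a type check.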

How to do it? When writing a new ShapefileImporter or GeojsonImporter, use geopandas. Also, we might have to rewrite this here https://github.com/oemof/DHNx/blob/dev/dhnx/network.py#L70, where the component dataframes are initialized as empty pandas.DataFrames. If we can find a way to leave these specifics in the importers/exporters, this should work.

joroeder commented 4 years ago

Thanks for your comment! From my perspective, having only csv data and no GIS data when doing any dhs optimization/calculation is a rare case. And if you have csv data with coordinates, it is really easy to generate a GeoDataFrame. So we could focus on that as well.
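For example, a csv table with coordinate columns can be turned into a GeoDataFrame in a few lines. This is a sketch with made-up data; the column names `id`, `lat` and `lon` and the CRS EPSG:4326 are assumptions about the input.

```python
import io

import geopandas as gpd
import pandas as pd

# Stand-in for a consumers csv file; the values are invented.
csv_text = """id,lat,lon
0,52.52,13.40
1,52.53,13.41
"""

consumers = pd.read_csv(io.StringIO(csv_text), index_col="id")

# points_from_xy expects x (longitude) first, then y (latitude).
consumers_gdf = gpd.GeoDataFrame(
    consumers,
    geometry=gpd.points_from_xy(consumers["lon"], consumers["lat"]),
    crs="EPSG:4326",  # assumed: plain lat/lon in WGS 84
)
```

The original `lat` and `lon` columns are kept next to the new `geometry` column, so code that still reads them keeps working.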

> The class ThermalNetwork should be able to handle both DataFrames and GeoDataFrames polymorphically.

Sounds nice! But I don't know how to do that ..

> If we can find a way to leave these specifics in the importers/exporters, this should work.

Maybe, I don't know. I have not fully understood the concept of the import/export structure so far, or why we need so many classes for it. But I am certainly not a Python native, so maybe all this makes sense 😉

Generally, I am afraid that we are trying to consider too much right from the start. That makes things too complex, so that development becomes more difficult and, in the end, very slow. For me, the structure is already quite complex. For example, if I wanted to write a geojson reader now, it would be faster for me to write a function that exports the geojson files I want to import into the given .csv structure, and then import those 😉

jnnr commented 4 years ago

I can open a new branch to try the solution that I described above. Coming soon.

joroeder commented 3 years ago

> I can open a new branch to try the solution that I described above. Coming soon.

For me, this issue is solved. I am fine with having DataFrames in the thermal network. This might make things easier, as we discussed once. Did you try the solution you were talking about? In my opinion, we can close this issue.

jnnr commented 3 years ago

I would like to leave this open. Let's check the integration with the OSMImporter and see how things work when using GeoDataFrames.

joroeder commented 3 years ago

Hey, it seems that there is no problem using geopandas.GeoDataFrame in the components - which is very nice I think!

import dhnx

# geopandas_dataframe_pipes is a geopandas.GeoDataFrame prepared beforehand
network = dhnx.network.ThermalNetwork()
network.components['pipes'] = geopandas_dataframe_pipes
network.components['forks'] = ...
network.components['consumers'] = ...
network.components['producers'] = ...

network.is_consistent()

Is it intended to work like this without any "setter" method? Or can you also add DataFrames to the components with the ThermalNetwork.add() method?

Here is an example: https://github.com/oemof/DHNx/blob/features/Move_gistools_to_dhnx/examples/investment_optimisation/import_osm_invest/import_osm_invest.py