Include functions from tudelft3d/3d-building-metrics

martinfleis commented 2 years ago

Recently, @liberostelios, @Athelena et al published a paper on 3D building metrics. The work is, in principle, expanding what we have in momepy in shape module. They released the code in the repo (https://github.com/tudelft3d/3d-building-metrics) and I reached out to find out if there is an interest to include their new metrics in momepy here https://github.com/tudelft3d/3d-building-metrics/issues/12.

The plan now is to have a look at the metrics implemented in their paper, compare to the set we have here and add all those that are missing. I think that it all happens in the shape.py module of their repository but please correct me if not.

There are a few things to consider though.

1) We have a new module in pysal/esda implementing shape metrics that are somewhat universal, not focusing necessarily on urban morphology. When I'll have time, momepy will simply import esda.shape within the shape module. That lead to a question whether some of your metrics shouldn't be contributed to esda rather than here.

2) This one is more an annoyance than a constraint - I'd like to change the API design from the current class based requiring you to do momepy.Linearity(gdf).series to get back an array you are interested. That is a bit annoying in Dask workflows and is overly complicated, so the idea is that we should return either numpy array or a pandas series directly, exactly as you do in your code. So I would probably propose to already follow the new API and not to try to mimic existing one.

TODO: go through the metrics and list those that are not already implemented in momepy.

liberostelios commented 2 years ago

Indeed, most metrics (at least the "complex" ones) are computed in shape_index.py. Just to add a bit to the discussion:

I suppose you are only interested in the 2D metrics? Then the existing dependencies should be (mostly) fine, as basically everything is done with shapely and numpy which you already use. And for 2D indices, shape_index.py should be more or less independent of the rest of the modules in the repository. For 3D, we use pyvista (and for one metric pymesh) which complicates things a lot with respect to dependencies. Many 3D operations, also, are done in other modules in the repository.
Most of the metrics we compute are based on a discretisation of their formal definition. So you always have to create a grid for them to be computed. I guess the interface should expect the grid size as an input. Is this compatible with the current API approach? I guess a reasonable default value makes sense, but it is important that people are aware of that and be able to configure the grid's resolution to balance performance/accuracy. Just something to think about.

martinfleis commented 2 years ago

I suppose you are only interested in the 2D metrics?

Not necessarily, I'd be happy to include 3D as well. I am aware that pyvista and pymesh may cause installation troubles but if we keep them as optional dependencies, I guess we're fine.

Is this compatible with the current API approach?

I would probably need to see an example here but can we just expose grid size as an argument? Or are there some related issue I don't see right now?

liberostelios commented 2 years ago

Not necessarily, I'd be happy to include 3D as well. I am aware that pyvista and pymesh may cause installation troubles but if we keep them as optional dependencies, I guess we're fine.

That's great! Keeping them as optional dependencies should be fine. There are, also, a few functions from other modules that will be needed but it all should be trivial to be resolved.

One thing to consider for 3D, though, is that it can be very computationally expensive, so it might be nice to make this clear to users in the documentation. But I am getting ahead of myself. Let's port things, first!

I would probably need to see an example here but can we just expose grid size as an argument? Or are there some related issue I don't see right now?

They already take the grid size as an argument, so it's all good. They can also take the grid points themselves as an argument, so that the user can create the grid once and reuse it multiple times (that is because it's relatively slow to build the grid every time, especially in 3D). I am not sure if you are interested in keeping this option.

martinfleis commented 2 years ago

They can also take the grid points themselves as an argument, so that the user can create the grid once and reuse it multiple times

That also sounds like a good option.

@liberostelios Could you try to make a first PR with some (easier) subset of your functions? We can resolve some open questions there on top of the code and then move onto the rest of the functions.

pysal / momepy

Include functions from tudelft3d/3d-building-metrics #372