GispoCoding / eis_toolkit

Python library for mineral prospectivity mapping
https://eis-he.eu/
European Union Public License 1.2

Optimize Distance to anomaly #409

Open nmaarnio opened 4 months ago

nmaarnio commented 4 months ago

@nialov, would you have time to take this task? Based on recent feedback, distance to anomaly needs to run much faster (as does distance computation, #384).

nialov commented 3 months ago

I believe the problem is geopandas.GeoDataFrame.unary_union here: https://github.com/GispoCoding/eis_toolkit/blob/5c8d32e2164853d8d48581f50529a78202123ad9/eis_toolkit/vector_processing/distance_computation.py#L81

Updates to geopandas and shapely might speed it up without code changes. I will try to check before the end of June.

If that does not solve the performance problem, you will need to look at alternatives, such as converting the geometries to raster cell values and calculating raster distances. The annoying thing is that GDAL already implements raster distance computation with high performance, so you would just be replicating a GDAL function that you probably cannot beat in terms of performance.
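To make the raster alternative concrete, here is a minimal sketch, not eis_toolkit code. It assumes the geometries have already been burned into a boolean mask (for example with `rasterio.features.rasterize`, which is not shown) and uses `scipy.ndimage.distance_transform_edt` for the distances. The function name `distance_to_mask` and the `pixel_size` parameter are hypothetical, and distances are measured between cell centres, so results will differ slightly from exact vector distances.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def distance_to_mask(mask, pixel_size=(1.0, 1.0)):
    """Distance from every cell to the nearest True cell of `mask`.

    Hypothetical helper for illustration only.
    `pixel_size` is (row spacing, column spacing) in map units.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask contains no target cells")
    # distance_transform_edt measures the distance from each non-zero cell
    # to the nearest zero cell, so the target cells must be the zeros.
    return distance_transform_edt(~mask, sampling=pixel_size)


# Toy raster: one target cell at row 1, column 1, with 10 m pixels.
mask = np.zeros((4, 5), dtype=bool)
mask[1, 1] = True
distances = distance_to_mask(mask, pixel_size=(10.0, 10.0))
```

This computes all cell distances in one array operation instead of one geometry distance call per cell, which is where the speed-up would be expected to come from. Whether it actually beats the current implementation would need to be measured on real data.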

See: https://github.com/GispoCoding/eis_toolkit/pull/324#issuecomment-1943435164

nialov commented 1 month ago

There are a number of major dependency updates that have not been adopted in eis_toolkit yet, e.g., shapely 2.0.0, geopandas 1.0.0 and numpy 2.0.0, to name a few. Getting these done successfully seems to require quite a lot of effort. If the updates are not otherwise needed, some other optimization method will probably take less time.