ai4er-cdt / geograph

GeoGraph provides a tool for analysing habitat fragmentation and related problems in landscape ecology. GeoGraph builds a geospatially referenced graph from land cover or field survey data and enables graph-based landscape ecology analysis as well as interactive visualizations.
https://geograph.readthedocs.io
MIT License

Feature/more misc issues #82

Open herbiebradley opened 2 years ago

herbiebradley commented 2 years ago

This PR is for general improvements to GeoGraph necessary to run our case studies. So far, this PR contains code to improve the loading speed for all geographs and updates the pre-commit configuration file.

The loading speed improvement comes from two sources:

  1. Detecting whether PyGEOS is installed and, if so, doing bulk queries of the spatial index. The drawback is that PyGEOS causes a GDAL conflict with Shapely, which slows down operations such as habitat calculations. I am therefore not making PyGEOS a package requirement; the loading code simply takes a separate branch when PyGEOS is installed. Fortunately, PyGEOS will very soon be integrated into Shapely in Shapely 2.0, which should give significant (probably >50%) reductions in loading time and significant benefits to other calculations.
  2. If PyGEOS is not installed, we attain around a 20% reduction in loading time by simply removing unnecessary node attributes which took some time to calculate and were rarely used.

I investigated the main bottlenecks in the most common graph operations. Although I had guessed that the networkx graph library would be a potential source, I concluded that almost all of the code is bottlenecked by polygon and spatial index operations. Further speedups can mostly be gained from vectorising polygon operations (e.g. with PyGEOS), speed improvements in the underlying libraries like GDAL, and algorithmic improvements.
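For anyone who wants to repeat this kind of investigation, one possible approach is the standard-library profiler, sketched below. This is an assumption about tooling, not necessarily how the numbers above were obtained, and `profile_call` is a hypothetical helper name.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def example_workload(n):
    """Stand-in workload; replace with a graph loading or analysis call."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(example_workload, 100_000)
    print(report)
```

Sorting by cumulative time makes it easy to see whether the time is spent in graph library calls or in polygon and spatial index calls.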

I also noticed significant performance improvements in all functions (around 20-30% reduction in computation time) from upgrading to Python 3.10 and the latest versions of rasterio, fiona, Shapely, and geopandas (mostly thanks to performance improvements in the underlying GDAL) - but the requirements file will be sorted out in a separate PR.

TODOs:

review-notebook-app[bot] commented 1 year ago

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.
