kraina-ai / srai

Spatial Representations for Artificial Intelligence - a Python library toolkit for geospatial machine learning focused on creating embeddings for downstream tasks
https://kraina-ai.github.io/srai/
Apache License 2.0

Add raster data loaders #364

Open RaczeQ opened 1 year ago

RaczeQ commented 1 year ago

Some data formats store temperature, rainfall, elevation, or other remotely sensed data as georeferenced images whose pixels represent values. We could benefit from these data sources if we could intersect them with our regions and extract numerical statistics such as median, min, and max. Existing libraries already cover this use case, and we could add them to srai in a new subgroup named raster.

Examples of those libraries:
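Independently of any particular library, the operation itself (intersecting a raster with a region and summarising the covered pixel values) can be sketched in plain Python. This is only an illustration under simplifying assumptions: the raster is a nested list, the region is an axis-aligned bounding box, the grid origin is at (0, 0) with y increasing with the row index, and a pixel counts as inside when its center falls within the region. The function name `zonal_stats_by_center` and all parameters are hypothetical and not part of srai or any existing package.

```python
from statistics import median


def zonal_stats_by_center(raster, region, pixel_size=1.0):
    """Summarise raster values whose pixel centers lie inside a region.

    raster: list of rows with numeric pixel values (hypothetical layout,
            origin at (0, 0), y grows with the row index).
    region: (minx, miny, maxx, maxy) axis-aligned bounding box.
    """
    minx, miny, maxx, maxy = region
    values = []
    for r, row in enumerate(raster):
        for c, value in enumerate(row):
            # Pixel center in the same coordinate system as the region.
            cx = (c + 0.5) * pixel_size
            cy = (r + 0.5) * pixel_size
            if minx <= cx <= maxx and miny <= cy <= maxy:
                values.append(value)
    if not values:
        return None  # the region covers no pixel center
    return {
        "min": min(values),
        "max": max(values),
        "median": median(values),
        "count": len(values),
    }


# Toy 3x3 "elevation" raster and a region covering the first 2x2 pixels.
elevation = [
    [10, 12, 15],
    [11, 14, 18],
    [13, 17, 21],
]
stats = zonal_stats_by_center(elevation, region=(0.0, 0.0, 2.0, 2.0))
print(stats)
```

A real loader would read the pixel grid and its georeferencing from a raster file and accept arbitrary polygons, which is the part the existing libraries would take care of.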

JakubCha commented 8 months ago

I would also consider the exactextract library. A Python API was recently added to it. The API is currently in beta and is not yet published to PyPI or conda-forge. It will be added to a package manager (conda-forge) once the author considers the API stable.

The main benefit of this library is its ability to use the fractional coverage of each pixel by the vector geometry. Most libraries treat pixel inclusion as binary (a pixel is either inside or outside the vector), based on a test of whether the pixel center or the whole pixel area lies inside the vector. In my experience (mainly compared with rasterstats), it is also much faster than other libraries, although I have not prepared a reliable benchmark to confirm this. A potential downside of exactextract that I see now is that it does not allow calculating custom statistics. On the other hand, it offers a considerable number of built-in statistics.
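To make the difference between fractional and binary pixel inclusion concrete, here is a minimal sketch in plain Python. It is not exactextract's implementation or API: it assumes an axis-aligned rectangular region so that the covered fraction of each pixel can be computed exactly from 1D overlaps, and all function names are hypothetical.

```python
def overlap_1d(a0, a1, b0, b1):
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def coverage_fractions(n_rows, n_cols, region, pixel_size=1.0):
    """Fraction of each pixel's area covered by a rectangular region."""
    minx, miny, maxx, maxy = region
    pixel_area = pixel_size * pixel_size
    fractions = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            w = overlap_1d(c * pixel_size, (c + 1) * pixel_size, minx, maxx)
            h = overlap_1d(r * pixel_size, (r + 1) * pixel_size, miny, maxy)
            row.append(w * h / pixel_area)
        fractions.append(row)
    return fractions


def weighted_mean(raster, fractions):
    """Mean of pixel values weighted by their covered fraction."""
    total = sum(v * f for vr, fr in zip(raster, fractions) for v, f in zip(vr, fr))
    weight = sum(f for fr in fractions for f in fr)
    return total / weight if weight else None


def center_mean(raster, region, pixel_size=1.0):
    """Mean of pixel values whose centers fall inside the region (binary test)."""
    minx, miny, maxx, maxy = region
    values = [
        v
        for r, row in enumerate(raster)
        for c, v in enumerate(row)
        if minx <= (c + 0.5) * pixel_size <= maxx
        and miny <= (r + 0.5) * pixel_size <= maxy
    ]
    return sum(values) / len(values) if values else None


raster = [
    [10, 20],
    [30, 40],
]
# The region fully covers the first pixel and a quarter of the second one.
region = (0.0, 0.0, 1.25, 1.0)

fractions = coverage_fractions(2, 2, region)
fractional_result = weighted_mean(raster, fractions)
binary_result = center_mean(raster, region)
print(fractions, fractional_result, binary_result)
```

In this toy case the binary test ignores the partially covered pixel entirely, while the fractional approach lets it contribute in proportion to the covered area, so the two means differ.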

Another issue is deciding in which direction srai should go. If you decide at some point to use a big data engine like dask, that might force the use of a compatible library such as geowombat or xvec. The xvec developers are also considering exactextract as an extraction engine.