natcap / invest

InVEST®: models that map and value the goods and services from nature that sustain and fulfill human life.
Apache License 2.0

CV - accept raster format habitat layers #581

Closed davemfish closed 2 years ago

davemfish commented 3 years ago

This was requested by @cybersea and it sounds reasonable to me.

In the past it seemed most typical for coastal & marine habitat data to originate in vector format, so that was the primary case we designed for. In recent times, with advances in remote sensing tech, it's increasingly common for original data sources to be rasters.

If we can, could we document some raster-format data sources that users are likely to want to use before we implement this?

cybersea commented 3 years ago

GeoTIFF seems to be the predominant format I run across. Other possibilities are ERDAS Imagine (.img), ArcInfo GRID, ASCII Grid, and ENVI, but I think GeoTIFF is by far the most common.

Uh -- or did you mean some example data sources?

davemfish commented 3 years ago

> Uh -- or did you mean some example data sources?

This is what I was after - these are great examples, thanks!

davemfish commented 3 years ago

Here's an example of a really problematic vector that was generated from a high-res raster using gdal.Polygonize. Cases like this are a good argument for just accepting the raster dataset as input.

[Image: monster]

This is one huge feature with invalid geometries. coastal_vulnerability.search_for_habitat valiantly fixes the broken geometry with a buffer(0) so that it can do a subsequent Intersection, but the buffer takes ~22 minutes on my desktop.

Here's some detail on the invalid geometries. Patterns like this are really common for vector data that was converted from high-res raster.

[Image: detail]

So either the user needs to do some painful pre-processing, or they need to wait through long model runtimes and have plenty of RAM available.

davemfish commented 3 years ago

@cybersea is it safe to assume that raster habitat datasets mainly represent one habitat per dataset? Or are they sometimes more like a traditional "LULC" - i.e. a categorical raster with many different habitat types represented in one band?

cybersea commented 3 years ago

They could be either. The benthic habitat layer has multiple classes, more like a typical LULC layer, although we aren't using all of the classes and we're grouping some of them. The mangrove layers are the one-habitat-per-dataset type.

davemfish commented 3 years ago

Hmm, then it's probably easier for users to prep data into individual rasters, reclassifying a "LULC-style" raster multiple times as needed, rather than merging individual rasters into a single band with an accompanying biophysical table. Plus, unlike on land, multiple marine habitats can occupy the same pixel.
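To illustrate the reclassing idea, here is a minimal sketch in plain Python that splits one categorical grid into a binary mask per habitat. It is not code from InVEST: the function name `split_habitats`, the class codes, and the habitat names are all hypothetical, and a real workflow would more likely operate on GeoTIFFs with GDAL-based tools than on nested lists.

```python
def split_habitats(lulc, habitat_classes):
    """Return one binary grid (1 = habitat present) per habitat name.

    lulc: 2D list of integer class codes (a stand-in for a raster band).
    habitat_classes: dict mapping a habitat name to the class codes
        that should count as that habitat (hypothetical codes).
    """
    masks = {}
    for habitat, codes in habitat_classes.items():
        code_set = set(codes)
        masks[habitat] = [
            [1 if value in code_set else 0 for value in row]
            for row in lulc
        ]
    return masks


# Hypothetical benthic grid: 0 = background, 1 = seagrass, 2 = coral,
# 3 = mixed coral and seagrass, 4 = a class we choose not to use.
lulc = [
    [1, 1, 2],
    [3, 4, 2],
    [0, 4, 4],
]

# Grouping classes: code 3 counts toward both habitats, which a single
# categorical band could not express on its own.
habitat_classes = {
    "seagrass": [1, 3],
    "coral": [2, 3],
}

masks = split_habitats(lulc, habitat_classes)
for name, grid in masks.items():
    print(name, grid)
```

Because each habitat gets its own grid, unused classes simply drop out, grouped classes collapse into one mask, and a pixel can be flagged in more than one habitat.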

davemfish commented 2 years ago

Fixed by #734