Closed: dionhaefner closed this issue 6 years ago.
Another thing:
this will be needed rather soon - the first version of terracotta supported this.
I can provide a case we can work on ;)
Actually quite a challenging problem. This will require another lengthy API planning session. Looking forward to it :wink:
Yes, you and @mrpgraae go into conclave and send up some smoke once you've figured it out...
There is no good way to implement categorical datasets in a way that Terracotta is agnostic about them. We will have to implement special cases and features for categorical datasets.
Split `/legend` into `/legend` and `/colormap`:

`/legend` should be renamed to `/colormap`, since that is more descriptive of what the call actually returns. A call to `/legend/{keys}` shall henceforth return the names of the categories in a categorical dataset and their associated hex color strings. Calling `/legend` on a non-categorical dataset returns an empty dict.
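To make the proposed behavior concrete, a `/legend/{keys}` response for a categorical dataset might look like this (the exact schema is not specified in this issue; category names and colors here are made up):

```json
{
  "forest": "228b22",
  "water": "1e90ff",
  "urban": "808080"
}
```

A non-categorical dataset would simply return `{}`.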
`driver.insert`:

Add a new parameter called `categories`, which should be a list of `Category` named tuples (could be dataclasses in the future). The `Category` named tuple has 3 attributes:

- `value`: number-type of the raster value that represents the category
- `color`: 3-tuple of 0..255 RGB values
- `name`: str defining the name of the category

A new column `Categories` will be added to the database. The value will be a `VARCHAR` containing a JSON encoding of the categories. For non-categorical datasets, this column will be `null`. The presence of this value defines whether or not a dataset is categorical.
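As a rough sketch of how this could fit together (the `Category` tuple and column encoding below follow the proposal above; none of this is final Terracotta API):

```python
import json
from collections import namedtuple

# Proposed Category named tuple (sketch, not the final API)
Category = namedtuple("Category", ["value", "color", "name"])

categories = [
    Category(value=1, color=(34, 139, 34), name="forest"),
    Category(value=2, color=(30, 144, 255), name="water"),
]

# Encode for the proposed JSON-in-VARCHAR "Categories" column
encoded = json.dumps([c._asdict() for c in categories])

# A null/None value in that column marks the dataset as non-categorical
def is_categorical(column_value):
    return column_value is not None

# Decode a database row back into Category tuples
decoded = [
    Category(value=d["value"], color=tuple(d["color"]), name=d["name"])
    for d in json.loads(encoded)
]
```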
We will need to add branches in the low-level functions to handle the categorical case:
When `terracotta optimize-rasters` is used to cloud-optimize a raster, we should set a GeoTIFF tag specifying which resampling method was used for the overviews. We can then warn the user if they try to add a dataset as categorical when they used something other than `nearest` as the resampling method.
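The warning logic could be sketched roughly like this. The tag name `OVR_RESAMPLING` is hypothetical (the actual name would be decided during implementation), and the tags dict would come from something like rasterio's `DatasetReader.tags()`:

```python
import warnings

# Hypothetical GeoTIFF tag written by `terracotta optimize-rasters`
RESAMPLING_TAG = "OVR_RESAMPLING"

def warn_if_bad_resampling(tags):
    """Warn when a dataset inserted as categorical was overviewed
    with anything other than nearest-neighbor resampling."""
    method = tags.get(RESAMPLING_TAG, "").lower()
    if method and method != "nearest":
        warnings.warn(
            f"Categorical dataset has overviews resampled with {method!r}; "
            "category values may be mixed. Re-run optimize-rasters with "
            "nearest resampling."
        )
```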
We could allow users to not specify colors for the categories and then auto-generate a nice color cycle for them. This could be done with something like an `np.linspace` index into the viridis colormap.
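A sketch of that auto-generated color cycle. The three anchor colors below are a crude stand-in for the real 256-entry viridis lookup table, which Terracotta would index into instead:

```python
import numpy as np

# Crude stand-in for the viridis LUT: its start, middle, and end colors
ANCHORS = np.array([(68, 1, 84), (33, 145, 140), (253, 231, 37)], dtype=float)

def make_lut(anchors, size=256):
    """Interpolate anchor colors into a size-entry RGB lookup table."""
    x = np.linspace(0, len(anchors) - 1, size)
    channels = [np.interp(x, np.arange(len(anchors)), anchors[:, i]) for i in range(3)]
    return np.stack(channels, axis=1).astype(np.uint8)

def auto_category_colors(n_categories, lut=make_lut(ANCHORS)):
    """Pick n evenly spaced colors via an np.linspace index into the LUT."""
    indices = np.linspace(0, len(lut) - 1, n_categories).astype(int)
    return [tuple(int(v) for v in lut[i]) for i in indices]
```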
Thoughts:

- I agree with renaming the `legend` API endpoint to `colormap`, since we already have colormaps.
- `categories` might be a better name than `legend`, since it makes it clearer that it doesn't make sense to pass non-categorical data.
- The category information could also be returned by `/metadata`, which shouldn't be much of an issue unless there are thousands of categories in an image.
- Colors wouldn't have to be passed to the `insert` API anymore. Since it doesn't make sense to specify colors for only some pixel values, it could just accept two arguments (`categories` and `colormap`, for instance, where `colormap` can either be the name of one that is built-in, or an actual mapping).

More problems:
- `/rgb` doesn't make sense if one of the supplied datasets is categorical, so it should return an error. However, this is confusing from a front-end perspective: why do some datasets work and others don't? To find out which datasets are categorical, the user would have to call `/metadata` on each one of them beforehand.
- Any stretch range supplied to `/singleband` would have to be ignored silently.

Recipe to create categorical datasets:
- Use a key structure like `[type, sensor, date, band]`
- Ingest categorical data with `type=categorical`, and other data with `type=index` or `type=reflectance` or whatever
- Supply category names and pixel values as `extra_metadata`, in the form of `{category: pixel_value}`

A frontend can then:

- find all categorical datasets via `/datasets?type=categorical`
- read the category mapping from `/metadata` (includes ingested `extra_metadata`)
- request correctly colored tiles via `/singleband/categorical/S2/20180820/classification/{z}/{x}/{y}.png?colormap={pixel_value: color, ...}` (supplying a mapping like this suppresses stretching and uses nearest resampling)

This requires explicit color mappings to be supported via `/singleband?colormap=...`.

Remaining problems:

- Categorical datasets get no special treatment from the API (they are only discoverable through their keys in `/datasets`)
- Users can still call `/rgb` and `/singleband` without a manual colormap and receive mild to extreme garbage
- There is no explicit legend support (category names and colors are not served by `/legend`)

Whether we should go for this or not depends on how explicit we want to be in supporting categorical data. Is it a niche use case or a core feature? Can we afford to annoy the users a little with this somewhat hacky recipe?
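The frontend side of the recipe could look roughly like this. Category names, pixel values, and colors are made up, and the `colormap` query parameter follows the proposal in this thread rather than any released API:

```python
import json
from urllib.parse import urlencode

# Category mapping as it would be ingested into extra_metadata
extra_metadata = {"forest": 1, "water": 2, "urban": 3}

# The frontend picks (or receives) a hex color per category
colors = {"forest": "228b22", "water": "1e90ff", "urban": "808080"}

# Build the {pixel_value: color} mapping proposed for the colormap parameter
explicit_colormap = {extra_metadata[name]: color for name, color in colors.items()}

query = urlencode({"colormap": json.dumps(explicit_colormap)})
tile_url = (
    "/singleband/categorical/S2/20180820/classification"
    f"/{{z}}/{{x}}/{{y}}.png?{query}"
)
```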
Implemented. We'll see how this recipe works in practice. If it proves to be too cumbersome we can still introduce explicit support for categories by supplying them directly to `driver.ingest`.
Challenges: