Open gerritholl opened 4 weeks ago
I think fill values don't mean anything after palettize. I agree something has to happen with _FillValue
(maybe remove it), but doesn't palettize force all inputs to some color? The other option is a PA (versus a P) image where the Alpha channel holds the fill mask.
palettize
forces all inputs to some color, including inputs for "missing data". That does not rule out defining in the resulting image that missing data have pixel value 0 or 255 or something else, like we do for single channel mode L images. A client can treat those pixels differently on top of the colour displayed. For example, NinJo will show a tooltip with the pixel value for all valid pixels, but not for pixels with a value equal to fill value.
Independently of that question, it would be desirable if the user could control what value nan gets after applying palettize. That they get value 255 is ultimately due to numpy.digitize
putting nan in the final bin for being outside the range, but from a user's perspective, invalid pixels remain invalid pixels.
I'm not sure if nan
is guaranteed to go into the final bin or if this is an accident of implementation. If it is not guaranteed, there is even more reason to try to control this better from trollimage.
I think nan
is guaranteed to go to the final bin, because nan
is sorted to the end. So we could optionally set the fill value of the palettize result to the value of the final bin.
I would propose to (optionally?) in XRImage.palettize
set the fill value of the result to colormap.values.shape[0]-1
, which is where nan (as well as inf and valid out-of-range values) goes. I think that in this case, satpy's save_dataset(..., fill_value=x)
would result in those being written with value x
to the output file.
Correct me if I misunderstood, but you're saying that right now invalid values or at least NaNs specifically are already resulting in the maximum/last color in the colormap. The main issue is that _FillValue
is no longer in sync with the data it is attached to (in .attrs
). It seems reasonable to update the _FillValue
as you say to point to the last index.
This gets weird with some cases like category products where 0 (or other value) are known to be invalid and represent invalid. In those cases I think the assumption is that the user or rather the person who created the enhancement configuration knew this and the appropriate color in the colormap for that value represents "invalid" to them. But in these existing cases I don't think there is any indication of what is invalid or valid as far as geotiff metadata.
Correct me if I misunderstood, but you're saying that right now invalid values or at least NaNs specifically are already resulting in the maximum/last color in the colormap.
Correct. This is guaranteed by nan being always sorted after any finite values.
The main issue is that
_FillValue
is no longer in sync with the data it is attached to (in.attrs
). It seems reasonable to update the_FillValue
as you say to point to the last index.
Yes.
Downside: the last index also represents valid values close to the maximum of the valid range. Those get mixed already now, but when we don't have any invalid values, the user might not want to define any valid fill value, in order not to lose any valid data.
Describe the bug
XRImage.palettize
changes the dtype of the data to int64, but if the input data had a fill value defined, this is not changed. This may result in a fill value that is inconsistent with the dtype.To Reproduce
Expected behavior
I expect that the fill value is of the same dtype as the data.
Actual results
int64 nan
Environment Info:
Additional context
This may pose a problem in Satpy when loading the
cloud_top_height
composite with thenwcsaf-geo
reader and writing it out as a mode P image. Fill values in the input will be written with pixel value 255, but thefill_value
argument inget_enhanced_image
will have no influence and the fill value in the metadata of the produced image will not correspond to the actual fill value used in the input data.