Closed RSchueder closed 1 year ago
@gjoseph92 Thanks very much for the initial review. I was wondering if you had any additional suggestions for this MR? Thanks!
@gjoseph92 comments round 2 addressed, except for the conflict with #208, which I will probably address after it is merged to master (assuming you would like to do #208 first).
@gjoseph92 sounds like a good plan, I have a functional test in my company repo that should validate this use case. I will install from main, demonstrate, and get back to you.
@gjoseph92 When installing this branch in my system, the following test passes:
from common.eo.catalog.eo_catalog import EOCatalog
from common.eo.catalog.stac.config import Catalog
from common.geometry import polygon_from_bounds
from common.types import EOProductType, TimeRange
def test_scale_offset():
aoi = [-97.8028, 49.8776, -97.1300, 50.0845]
toi = TimeRange(value='2021-12-01/2022-03-01')
bands = ["red"]
aoi = polygon_from_bounds(aoi)
client = EOCatalog(product_type=EOProductType('sentinel-2-l2a'), source=Catalog('skyfox'))
stac_items = client.search(
aoi=aoi,
toi=toi,
cloud_cover=100,
)
x_idx = 500
y_idx = 500
datacube = client.stack(stac_items[:1], bands, rescale=False)
unscaled_pixel = datacube[0, 0, y_idx, x_idx].values
assert unscaled_pixel == 9288
datacube = client.stack(stac_items[:1], bands, rescale=True)
scaled_pixel = datacube[0, 0, y_idx, x_idx].values
assert scaled_pixel == (unscaled_pixel * 0.0001) - 0.1
You can ignore the method for obtaining the items that constitute the content of the test aside from recognizing that they are sentinel-2-l2a
items that have scale and offset data in their metadata.
stackstac.stack
is called in client.stack()
in the following way
stack = stackstac.stack(
list(items),
assets=query_band_names,
epsg=modal_epsg_code,
resolution=product_type.RESOLUTION,
dtype=product_type.DTYPE if not rescale else 'float64',
fill_value=product_type.FILL_VALUE,
bounds_latlon=bounds_latlon,
rescale=rescale
)
If you want to run it yourself or encode it in the repo somehow I can serialize the items to omit the search component of the test.
What do you think of this?
@gjoseph92 I also wanted to point out that this is a breaking change for my existing codebase:
File "/usr/local/lib/python3.10/dist-packages/stackstac/rio_reader.py", line 408, in read
result *= scale
File "/usr/local/lib/python3.10/dist-packages/numpy/ma/core.py", line 4285, in __imul__
self._data.__imul__(np.where(self._mask, self.dtype.type(1),
numpy.core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'multiply' output from dtype('float64') to dtype('uint16') with casting rule 'same_kind'
I suspect that this is caused by me specifying dtype='uint16'
in stackstac.stack
, but the default for rescale
is True
, causing the float
scale
to be multiplied by the int
result
. What do you think about this behaviour?
What do you think of this?
Seems good! Thanks for testing it out.
I suspect that this is caused by me specifying
dtype='uint16'
instackstac.stack
, but the default forrescale
isTrue
, causing thefloat
scale
to be multiplied by theint
result
. What do you think about this behaviour?
Interesting. So right now, the doctoring for rescale
says: https://github.com/gjoseph92/stackstac/blob/7836a363b2609216890b0e20e289f8ff71cf320d/stackstac/stack.py#L207-L209
However, now that we know the scaling factors ahead of time, I think we can actually do better than this. If you specify dtyple=uint16
, and we know any scaling or offset factor is floating-point, we could raise an error about this during stack
explaining the problem and telling you to either set rescale=False
or change the dtype
. That would be much more helpful behavior, especially since this error from NumPy is hard to interpret.
Any interest in opening a PR to do that?
@gjoseph92 the PR for the fix you describe above has been created here: https://github.com/gjoseph92/stackstac/pull/209
FYI @RSchueder, this is now released in 0.5.0. Thanks for your contribution!
This MR enables users to set
rescale=True
instackstac.stack
and have thescale
andoffset
values be taken from the STAC Item with priority over the COG.