cordmaur closed this issue 5 months ago
@cordmaur thanks for the report. Can you please provide a full error trace and the calling code?
Alright, so the problem is here: resolve_chunk_shape should be called with the largest dtype across all bands, not with the dtype coming from the user configuration, which might be per-band or not even present. One can also use dtype=None; it's only used by Dask to resolve "auto" chunks.
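A minimal sketch of the "largest dtype across all bands" idea, assuming NumPy is available. The helper name largest_dtype, the float32 fallback, and the band names are hypothetical and not part of odc-stac; the point is only that a per-band mapping can be collapsed into the single dtype that a chunk-resolution step expects.

```python
import numpy as np


def largest_dtype(dtype, default="float32"):
    """Collapse a user-supplied dtype (single value, per-band dict, or None)
    into one np.dtype: the one with the largest itemsize.

    Hypothetical helper for illustration; not odc-stac's implementation.
    """
    if dtype is None:
        # Nothing configured: fall back to an assumed default.
        return np.dtype(default)
    if isinstance(dtype, dict):
        if not dtype:
            # Empty mapping behaves like "nothing configured".
            return np.dtype(default)
        # Per-band mapping: pick the widest dtype so that "auto" chunk
        # sizing is based on the worst-case bytes per pixel.
        return max((np.dtype(d) for d in dtype.values()), key=lambda d: d.itemsize)
    return np.dtype(dtype)


# Hypothetical per-band configuration.
per_band = {"red": "uint16", "SCL": "uint8", "ndvi": "float32"}
chunk_dtype = largest_dtype(per_band)
print(chunk_dtype, chunk_dtype.itemsize)
```

Picking by itemsize is a deliberate simplification: only the number of bytes per pixel matters when estimating chunk sizes, so ties between dtypes of equal width are harmless here.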
There is currently a major refactor on the way, so I'm not sure whether this will be addressed before that merges.
In the meantime you can use stac_cfg= to patch data type information missing from the STAC source, something like:
sentinel-2-l2a:  #< or whatever collection you are loading
  assets:
    "*": {data_type: uint16, nodata: 0}
    SCL: {data_type: uint8, nodata: 0}
    visual: {data_type: uint8, nodata: 0}
but as a Python dict, not a YAML string.
This is a mechanism to patch missing raster extension metadata: https://github.com/stac-extensions/raster
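To make the intended reading of that mapping concrete, here is a small sketch. It assumes that "*" acts as the default for any asset not listed by name; the helper asset_defaults and the asset name B04 are hypothetical and only illustrate that reading, they are not odc-stac's lookup code.

```python
# The same configuration as above, written as a Python dict.
stac_cfg = {
    "sentinel-2-l2a": {
        "assets": {
            "*": {"data_type": "uint16", "nodata": 0},
            "SCL": {"data_type": "uint8", "nodata": 0},
            "visual": {"data_type": "uint8", "nodata": 0},
        }
    }
}


def asset_defaults(cfg, collection, asset):
    """Return the patch entry for one asset, falling back to "*".

    Hypothetical helper: shows how the wildcard is meant to be read,
    not how odc-stac resolves it internally.
    """
    assets = cfg.get(collection, {}).get("assets", {})
    # A named entry wins; otherwise use the wildcard; otherwise nothing.
    return assets.get(asset, assets.get("*", {}))


for name in ("B04", "SCL", "visual"):
    print(name, asset_defaults(stac_cfg, "sentinel-2-l2a", name))
```

Under that assumption, every asset ends up with a known data type, which is the information that was missing from the STAC source.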
@Kirill888, thank you for the clarification.
The suggested snippet solved the problem:
stac_cfg = {
    "sentinel-2-l2a": {
        "assets": {
            "*": {"data_type": "uint16", "nodata": 0},
            "SCL": {"data_type": "uint8", "nodata": 0},
            "visual": {"data_type": "uint8", "nodata": 0},
        }
    }
}
I'm closing the issue.
In the documentation of the load function, it says that the dtype can be specified per band. Following the docstring:
dtype: Union[DTypeLike, Dict[str, DTypeLike], None] = None,
I'm creating a per-band dtype dict. However, I'm getting an error:
ValueError: entry not a 2- or 3- tuple
Investigating the code, it seems the error is in the call to _resolve_chunk_shape(len(tss), gbox, chunks, dtype). normalize_chunks takes just one dtype and tries to cast it to an np.dtype. That's where the error is raised.
I could work around this issue by hard-coding just one dtype for the _resolve_chunk_shape function (fig below), but I don't know what this "chunk dtype" is meant to be, so I can't yet propose a final solution as a PR.
My odc.stac version is 0.3.9.
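The failure can be reproduced with NumPy alone, which may help when writing a regression test. This is a sketch under an assumption: the dict from the report is not shown, so the per-band mapping below, with string dtypes and made-up band names, is hypothetical.

```python
import numpy as np

# Hypothetical per-band dtype mapping, of the Dict[str, DTypeLike] form.
per_band = {"red": "uint16", "SCL": "uint8"}

try:
    # NumPy tries to read the dict as a structured-dtype field
    # specification, where each value must be a tuple, so this fails.
    np.dtype(per_band)
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")

# Each entry on its own is a perfectly valid dtype.
per_band_dtypes = {name: np.dtype(d) for name, d in per_band.items()}
print(per_band_dtypes)
```

So the per-band dict is fine as input to load; it just cannot be passed unchanged to code that expects a single dtype.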