Closed tebadi closed 1 year ago
Small example here - essentially the standard datacube .geobox
can handle datasets with a 1 x 1 pixel dimensions, but the .odc.geobox
returns None.
Being able to reliably return a Geobox for data returned by a point geom query is important, as point data extractions are common in machine learning training data workflows.
import datacube
from shapely.geometry import Point
from datacube.utils import geometry
import odc.geo.xr
dc = datacube.Datacube()
geom_point = geometry.Geometry(geom=Point(116.604072, -31.468834), crs="EPSG:4326")
geom_buffer = geom_point.buffer(0.0001)
ds_point = dc.load(product="ga_ls5t_nbart_gm_cyear_3",
geopolygon=geom_point,
time="2010")
ds_buffer = dc.load(product="ga_ls5t_nbart_gm_cyear_3",
geopolygon=geom_buffer,
time="2010")
print("Dimensions: ", ds_point.dims, ".geobox: ", type(ds_point.geobox))
print("Dimensions: ", ds_buffer.dims, ".geobox: ", type(ds_buffer.geobox))
print("Dimensions: ", ds_point.dims, ".odc.geobox: ", type(ds_point.odc.geobox))
print("Dimensions: ", ds_buffer.dims, ".odc.geobox: ", type(ds_buffer.odc.geobox))
Dimensions: Frozen({'time': 1, 'y': 1, 'x': 1}) .geobox: <class 'datacube.utils.geometry._base.GeoBox'> Dimensions: Frozen({'time': 1, 'y': 2, 'x': 2}) .geobox: <class 'datacube.utils.geometry._base.GeoBox'> Dimensions: Frozen({'time': 1, 'y': 1, 'x': 1}) .odc.geobox: <class 'NoneType'> Dimensions: Frozen({'time': 1, 'y': 2, 'x': 2}) .odc.geobox: <class 'odc.geo.geobox.GeoBox'>
Transform is computed from coordinates, but it needs at least two elements along both dimensions to compute it. The fallback is to look at the original transform used when image was first created and extract resolution part from there. I think if you were to use odc.geo.wrap_xr or odc.geo.xr_coords to construct xarray coordinates inside datacube it would add needed attributes into xarray to be able to guess correct resolution for a 1x1 pixel array.
I believe it just adds resolution
attribute to x and y coordinates
Rasters with one of the spatial dimensions being single pixel wide can only be interpreted by odc-geo
is they have necessary metadata recorded, this is done by odc-geo
library, but clearly not done by datacube
.
Transform is computed from coordinates, but it needs at least two elements along both dimensions to compute it. The fallback is to look at the original transform used when image was first created and extract resolution part from there. I think if you were to use odc.geo.wrap_xr or odc.geo.xr_coords to construct xarray coordinates inside datacube it would add needed attributes into xarray to be able to guess correct resolution for a 1x1 pixel array.
I believe it just adds
resolution
attribute to x and y coordinates
Hey @Ariana-B @SpacemanPaul - how difficult would this be to implement in datacube
1.9? The single pixel use case is an important one for machine learning/pixel drill applications, so if there's existing tools available in odc-geo
to improve this it'd be great to use them in datacube-core
To be honest @robbibt I'm not sure if the issue is with datacube.load or with further processing that happens afterwards, also what version of odc-geo
?
It appears that odc.geo does not correctly calculate a geobox for a 1 x 1 pixel data load (
.odc.geobox
is returned as None). If the pixel size is increased to two or more then it return the correct geometry (courtesy of @robbibt) :assert data.odc.geobox.crs == data.geobox.crs assert data.odc.geobox.affine == data.geobox.affine
You can replicate this bug by running this notebook: https://bitbucket.org/geoscienceaustralia/cultivated_model/src/main/notebooks/extract_data_for_shp_custom.ipynb
The error occurs when calling
collect_training_data
function but it's thrown when running the following line:https://github.com/GeoscienceAustralia/dea-notebooks/blob/c0c7e1d73576d48063ae40d345c6fb5ca7f047e8/Tools/dea_tools/spatial.py#L276