opendatacube / odc-geo

GeoBox and geometry utilities extracted from datacube-core
https://odc-geo.readthedocs.io/en/latest/
Apache License 2.0
80 stars 12 forks source link

odc.geo does not correctly calculate/load a geobox for a 1 x 1 pixel data #107

Closed tebadi closed 1 year ago

tebadi commented 1 year ago

It appears that odc.geo does not correctly calculate a geobox for a 1 x 1 pixel data load (.odc.geobox is returned as None). If the pixel size is increased to two or more then it return the correct geometry (courtesy of @robbibt) :

assert data.odc.geobox.crs == data.geobox.crs assert data.odc.geobox.affine == data.geobox.affine

You can replicate this bug by running this notebook: https://bitbucket.org/geoscienceaustralia/cultivated_model/src/main/notebooks/extract_data_for_shp_custom.ipynb

The error occurs when calling collect_training_data function but it's thrown when running the following line:

https://github.com/GeoscienceAustralia/dea-notebooks/blob/c0c7e1d73576d48063ae40d345c6fb5ca7f047e8/Tools/dea_tools/spatial.py#L276

robbibt commented 1 year ago

Small example here - essentially the standard datacube .geobox can handle datasets with a 1 x 1 pixel dimensions, but the .odc.geobox returns None.

Being able to reliably return a Geobox for data returned by a point geom query is important, as point data extractions are common in machine learning training data workflows.

import datacube
from shapely.geometry import Point
from datacube.utils import geometry
import odc.geo.xr

dc = datacube.Datacube()

geom_point = geometry.Geometry(geom=Point(116.604072, -31.468834), crs="EPSG:4326")
geom_buffer = geom_point.buffer(0.0001)

ds_point = dc.load(product="ga_ls5t_nbart_gm_cyear_3",
        geopolygon=geom_point,
        time="2010")

ds_buffer = dc.load(product="ga_ls5t_nbart_gm_cyear_3",
        geopolygon=geom_buffer,
        time="2010")

print("Dimensions: ", ds_point.dims, ".geobox: ", type(ds_point.geobox))
print("Dimensions: ", ds_buffer.dims, ".geobox: ", type(ds_buffer.geobox))

print("Dimensions: ", ds_point.dims, ".odc.geobox: ", type(ds_point.odc.geobox))
print("Dimensions: ", ds_buffer.dims, ".odc.geobox: ", type(ds_buffer.odc.geobox))

Dimensions: Frozen({'time': 1, 'y': 1, 'x': 1}) .geobox: <class 'datacube.utils.geometry._base.GeoBox'> Dimensions: Frozen({'time': 1, 'y': 2, 'x': 2}) .geobox: <class 'datacube.utils.geometry._base.GeoBox'> Dimensions: Frozen({'time': 1, 'y': 1, 'x': 1}) .odc.geobox: <class 'NoneType'> Dimensions: Frozen({'time': 1, 'y': 2, 'x': 2}) .odc.geobox: <class 'odc.geo.geobox.GeoBox'>

Kirill888 commented 1 year ago

Transform is computed from coordinates, but it needs at least two elements along both dimensions to compute it. The fallback is to look at the original transform used when image was first created and extract resolution part from there. I think if you were to use odc.geo.wrap_xr or odc.geo.xr_coords to construct xarray coordinates inside datacube it would add needed attributes into xarray to be able to guess correct resolution for a 1x1 pixel array.

I believe it just adds resolution attribute to x and y coordinates

Kirill888 commented 1 year ago

Rasters with one of the spatial dimensions being single pixel wide can only be interpreted by odc-geo is they have necessary metadata recorded, this is done by odc-geo library, but clearly not done by datacube.

robbibt commented 1 year ago

Transform is computed from coordinates, but it needs at least two elements along both dimensions to compute it. The fallback is to look at the original transform used when image was first created and extract resolution part from there. I think if you were to use odc.geo.wrap_xr or odc.geo.xr_coords to construct xarray coordinates inside datacube it would add needed attributes into xarray to be able to guess correct resolution for a 1x1 pixel array.

I believe it just adds resolution attribute to x and y coordinates

Hey @Ariana-B @SpacemanPaul - how difficult would this be to implement in datacube 1.9? The single pixel use case is an important one for machine learning/pixel drill applications, so if there's existing tools available in odc-geo to improve this it'd be great to use them in datacube-core

Kirill888 commented 1 year ago

To be honest @robbibt I'm not sure if the issue is with datacube.load or with further processing that happens afterwards, also what version of odc-geo?