google / Xee

An Xarray extension for Google Earth Engine
https://xee.rtfd.io
Apache License 2.0
251 stars 29 forks source link

Opening a dataset with a projected CRS does not convert the dataset's coordinates #96

Closed boothmanrylan closed 10 months ago

boothmanrylan commented 11 months ago

This seems to lead to errors later on when attempting to write results back to a geotiff using rioxarray.

Full notebook to reproduce: https://colab.research.google.com/drive/149vNtVZ0lGJn5PXxfn7oaTS1VoJX0Qec?usp=sharing

Opening a dataset without specifying a crs:

unprojected_ds = xr.open_dataset(
    col,
    engine="ee",
    scale=0.0025,  # degrees
    geometry=geometry,
)

And then writing to a geotiff using:

def to_raster(ds, fname, crs="epsg:32610", x_dim="X", y_dim="Y"):
    ds = ds.transpose(y_dim, x_dim)
    ds = ds.rio.set_spatial_dims(x_dim=x_dim, y_dim=y_dim)
    ds = ds.rio.write_crs(crs)
    ds = ds.rio.reproject(crs)
    ds.rio.to_raster(fname)

to_raster(
    unprojected_ds.mean(dim="time"),
    "unprojected_dataset.tif",
    crs="epsg:4326",
    x_dim="lon",
    y_dim="lat",
)

And then uploading that geotiff to earth engine produces an image that has the right dimensions and is located properly: xarray_unprojected_on_map

Opening a dataset with a specific crs:

ds = xr.open_dataset(
    col,
    engine="ee",
    scale=10,
    crs="epsg:32610",
    geometry=geometry,
)
to_raster(ds.mean(dim="time"), "dataset.tif")

Writing to geotiff and then uploading to earth engine produces an image that is located in the wrong location (over the equator in the pacific ocean, should be over san francisco), has the wrong dimensions (225x181 vs 203x206), and the wrong scale (0.000103m vs 10m): xarray_original

Transforming the coordinates of a dataset opened with a crs:

def transform_coords(dataset):
    x = dataset.coords["X"].values
    y = dataset.coords["Y"].values

    _x = np.linspace(np.min(x), np.max(x), y.shape[0])
    _y = np.linspace(np.min(y), np.max(y), x.shape[0])

    transformer = pyproj.Transformer.from_crs("epsg:4326", crs, always_xy=True)
    x_prime = transformer.transform(x, _y)[0]
    y_prime = transformer.transform(_x, y)[1]

    return dataset.assign_coords({"X": x_prime, "Y": y_prime})

transformed_ds = transform_coords(ds)
to_raster(transformed_ds.mean(dim="time"), "transformed_dataset.tif")

And then uploading to earth engine, produces an image that is properly located and has the right dimensions (the scale is slightly off, but I think that has to do with the hacky method I'm using to transform the coordinates here): xarray_transformed_on_map

dabhicusp commented 10 months ago

Writing to geotiff and then uploading to earth engine produces an image that is located in the wrong location

Hello @boothmanrylan you haven't passed EPSG:4326 into the to_raster.

ds = xr.open_dataset(
    col,
    engine="ee",
    scale=10,
    crs="epsg:32610",
    geometry=geometry,
)
to_raster(ds.mean(dim="time"), "dataset.tif")

Due to this, the raster created is in projection EPSG:32610. Furthermore raster created for unprojected_ds is in EPSG:4326. In reality images are plotted in the correct location as per its projection (plotting depends on image projection).

So If you pass EPSG:4326 into the to_raster for the above code then it will match with unprojected_ds raster file.

And we do agree that dataset's coordinates not comes into given crs(like for 'EPSG:32610' it should be in 'm') instead it's coming into degree always.