google / Xee

An Xarray extension for Google Earth Engine
Apache License 2.0
240 stars 28 forks source link

Bounds filter fix #127

Open lbferreira opened 7 months ago

lbferreira commented 7 months ago

I also observed the same issue described by @giswqs in #118 related to the output raster with some zero-value on the image edge, resulting in some black strips, If the image footprint image.geometry() is used as the geometry.

I reproduced @giswqs issue and also tested using different image source and geometry. In all cases, the output raster has some weird patterns. Below I present some images showing the problem. After investigating, I realized that this problem was caused because, based on the geometry provided, the current code gets the bounds in EPSG:4326 and then convert the bounds to the target CRS provided by the user. It's problematic because it can result in some deformations. Thus, I change the logic to get the bounds directly in the target CRS provided by the user. In summary, instead of doing geometry.bounds() and later converting the bounds, I do geometry.bounds(maxError=1, proj=target_crs). I'm not sure if 1 is a good value for maxError but it worked well in my tests. If someone, know a better value for maxError, please let me know. At the end I present some images to show the raster after this fix.

Current problems

ee_image = ee.Image("GOOGLE/DYNAMICWORLD/V1/20220119T185709_20220119T185710_T10SFG")
ic = ee.ImageCollection(ee_image)
ds = xarray.open_dataset(ic, crs='EPSG:32610', scale=50, engine='ee', geometry=ee_image.geometry())
image = ds.isel(time=0).rename({'Y': 'y', 'X': 'x'}).transpose('y', 'x').rio.write_crs("EPSG:32610")
image['label'].plot.imshow(figsize=(10, 10))

dw_example_original

ee_image = ee.Image("GOOGLE/DYNAMICWORLD/V1/20220119T185709_20220119T185710_T10SFG")
ic = ee.ImageCollection(ee_image)

roi_coords = [(-121.1100533594971, 37.64753404740281),
 (-121.10776475768417, 37.73761854601581),
 (-121.22120321499558, 37.73938555546548),
 (-121.22335479806203, 37.649295359770846),
 (-121.1100533594971, 37.64753404740281)]
roi_crs = 'EPSG:4326'
bbox_geometry = ee.Geometry.Polygon(roi_coords, proj=roi_crs)

ds = xarray.open_dataset(ic, crs='EPSG:32610', scale=10, engine='ee', geometry=bbox_geometry)
image = ds.isel(time=0).rename({'Y': 'y', 'X': 'x'}).transpose('y', 'x').rio.write_crs("EPSG:32610")

image['label'].plot.imshow(figsize=(10, 10))
original_geometry_utm = gpd.GeoDataFrame(
    {"geometry": [Polygon(roi_coords)]}, crs='EPSG:4326'
    ).to_crs('EPSG:32610')
original_geometry_utm.plot(ax=plt.gca(), edgecolor='red', facecolor='none', linewidth=4)

dw_custom_crop_example_original

After the proposed FIX

ls_example_fix dw_example_fix dw_custom_crop_example_fix

The described fix was implemented for 2 of 3 possible cases handling geometry. According to the docs: "geometry (optional): Specify an ee.Geometry to define the regional bounds when opening the data. When not set, the bounds are defined by the CRS's 'area_of_useboundaries. If those aren't present, the bounds re derived from the geometry of the first image of the collection." The proposed fix is applied only for the geometry provided by the user or obtained in the first image. if the bounds are defined by the CRS's 'area_of_use, the current behavior was kept (get bounds and later reproject them).