google / Xee

An Xarray extension for Google Earth Engine
Apache License 2.0
240 stars 28 forks source link

Geometry bounds are updated with the projection. #144

Closed dabhicusp closed 5 months ago

dabhicusp commented 6 months ago

In the current implementation, we are utilizing geometry.bounds().getInfo() to retrieve the bound points. However, this approach yields inconsistent results when the latitude value of a point exceeds 104, resulting in the return of a constant latitude value. So I am adding the projection into the bound points to fix this issue.

Example:

ee_image = ee.Image("projects/anthromet-prod/assets/1km_uk_nimrod_composite_rainfall_radar/202202120220_nimrod_ng_radar_rainrate_composite_1km_UK")

scale = 5000
projection = ee.Projection("EPSG:4326", [1, 0, 0, 0, -1, 0]).atScale(scale)
ee_image = ee_image.reproject(projection)

point = ee.Geometry.Point(-110.2414893624401 ,104)
distance = 311.5
geometry = point.buffer(distance, proj=projection).bounds(proj=projection)

ic = ee.ImageCollection([ee_image])
ds = xarray.open_dataset(
    ic, projection=ee_image.projection(), geometry=geometry,
    ee_mask_value=-99999, engine = 'ee'
)
ds

The above ds returns only a single latitude point when utilizing the current code. However, upon updating the bounds with bounds(proj = projection), the ds accurately returns 623 latitude points.

Thank you @bbabenko for suggesting the above apporoach.

dabhicusp commented 6 months ago

@naschmitz Test cases for the above have been added.

lbferreira commented 6 months ago

I think that if the user don't provide the projection the code will break. I'm not sure, but I think I noted it when I was implementing a very similar change in a previous PR: https://github.com/google/Xee/pull/127 I was not able to add reviewers to my PR, I don't know why it was not reviewed.

dabhicusp commented 6 months ago

@lbferreira If user not passed any projection code still works but returns the wrong output which current code returns. So in that particular situation, would it be better to use the projection of the image or is the current implementation okay to use? What you say @naschmitz @jdbcode.

Code:

ee_image = ee.Image("projects/anthromet-prod/assets/1km_uk_nimrod_composite_rainfall_radar/202202120220_nimrod_ng_radar_rainrate_composite_1km_UK")
scale = 5000
projection = ee.Projection("EPSG:4326", [1, 0, 0, 0, -1, 0]).atScale(scale)
ee_image = ee_image.reproject(projection)
point = ee.Geometry.Point(-110.2414893624401 ,104)
distance = 311.5
geometry = point.buffer(distance, proj=projection).bounds(proj=projection)

# No projection
print(geometry.bounds(1, proj=None).getInfo())

# Projection from the image
print(geometry.bounds(1, proj=ee_image.projection()).getInfo())

# Projection passed explicitely.
print(geometry.bounds(1, proj=self.projection).getInfo())
jdbcode commented 6 months ago

I don't think we should make any changes based on this case.

  1. Latitude above 90 and below -90 may not be valid. When working in EPSG:4326 we should not expect EE or GDAL to handle them correctly.

  2. Setting projection on bounds() does not change the result. In your example, when you are setting the projection in point.buffer – that is what is changing the result.

  3. You getting 1 pixel back with the current implementation appears to be working as intended. Your geometry request is the bounds of a buffered point. When you call .buffer without a projection, the default unit is meters. In your example, prior to setting the buffer projection, 311.5 is interpreted as meters radius from which you get a ~622m x 622m bounding box, which is smaller than the 5000m x 5000m meter scale of the data - i.e. the buffered point region fits inside of a single pixel, so you only get one back. However, when you specify the projection in the buffer call as EPSG:4326, which has a unit of degrees, the buffer radius is now interpreted as 311 degrees, so the bounding box is 622 degrees by 622 degrees, which is wrapping around the globe almost twice giving a region that is probably not what you want.

dabhicusp commented 6 months ago

Hello, @jdbcode. I have encountered an additional practical example in which we required the use of geometry.bounds(1, proj=self.projection) rather than geometry.bounds().

image = ee.ImageCollection('NASA/GPM_L3/IMERG_V06').filterDate('2024-01-01', '2024-01-02').first().select('precipitationCal')
geometry = ee.Geometry.Rectangle([[-180, -80], [180, 80]],None,geodesic=False)

ic = ee.ImageCollection([image])
ds = xarray.open_dataset(ic, geometry=geometry, projection=image.projection(), engine=xee.EarthEngineBackendEntrypoint)
first_variable_values = ds['precipitationCal'].values[0]
plt.imshow(first_variable_values.T, vmin=0, vmax=2)
# shift by 180 degrees
image = ee.ImageCollection('NASA/GPM_L3/IMERG_V06').filterDate('2024-01-01', '2024-01-02').first().select('precipitationCal')
geometry = ee.Geometry.Rectangle([[0, -80], [360, 80]],None,geodesic=False)

ic = ee.ImageCollection([image])
ds = xarray.open_dataset(ic, geometry=geometry, projection=image.projection(), engine=xee.EarthEngineBackendEntrypoint)
first_variable_values = ds['precipitationCal'].values[0]
plt.imshow(first_variable_values.T, vmin=0, vmax=2)

When rotating the geometry by 180 degrees, the output data should also be rotated by 180 degrees. However, the current XEE code does not rotate the data, while the code from the PR does rotate the data and passes the above test case.

Xee's main branch's code:

image

Current PR's code:

image
jdbcode commented 6 months ago

geometry = ee.Geometry.Rectangle([[0, -80], [360, 80]],None,geodesic=False)

Geom constructors use EPSG:4326 as default CRS, which has longitude range of -180 to 180 with the prime meridian at 0. You are specifying geometry whose east limit is the prime meridan and wraps around the globe fully, crossing the antimeridian before meeting itself back at the prime meridian. Crossing the antimeridian is likely to give unexpected results. If you are trying to extract data in such a way that the center is at -180 (the antimeridian) you may need to download the data into two parts (east of the antimeridian and west of the antimeridian) and put them together client-side (see antimeridian cutting in the GeoJSON docs), or use a CRS whose center is the antimeridian (maybe something like EPSG:3832?).

jdbcode commented 5 months ago

I still worry that it will introduce coordinate precision issues that might affect some people. But if it fixes your case, that's great, and if it causes issues for other people then we'll have confirmed cases to help us figure out a better solution.