google / Xee

An Xarray extension for Google Earth Engine
Apache License 2.0

Xee and SSL #128

Closed ZZMitch closed 7 months ago

ZZMitch commented 7 months ago

Hello,

Firstly, thanks for a great product! I use both GEE and xarray a lot, so this is an exciting development for me.

I should preface my issue by noting that I work on a government network that can make accessing cloud remote sensing datasets difficult. Usually GEE is no problem, but I have run into similar SSL issues with STAC access. I have been able to work around those, either by verifying against a specific certificate or by disabling SSL certificate verification. For example, when loading a STAC xarray into memory with pyproj and stackstac:

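import dask.diagnostics
import rasterio

with dask.diagnostics.ProgressBar():
    with rasterio.Env(GDAL_HTTP_UNSAFESSL='YES'):  # skip GDAL's SSL certificate checks
        data = aoi.compute()  # aoi: the lazy stackstac array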

Here is a basic example of the SSL issues I currently run into with Xee:

import ee
import xarray

ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')

ic = ee.ImageCollection('ECMWF/ERA5_LAND/HOURLY').filterDate('1992-10-05', '1993-03-31')
ds = xarray.open_dataset(ic, engine='ee', crs='EPSG:4326', scale=1)

ds_mon = ds.groupby(ds.time.dt.month).mean('time')

On the last line, I get this error:

SSLError: HTTPSConnectionPool(host='earthengine-highvolume.googleapis.com', port=443): Max retries exceeded with url: /v1/projects/earthengine-legacy/image:computePixels (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')))

Any help on how to proceed?

naschmitz commented 7 months ago

Hi Mitchell, thanks for the detailed writeup.

One thing you could try is disabling SSL cert validation on the HTTP client passed into ee.Initialize.

import httplib2

http_transport = httplib2.Http(disable_ssl_certificate_validation=True)
ee.Initialize(
    opt_url='https://earthengine-highvolume.googleapis.com',
    http_transport=http_transport)

Let me know if that works.
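
If the failure comes from a network proxy that re-signs TLS traffic with its own certificate, a less drastic option than turning validation off may be to point the HTTP client at a CA bundle that includes the network's root certificate. The sketch below is only an illustration under that assumption: `build_http_kwargs`, `initialize_ee`, and the bundle path are hypothetical names, while `ca_certs` and `disable_ssl_certificate_validation` are existing `httplib2.Http` parameters. It will not help if the connection is being dropped for some other reason.

```python
def build_http_kwargs(ca_bundle_path=None, insecure=False):
    """Build keyword arguments for httplib2.Http.

    ca_bundle_path: hypothetical path to a PEM file containing the
        network's root CA certificate.
    insecure: if True and no bundle is given, disable certificate
        validation entirely (last resort).
    """
    if ca_bundle_path:
        # Keep validation on, but trust the supplied CA bundle.
        return {"ca_certs": ca_bundle_path}
    if insecure:
        return {"disable_ssl_certificate_validation": True}
    return {}  # fall back to httplib2's defaults


def initialize_ee(ca_bundle_path=None, insecure=False):
    # Imported here so the helper above can be used without these packages.
    import ee
    import httplib2

    transport = httplib2.Http(**build_http_kwargs(ca_bundle_path, insecure))
    ee.Initialize(
        opt_url="https://earthengine-highvolume.googleapis.com",
        http_transport=transport,
    )


# Example call (the path is a placeholder, not a real file):
# initialize_ee(ca_bundle_path="/path/to/corporate-root-ca.pem")
```

Keeping validation enabled with a trusted bundle preserves protection against interception, which is why it is tried before the `insecure` flag here.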

naschmitz commented 7 months ago

I'm going to close this issue. Please reopen if you have any more issues!

ZZMitch commented 7 months ago

Hey @naschmitz, sorry for the late response - I was not able to get to this yesterday. Thanks for your suggestion!

I tried it out and it did have an impact, but I am not sure it solved the issue. Let me explain...

I first added your suggestion and re-ran the same code as above. Rather than raising an SSL error, the final line ran for 5-10 minutes and then crashed my Jupyter kernel. Next, I tried making the problem smaller (reducing the ic to a single day and just taking the mean, rather than using groupby). This worked: after about 2 minutes I received the 'ds_mon' variable. However, I also tested this smaller case without your suggestion (after restarting the kernel) and it also worked.

Let me know if you have any other ideas. I may also consider testing other scenarios (e.g., with a different image collection and xarray manipulations) and seeing if I get similar results.
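
One way to reason about why the smaller request succeeds while the full one crashes the kernel is to estimate how much data each request implies. The sketch below is a rough back-of-the-envelope calculation, not a description of what Xee actually does: it assumes a global 1-degree grid (360 x 180 cells, from `scale=1` in EPSG:4326), 4-byte values, one image per hour, and an exclusive end date as in `filterDate`. The function name and the per-band framing are hypothetical; the real total also depends on how many bands the collection has and on how the data is chunked.

```python
from datetime import date


def estimate_request_bytes(start, end, n_bands=1,
                           width=360, height=180, bytes_per_value=4):
    """Rough uncompressed size of an hourly collection between two dates.

    Assumes one image per hour, with `end` exclusive, on a
    width x height grid of `bytes_per_value`-byte pixels.
    """
    hours = (end - start).days * 24
    return hours * n_bands * width * height * bytes_per_value


# Date range from the original example, counted per band.
full = estimate_request_bytes(date(1992, 10, 5), date(1993, 3, 31))
# The reduced test case: a single day.
one_day = estimate_request_bytes(date(1992, 10, 5), date(1992, 10, 6))

print(f"full range: {full / 2**30:.2f} GiB per band")
print(f"single day: {one_day / 2**20:.2f} MiB per band")
print(f"ratio: {full // one_day}x")
```

Under these assumptions the full range is a couple of orders of magnitude larger than the single-day case, and the gap grows with every additional band, so memory pressure is at least a plausible contributor alongside any SSL problem.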