nasa / EMIT-Data-Resources

This repository provides guides, short how-tos, and tutorials to help users access and work with data from the Earth Surface Mineral Dust Source Investigation (EMIT) mission.
Apache License 2.0

Fix handling for HTTPFileSystem URIs #24

Closed alexgleith closed 1 year ago

alexgleith commented 1 year ago

In order to load EMIT data over HTTP, it's best to use a token to authenticate using fsspec as documented in this comment.

The code below works, but only with the attached fix applied.

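from fsspec.implementations.http import HTTPFileSystem
from emit_tools import emit_xarray  # emit_tools.py from this repository, assumed to be on the import path

# `token` is assumed to already hold a valid Earthdata Login bearer token
s3_url = "s3://lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230123T004529_2302216_003/EMIT_L2A_RFL_001_20230123T004529_2302216_003.nc"
# Swap the S3 scheme for the LP DAAC HTTPS endpoint; the bucket and key stay the same
https_url = s3_url.replace("s3://", "https://data.lpdaac.earthdatacloud.nasa.gov/")

# Send the token with every request made through this filesystem
fs = HTTPFileSystem(headers={
    "Authorization": f"bearer {token}"
})
# ds = xr.open_dataset(fs.open(https_url))
ds = emit_xarray(fs.open(https_url))
ds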
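
The S3-to-HTTPS rewrite above can also be wrapped in a small helper that rejects anything that is not an `s3://` URL. This is only a sketch: the name `s3_to_https` and the `https_root` default are illustrative and are not part of emit_tools.py or fsspec.

```python
from urllib.parse import urlsplit

# HTTPS endpoint used in the snippet above
LPDAAC_HTTPS_ROOT = "https://data.lpdaac.earthdatacloud.nasa.gov"


def s3_to_https(s3_url: str, https_root: str = LPDAAC_HTTPS_ROOT) -> str:
    """Map s3://<bucket>/<key> to <https_root>/<bucket>/<key> (hypothetical helper)."""
    parts = urlsplit(s3_url)
    if parts.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got {s3_url!r}")
    # parts.netloc is the bucket name; parts.path is the object key with a leading "/"
    return f"{https_root.rstrip('/')}/{parts.netloc}{parts.path}"
```

For the granule above, this returns the same string as the `replace` call, and it raises `ValueError` if the URL has already been converted.
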
ebolch commented 1 year ago

Thanks @alexgleith

Glad you were able to get HTTPS access working. We are planning to add instructions for HTTPS access to a future notebook. In the meantime, the easiest way for other users to access and work with an EMIT granule via HTTPS is something similar to what you posted above:

import earthaccess
import xarray
from emit_tools import emit_xarray  # emit_tools.py from this repository, assumed to be on the import path

# Use earthaccess library to login and retrieve a token
earthaccess.login()

# Provide granule URL
url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc'

# Get an https fsspec session
fs = earthaccess.get_fsspec_https_session()

# Open granule using xarray or the emit_xarray function from emit_tools.py
# Work with the dataset inside this block: xarray reads lazily, so the file must stay open
with fs.open(url) as file:

    # Using xarray (only reads root group)
    dataset = xarray.open_dataset(file)

    # Or using the emit_xarray function
    dataset = emit_xarray(file, ortho=False)
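
Both granule URLs in this thread share the layout `<endpoint>/<bucket>/<collection>/<granule>/<granule>.nc`. The sketch below builds a URL from a granule ID on that assumption. The function name and its defaults are hypothetical, and the layout is inferred only from the two URLs shown here, so it may not hold for other products.

```python
def emit_granule_url(
    granule_id: str,
    collection: str = "EMITL2ARFL.001",
    bucket: str = "lp-prod-protected",
    root: str = "https://data.lpdaac.earthdatacloud.nasa.gov",
) -> str:
    """Build an HTTPS URL for an EMIT granule (hypothetical helper; layout inferred from this thread)."""
    # The granule ID appears twice: once as the directory, once as the file stem
    return f"{root}/{bucket}/{collection}/{granule_id}/{granule_id}.nc"


url = emit_granule_url("EMIT_L2A_RFL_001_20220903T163129_2224611_012")
```

The resulting `url` matches the one hard-coded above and can be passed to `fs.open(url)` in the same way.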