nasa / EMIT-Data-Resources

This repository provides guides, short how-tos, and tutorials to help users access and work with data from the Earth Surface Mineral Dust Source Investigation (EMIT) mission.
Apache License 2.0
132 stars 75 forks source link

geotransform and spatial_ref for orthorectification, not original data #43

Closed zxdawn closed 11 months ago

zxdawn commented 11 months ago

According to the notebook, I see the geotransform and spatial_ref attrs are used for orthorectification. Note that GT[2] and GT[4] are zero in the attrs. So, does that mean we can only use geotransform and spatial_ref for orthorectified data?

Actually, I have tried to use pyresample's AreaDefinition to define the area. The derived lon/lat values are different from these in the location group of NC file. I suppose that's because these attrs from the NetCDF file only works for orthorectified data.

If I'm wrong, please feel free to correct me. I'm curious how to use geotransform and spatial_ref for original data, if that should work ...

zxdawn commented 11 months ago

Here's the code to reproduce:

import xarray as xr
from pyresample.geometry import AreaDefinition

filename = 'EMIT_L1B_OBS_001_20230601T062832_2315205_005.nc'

ds = xr.open_dataset(filename)
ds_loc = xr.open_dataset(filename, group='location')

area_def = AreaDefinition('id', 'description', 'proj_id',
                          ds.attrs['spatial_ref'], ds.sizes['crosstrack'], ds.sizes['downtrack'],
                          [ds.attrs['westernmost_longitude'],
                           ds.attrs['southernmost_latitude'],
                           ds.attrs['easternmost_longitude'],
                           ds.attrs['northernmost_latitude']]
                          )

lons, lats = area_def.get_lonlats()

(ds_loc['lon']-lons).plot()
(ds_loc['lat']-lats).plot()

image image

ebolch commented 11 months ago

Yes, the geotransform and spatial ref are for the orthocorrected data.

When looking at the location group of the netcdf, the latitude and longitude values provided there are the pixel centers of the uncorrected, spatially raw (swath) data before gridding. The location group also includes a geometry lookup table (GLT), which is an orthocorrected product with a fixed pixel size on a WGS84 lat/lon grid. The geotransform and spatial_ref are for the GLT.

The GLT consists of 2 layers, glt_x, which contains the instrument crosstrack index from the swath dataset, and glt_y, which contains the instrument downtrack index. These allow you to use numpy indexing to place any of the L1 and L2 EMIT datasets (radiance, obs, reflectance, mask, mineralogy) into the GLT grid for the scene. This can be done with something like below, which is also included in the orthorectification notebook:

# Set filepath
fp =

# Open location group
loc = xr.open_dataset(fp, engine = 'h5netcdf', group='location')

# Create glt array
GLT_NODATA_VALUE=0
glt_array = np.nan_to_num(np.stack([loc['glt_x'].data,loc['glt_y'].data],axis=-1),nan=GLT_NODATA_VALUE).astype(int)
glt_array.shape

# Open Reflectance/Radiance
ds = xr.open_dataset(fp,engine = 'h5netcdf')
ds_array = ds['reflectance'].data
ds_array.shape

# Create Orthocrorrected Output Dataset
fill_value = -9999
out_ds = np.zeros((glt_array.shape[0], glt_array.shape[1], ds_array.shape[-1]), dtype=np.float32) + fill_value
valid_glt = np.all(glt_array != GLT_NODATA_VALUE, axis=-1)
# Adjust for One based Index
glt_array[valid_glt] -= 1 
# Use indexing/broadcasting to populate array cells with 0 values
out_ds[valid_glt, :] = ds_array[glt_array[valid_glt, 1], glt_array[valid_glt, 0], :]

There is also more information about the file structure in the user guide.

zxdawn commented 11 months ago

Thanks a lot for your explanations, Erik. It's clear to me now and pyresample's results are as same as manually calculated lon/lat:

import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
from pyresample.geometry import AreaDefinition

filename = 'EMIT_L1B_OBS_001_20230601T062832_2315205_005.nc'

ds = xr.open_dataset(filename)
ds_loc = xr.open_dataset(filename, group='location')

GT = ds.geotransform
dim_x = ds_loc['glt_x'].sizes['ortho_x']
dim_y = ds_loc['glt_x'].sizes['ortho_y']
lon = np.zeros(dim_x)
lat = np.zeros(dim_y)

for x in np.arange(dim_x):
    x_geo = (GT[0]+0.5*GT[1]) + x * GT[1]
    lon[x] = x_geo
for y in np.arange(dim_y):
    y_geo = (GT[3]+0.5*GT[5]) + y * GT[5]
    lat[y] = y_geo

area_def = AreaDefinition('id', 'description', 'proj_id',
                          ds.attrs['spatial_ref'], dim_x, dim_y,
                          [ds.attrs['westernmost_longitude'],
                           ds.attrs['southernmost_latitude'],
                           ds.attrs['easternmost_longitude'],
                           ds.attrs['northernmost_latitude']]
                          )

lons, lats = area_def.get_lonlats()

plt.imshow(lons - lon[None, :])
plt.colorbar()
plt.title('AreaDefinition lons - lon')

plt.imshow(lats - lat[:, None])
plt.colorbar()
plt.title('AreaDefinition lats - lat')

image

image