CS-SI / eodag

Earth Observation Data Access Gateway
https://eodag.readthedocs.io
Apache License 2.0
328 stars 45 forks source link

update tutorials using eodag-cube #1385

Open sbrunato opened 2 weeks ago

sbrunato commented 2 weeks ago

Update the following tutorials, using eodag-cube:

Also use cartopy , like on this example that plots Sea ice area fraction from an dedt_lumi product type on an European cropped area:

import cartopy.crs as crs
import cartopy.feature as cfeature

# europe crop
mask_lon = (ds.longitude >= 348) | (ds.longitude <= 48)
mask_lat = (ds.latitude >= 32) & (ds.latitude <= 72)
ds_eu = ds.where(mask_lon & mask_lat, drop=True)

fig = plt.figure(figsize=(8,6))

ax = fig.add_subplot(1,1,1, projection=crs.Mercator())

ax.add_feature(cfeature.COASTLINE)
ax.gridlines(draw_labels=True)

sc = plt.scatter(
    x=ds_eu.longitude[::5], y=ds_eu.latitude.data[::5], c=ds_eu.siconc[::5], 
    cmap="cool", s=1, transform=crs.PlateCarree()
)
plt.colorbar(sc, label=ds.siconc.long_name)

plt.show()

image

amarandon commented 2 weeks ago

@sbrunato I started looking at the cds tutorial and one potential issue I can see with replacing the call to .download by a call to eodag-cube's .get_data is that it requires a band argument which in the case of grib data is not known until we download the data and look at the file names.

Unless there's a way to know it in advance and I'm not aware of it?

amarandon commented 1 week ago

@sbrunato I started looking at the cds tutorial and one potential issue I can see with replacing the call to .download by a call to eodag-cube's .get_data is that it requires a band argument which in the case of grib data is not known until we download the data and look at the file names.

Unless there's a way to know it in advance and I'm not aware of it?

As per discussion with @sbrunato, let's use a regular expression similar to .*\.grib$ to match the single grib file that will have been downloaded.

amarandon commented 5 days ago

While plotting the data from the cds tutorial I noticed that something is weird with what is returned by get_data:

import cartopy.crs as crs
import cartopy.feature as cfeature

# europe crop
mask_lon = (ds.x >= -12) & (ds.x <= 48)
mask_lat = (ds.y >= 32) & (ds.y <= 72)
ds_eu = ds.where(mask_lon & mask_lat, drop=True)

fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1,1,1, projection=crs.PlateCarree())

ds_eu.plot(ax=ax, transform=crs.PlateCarree())

ax.add_feature(cfeature.COASTLINE)
ax.gridlines(draw_labels=True)

image

This is supposed to be temperature so clearly something is off, these values can't be correct whether they'd be celcius or kelvin.

Here is a script that highlights the issue. First we look at the DataArray returned by get_data, then we look at the dataset using xarray directly:

import xarray as xr
from eodag import EODataAccessGateway

dag = EODataAccessGateway()

search_results = dag.search(
    provider="cop_ads",
    productType="CAMS_EAC4",
    start="2021-01-01",
    end="2021-01-02",
    variable="temperature",
    model_level="1",
    time="00:00",
    format="grib",
)

product = search_results[0]
da = product.get_data(band=r"\.grib$")

print("### Object returned by get_data ###")
print(f"Object type: {type(da)}")
print(f"Min value: {da.data.min()}")
print(f"Max value: {da.data.max()}")
print(f"GRIB_UNIT: {da.attrs['GRIB_UNIT']}")
print(f"long_name: {da.attrs['long_name']}")
print(f"GRIB_COMMENT': {da.attrs['GRIB_COMMENT']}")

print()
print("### Object returned by xarray.open_dataset ###")
ds = xr.open_dataset("/tmp/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de.grib")
da = ds.t
print(f"Object type: {type(da)}")
print(f"Min value: {da.data.min()}")
print(f"Max value: {da.data.max()}")
print(f"GRIB_units: {da.attrs['GRIB_units']}")
print(f"long_name of variable t: {da.attrs['long_name']}")
print(f"long_name of dimension hybrid: {ds.hybrid.attrs['long_name']}")

And here is the output:

### Object returned by get_data ###
Object type: <class 'xarray.core.dataarray.DataArray'>
Min value: -55.20513000488279
Max value: -7.723486328124977
GRIB_UNIT: [C]
long_name: 1[-] HYBL="Hybrid level"
GRIB_COMMENT': Temperature [C]

### Object returned by xarray.open_dataset ###
Object type: <class 'xarray.core.dataarray.DataArray'>
Min value: 217.9448699951172
Max value: 265.426513671875
GRIB_units: K
long_name of variable t: Temperature
long_name of dimension hybrid: hybrid level

If understand correctly what we're trying to do, the DataArray returned by .get_data should be the same as the variable t of the dataset returned by xarray.open_dataset. But we can see that something is wrong, as if .get_data has mixed up several elements together.

If I plot the dataset returned by xarray called directly, it looks exactly the same though:

import xarray as xr
ds = xr.open_dataset("/tmp/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de.grib")

def normalize_longitude(ds):
    return ds.assign_coords(longitude=(((ds.longitude + 180) % 360) - 180)).sortby('longitude')

ds = normalize_longitude(ds)

fig = plt.figure(figsize=(8,6))

ax = fig.add_subplot(1,1,1, projection=crs.PlateCarree())

ds.t[0].sel(latitude=slice(72, 32), longitude=slice(-12, 48)).plot(ax=ax, transform=crs.PlateCarree())

ax.add_feature(cfeature.COASTLINE)
ax.gridlines(draw_labels=True)

image

So it seems that the metadata are a bit messed up and that there's an offset applied to the values, but they seem to represent the same thing.

Checking the difference between both arrays:

>>> da.values - ds_eu.values
array([[273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
       [273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
       [273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
       ...,
       [273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
       [273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
       [273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15]])

The difference happens to be the kelvin offset!

EDIT: OK I understand why I get weird values, this because we use the variable temperature which is not the ground temperature. We should use 2m_temperature instead to get values which make more sense to an average human. There are still issues with the metadata but at least we can plot a map which looks as expected.

amarandon commented 4 days ago

Updating meteoblue tutorial is blocked by https://github.com/CS-SI/eodag/issues/1413

amarandon commented 3 days ago

@amarandon You should use eodag-cube's built-in cropping feature instead of doing it yourself.

Done :heavy_check_mark: