Open sbrunato opened 2 weeks ago
@sbrunato I started looking at the cds tutorial and one potential issue I can see with replacing the call to .download
by a call to eodag-cube's .get_data
is that it requires a band
argument which in the case of grib data is not known until we download the data and look at the file names.
Unless there's a way to know it in advance and I'm not aware of it?
@sbrunato I started looking at the cds tutorial and one potential issue I can see with replacing the call to
.download
by a call to eodag-cube's.get_data
is that it requires aband
argument which in the case of grib data is not known until we download the data and look at the file names.Unless there's a way to know it in advance and I'm not aware of it?
As per discussion with @sbrunato, let's use a regular expression similar to .*\.grib$
to match the single grib file that will have been downloaded.
While plotting the data from the cds tutorial I noticed that something is weird with what is returned by get_data
:
import cartopy.crs as crs
import cartopy.feature as cfeature
# europe crop
mask_lon = (ds.x >= -12) & (ds.x <= 48)
mask_lat = (ds.y >= 32) & (ds.y <= 72)
ds_eu = ds.where(mask_lon & mask_lat, drop=True)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1,1,1, projection=crs.PlateCarree())
ds_eu.plot(ax=ax, transform=crs.PlateCarree())
ax.add_feature(cfeature.COASTLINE)
ax.gridlines(draw_labels=True)
This is supposed to be temperature so clearly something is off, these values can't be correct whether they'd be celcius or kelvin.
Here is a script that highlights the issue. First we look at the DataArray returned by get_data
, then we look at the dataset using xarray directly:
import xarray as xr
from eodag import EODataAccessGateway
dag = EODataAccessGateway()
search_results = dag.search(
provider="cop_ads",
productType="CAMS_EAC4",
start="2021-01-01",
end="2021-01-02",
variable="temperature",
model_level="1",
time="00:00",
format="grib",
)
product = search_results[0]
da = product.get_data(band=r"\.grib$")
print("### Object returned by get_data ###")
print(f"Object type: {type(da)}")
print(f"Min value: {da.data.min()}")
print(f"Max value: {da.data.max()}")
print(f"GRIB_UNIT: {da.attrs['GRIB_UNIT']}")
print(f"long_name: {da.attrs['long_name']}")
print(f"GRIB_COMMENT': {da.attrs['GRIB_COMMENT']}")
print()
print("### Object returned by xarray.open_dataset ###")
ds = xr.open_dataset("/tmp/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de.grib")
da = ds.t
print(f"Object type: {type(da)}")
print(f"Min value: {da.data.min()}")
print(f"Max value: {da.data.max()}")
print(f"GRIB_units: {da.attrs['GRIB_units']}")
print(f"long_name of variable t: {da.attrs['long_name']}")
print(f"long_name of dimension hybrid: {ds.hybrid.attrs['long_name']}")
And here is the output:
### Object returned by get_data ###
Object type: <class 'xarray.core.dataarray.DataArray'>
Min value: -55.20513000488279
Max value: -7.723486328124977
GRIB_UNIT: [C]
long_name: 1[-] HYBL="Hybrid level"
GRIB_COMMENT': Temperature [C]
### Object returned by xarray.open_dataset ###
Object type: <class 'xarray.core.dataarray.DataArray'>
Min value: 217.9448699951172
Max value: 265.426513671875
GRIB_units: K
long_name of variable t: Temperature
long_name of dimension hybrid: hybrid level
If understand correctly what we're trying to do, the DataArray returned by .get_data
should be the same as the variable t
of the dataset returned by xarray.open_dataset
. But we can see that something is wrong, as if .get_data
has mixed up several elements together.
If I plot the dataset returned by xarray called directly, it looks exactly the same though:
import xarray as xr
ds = xr.open_dataset("/tmp/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de/CAMS_EAC4_20210101_20210102_4d792734017419d1719b53f4d5b5d4d6888641de.grib")
def normalize_longitude(ds):
return ds.assign_coords(longitude=(((ds.longitude + 180) % 360) - 180)).sortby('longitude')
ds = normalize_longitude(ds)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1,1,1, projection=crs.PlateCarree())
ds.t[0].sel(latitude=slice(72, 32), longitude=slice(-12, 48)).plot(ax=ax, transform=crs.PlateCarree())
ax.add_feature(cfeature.COASTLINE)
ax.gridlines(draw_labels=True)
So it seems that the metadata are a bit messed up and that there's an offset applied to the values, but they seem to represent the same thing.
Checking the difference between both arrays:
>>> da.values - ds_eu.values
array([[273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
[273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
[273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
...,
[273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
[273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15],
[273.15, 273.15, 273.15, ..., 273.15, 273.15, 273.15]])
The difference happens to be the kelvin offset!
EDIT: OK I understand why I get weird values, this because we use the variable temperature
which is not the ground temperature. We should use 2m_temperature
instead to get values which make more sense to an average human. There are still issues with the metadata but at least we can plot a map which looks as expected.
Updating meteoblue tutorial is blocked by https://github.com/CS-SI/eodag/issues/1413
@amarandon You should use eodag-cube's built-in cropping feature instead of doing it yourself.
Done :heavy_check_mark:
Update the following tutorials, using
eodag-cube
:Also use
cartopy
, like on this example that plotsSea ice area fraction
from andedt_lumi
product type on an European cropped area: