OpenDrift / opendrift

Open source framework for ocean trajectory modelling
https://opendrift.github.io
GNU General Public License v2.0

Grib2 files #425

Closed Nichsouz closed 3 years ago

Nichsouz commented 3 years ago

Dear support,

knutfrode commented 3 years ago

Hi,

The GRIB-reader is quite primitive and inefficient, based on pygrib. I would like to make a new GRIB reader based on Xarray/cfgrib, which looks quite promising: http://xarray.pydata.org/en/stable/examples/ERA5-GRIB-example.html

Anyway, it is often more convenient to fetch data directly through Thredds than to download files. There are some Thredds servers providing various variants of GFS, but since there are GRIB files and not netCDF files behind them, the variables are not mapped to a CF standard_name. It is possible to work around this with the netCDF-CF reader like this:

from opendrift.readers.reader_netCDF_CF_generic import Reader

r = Reader('https://thredds-jumbo.unidata.ucar.edu/thredds/dodsC/grib/NCEP/GFS/Global_0p5deg_ana/TP',
           standard_name_mapping={'u-component_of_wind_altitude_above_msl': 'x_wind',
                                  'v-component_of_wind_altitude_above_msl': 'y_wind'})

Could that be a workaround in the meantime? I have not tested this much, so not sure it is completely safe.

Nichsouz commented 3 years ago

Thanks for the fast response.

However, the Thredds datasets I could find are relatively recent. I need older data, such as GFS data from 2019 and early 2020. Is there an archive where I can find this data?

knutfrode commented 3 years ago

I am not very familiar with NCEP thredds servers, but you could have a look here if you find something useful: https://thredds-jumbo.unidata.ucar.edu/thredds/catalog.html https://www.ncei.noaa.gov/thredds/model/model.html

In some places you find only individual files, but Thredds aggregates, with a single URL for the whole dataset, are much more convenient.

Nichsouz commented 3 years ago

I am still struggling to load NCEP data into OpenDrift, or any other data for the Mediterranean region in 2019. Apparently, the .grb2 data contains different keywords from those expected by OpenDrift. I've been trying to load wind and temperature from the NCEP Thredds server you previously mentioned. I opened a file in HDFViewer and noticed that the wind parameters are called "UGRD_10maboveground" and V... However, in the .nc files in the examples, wind is named just "u" and "v". I believe this could be the cause of my issue. I understand this would oblige me to manually load a file for every 6 hours of my time series. However, all my attempts to load a single file have failed. The files are recognised by the o.add_readers_from_list command, but when I configure (o.seed) and run, the data is discarded. The r = Reader approach you suggested has not worked in any attempt. The run script warns me that x_wind and y_wind are missing, although I used the Thredds server.

Please, give me some light.

knutfrode commented 3 years ago

The variable names do not matter, as these are ad hoc and named differently by each provider/producer. It is the standard_name attribute which is fixed by the CF convention, and which OpenDrift (and other CF-compliant software) detects. The problem with these GRIB files is that they are not CF-compliant, which is why you need to apply the name mapping as suggested above.
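To illustrate the idea, here is a minimal sketch in plain Python of how a variable can be identified by its standard_name attribute, with a name mapping as a fallback for files that lack it. This is not OpenDrift's actual code: the function `resolve_standard_names`, the example attribute dictionaries, and the precedence (attribute first, mapping second) are all assumptions made for illustration.

```python
def resolve_standard_names(variables, standard_name_mapping=None):
    """Return {variable name: CF standard_name or None}.

    `variables` maps each variable name to its attribute dictionary.
    Hypothetical helper for illustration, not part of OpenDrift.
    """
    mapping = standard_name_mapping or {}
    resolved = {}
    for var_name, attrs in variables.items():
        # Assumed precedence: use the CF attribute when present,
        # otherwise fall back to the user-supplied mapping.
        resolved[var_name] = attrs.get('standard_name', mapping.get(var_name))
    return resolved


# A CF-compliant variable: its name ("u") is arbitrary, the attribute identifies it.
cf_file = {'u': {'standard_name': 'x_wind'}}

# A GRIB-derived variable without standard_name (made-up attributes).
grib_file = {'u-component_of_wind_planetary_boundary': {'units': 'm/s'}}

mapping = {'u-component_of_wind_planetary_boundary': 'x_wind'}

cf_resolved = resolve_standard_names(cf_file)
unmapped = resolve_standard_names(grib_file)
mapped = resolve_standard_names(grib_file, mapping)
```

Without the mapping, the GRIB-derived variable resolves to nothing, which corresponds to the "x_wind and y_wind are missing" warning described above.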

If you post the URL of the dataset you are trying to use, I can have a look at it.

Nichsouz commented 3 years ago

I would like to work from Aug-04, 2019 to Aug-10, 2019. I need the data from https://www.ncei.noaa.gov/thredds/catalog/model-gfs-003-files-old/201908/catalog.html.

Thanks in advance!

knutfrode commented 3 years ago

In these files lon/lat are stored in arrays simply named lon and lat, without any metadata information. However, after the latest update a few minutes ago, reader_netCDF_CF_generic will now detect and use these coordinates.

You still need to apply standard_name_mapping, since standard_name is not defined for any of the variables, e.g.:

from opendrift.readers import reader_netCDF_CF_generic

# Map the non-CF GRIB variable names to the CF standard names OpenDrift expects
r = reader_netCDF_CF_generic.Reader(
    'https://www.ncei.noaa.gov/thredds/dodsC/model-gfs-003-files-old/201908/20190829/gfs_3_20190829_1800_378.grb2',
    standard_name_mapping={'u-component_of_wind_planetary_boundary': 'x_wind',
                           'v-component_of_wind_planetary_boundary': 'y_wind'})

Since these files contain only a single time step, they cannot be used for a simulation at any other time, unless you explicitly declare that the reader shall be valid "always":

r.always_valid = True
r.buffer_size = 1000

Unfortunately there seems to be no aggregate available (where multiple files/times are merged together into a time series behind a single URL). The same mapping method (but with different variable names/mappings) can probably now also be applied to other GRIB data available on Thredds.
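Without an aggregate, the individual file URLs have to be built one by one. A rough sketch of how that could be scripted is shown below. The helper `gfs_file_urls` is hypothetical, and the URL pattern is inferred from the single example URL above. It is an assumption that files with forecast hour 000 exist every 6 hours and follow the same naming, so the generated URLs should be checked against the catalog before use.

```python
from datetime import datetime, timedelta

# Base path taken from the example URL above.
BASE = 'https://www.ncei.noaa.gov/thredds/dodsC/model-gfs-003-files-old'


def gfs_file_urls(start, end, step_hours=6, forecast_hour=0):
    """List one OPeNDAP URL per model run between start and end (inclusive).

    Hypothetical helper; the naming pattern is inferred from a single
    example file and has not been verified against the server.
    """
    urls = []
    t = start
    while t <= end:
        urls.append(
            f'{BASE}/{t:%Y%m}/{t:%Y%m%d}/'
            f'gfs_3_{t:%Y%m%d}_{t:%H%M}_{forecast_hour:03d}.grb2')
        t += timedelta(hours=step_hours)
    return urls


# The period requested above: 4 to 10 August 2019, one file every 6 hours.
urls = gfs_file_urls(datetime(2019, 8, 4), datetime(2019, 8, 10, 18))

# Reproduces the example URL when given its run time and forecast hour.
example = gfs_file_urls(datetime(2019, 8, 29, 18), datetime(2019, 8, 29, 18),
                        forecast_hour=378)
```

Each URL could then be passed to the Reader with the same standard_name_mapping as above. Since every file holds a single time step, I have not checked how well OpenDrift combines such readers into a continuous time series.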

gauteh commented 3 years ago

Closing, re-open if necessary.