Open AmyMacFadyen opened 2 years ago
I am getting the error too. Here is the dump for reference; I'm looking into it:
(model_catalogs) kthyng@adams LibGOODS % python libgoods/scripts/examples/gfs-1deg/fetch.py
Setting up source catalog
Source catalog generated in 3.1 s
Generating catalog specific for GFS-1DEG forecast
Specific catalog generated in 15.1 ms
Getting xarray dataset for model data
/Users/kthyng/miniconda3/envs/model_catalogs/lib/python3.8/site-packages/xarray/core/indexes.py:150: FutureWarning: Value based partial slicing on non-monotonic DatetimeIndexes with non-existing keys is deprecated and will raise a KeyError in a future Version.
indexer = index.slice_indexer(
Created dask-based xarray dataset in 18.6 s
Traceback (most recent call last):
File "libgoods/scripts/examples/gfs-1deg/fetch.py", line 46, in <module>
sys.exit(main())
File "libgoods/scripts/examples/gfs-1deg/fetch.py", line 41, in main
fetch(config)
File "/Users/kthyng/projects/model_catalogs/model_catalogs/examples.py", line 141, in fetch
ds = source.to_dask()
File "/Users/kthyng/projects/model_catalogs/model_catalogs/process.py", line 69, in to_dask
self._ds = self._ds.cf.sel(
File "/Users/kthyng/miniconda3/envs/model_catalogs/lib/python3.8/site-packages/cf_xarray/accessor.py", line 598, in wrapper
result = final_func(*posargs, **arguments)
File "/Users/kthyng/miniconda3/envs/model_catalogs/lib/python3.8/site-packages/xarray/core/dataset.py", line 2533, in sel
query_results = map_index_queries(
File "/Users/kthyng/miniconda3/envs/model_catalogs/lib/python3.8/site-packages/xarray/core/indexing.py", line 183, in map_index_queries
results.append(index.sel(labels, **options)) # type: ignore[call-arg]
File "/Users/kthyng/miniconda3/envs/model_catalogs/lib/python3.8/site-packages/xarray/core/indexes.py", line 377, in sel
indexer = _query_slice(self.index, label, coord_name, method, tolerance)
File "/Users/kthyng/miniconda3/envs/model_catalogs/lib/python3.8/site-packages/xarray/core/indexes.py", line 158, in _query_slice
raise KeyError(
KeyError: "cannot represent labeled-based slice indexer for coordinate 'time3' with a slice over integer positions; the index is unsorted or non-unique"
This should be fixed by https://github.com/NOAA-ORR-ERD/model_catalogs/pull/17, but I need to wait until it is fully merged to test.
@AmyMacFadyen GFS model output has a ton of variables, and it looks like different time variables are non-monotonic ... at different times. I don't know too much about it, but I thought I'd check whether you actually use many of the variables. If not, dropping some of them might help keep the read-in consistent. Which variables do you use?
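The KeyError above reports that the index is "unsorted or non-unique". A minimal sketch for telling those two cases apart, in plain Python with made-up timestamps. The helper names `diagnose_index` and `sorted_unique` are hypothetical and not part of model_catalogs or xarray; on a real pandas index the corresponding properties are `is_monotonic_increasing` and `is_unique`.

```python
from datetime import datetime, timedelta


def diagnose_index(values):
    """Report whether a sequence of labels is sorted and free of duplicates.

    Hypothetical helper for illustration; not part of model_catalogs.
    """
    values = list(values)
    # Non-decreasing order: every label is >= the one before it.
    is_sorted = all(a <= b for a, b in zip(values, values[1:]))
    # Unique: no label appears more than once.
    is_unique = len(set(values)) == len(values)
    return {"sorted": is_sorted, "unique": is_unique}


def sorted_unique(values):
    """Return the labels in ascending order with duplicates dropped."""
    return sorted(set(values))


# Made-up 3-hourly timestamps with one repeated and one out-of-order entry.
t0 = datetime(2022, 8, 1)
times = [t0 + timedelta(hours=h) for h in (0, 3, 6, 6, 12, 9)]

report = diagnose_index(times)
cleaned = sorted_unique(times)
```

This only cleans the labels. With real model output you would also have to decide which record to keep for a duplicated time, which this sketch does not address.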
Yeah, that one is weird. All we really want is wind at 10 m. In GOODS I was accessing this from u-component_of_wind_height_above_ground (and v...)
It looks like I've had trouble getting the right coordinate variables in the past so instead of hard coding them, I was looking at the u/v variables and then figuring out which time and height_above_ground variables to grab since there are multiples.
At present it looks like its coordinates are time1 and height_above_ground2.
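The lookup described above (read the wind variable's coordinates and pick the matching time and height names rather than hard-coding them) could be sketched roughly as follows. The function name and the prefix matching are assumptions for illustration, and the sample string is a made-up value in the style of a CF `coordinates` attribute, not read from a real file.

```python
def find_coord(coordinates_attr, prefix):
    """Return the one name in a space-separated `coordinates` attribute
    that starts with `prefix`.

    Hypothetical helper; raises if there is no match or more than one.
    """
    matches = [name for name in coordinates_attr.split() if name.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(
            f"expected one coordinate starting with {prefix!r}, got {matches}"
        )
    return matches[0]


# Illustrative attribute string for a wind variable.
coords = "reftime time1 height_above_ground2 lat lon "

time_name = find_coord(coords, "time")
height_name = find_coord(coords, "height_above_ground")
```

Raising on zero or multiple matches is a deliberate choice here: if the upstream file changes its coordinate naming, the failure is loud instead of silently picking the wrong axis.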
Note that we don't need to keep the 3D aspect here -- so just the 0th index in the depth (height) dimension will yield the 10 m wind.
We do use air temp in GNOME, but at present it's just a scalar input and not read from the met model. For future compatibility we should probably grab it. Unfortunately, it looks like this is on a different vertical grid and available at 2 m rather than 10 m. But let's add it regardless, unless the different vertical grids make it too hard (you can collapse them all to the zeroth index and just ignore the difference).
A fix on the libgoods side here: https://github.com/NOAA-ORR-ERD/LibGOODS/pull/41
I have a fix in my present branch for model_catalogs for the rest to finish this, but haven't merged the branch yet.
This should work now with the newest versions of model_catalogs and LibGOODS. @AmyMacFadyen could you give it a try?
Yes, will do. The main branch of LibGOODS looks like it still contains model catalogs -- I thought that was deleted? Is there a different branch I should try?
Ah, sorry, you are right. Best to wait until I have done updates to that, probably tomorrow, just so everything is clear. Thanks.
@kthyng This does appear to be fixed now. The only thing I'd like to check on is the option for extracting the "surface" -- I think Luke implemented that originally in examples.py, not sure where it exists now. But it seems to be based on getting the level closest to 0 in the vertical. For wind, we want the 10 m wind for the surface-only option. I think what's there may just work for GFS since 10 m is the lowest atmospheric level output, but it's worth thinking about to make sure we're doing it right and that we have a flexible UI that in future could allow selection of model output at a particular user-specified depth.
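One way to make the "surface" choice explicit, and to leave room for a user-specified level, is to select the vertical index nearest a target value. This is a minimal sketch in plain Python with a hypothetical function name and made-up level values; it is not the code in model_catalogs.

```python
def nearest_level_index(levels, target=0.0):
    """Return the index of the vertical level closest to `target`.

    Hypothetical helper for illustration. With target=0.0 this picks the
    level nearest the surface; ties resolve to the first such level.
    """
    if not levels:
        raise ValueError("levels must not be empty")
    return min(range(len(levels)), key=lambda i: abs(levels[i] - target))


# Made-up heights above ground in metres.
heights = [10.0, 80.0, 100.0]

surface_idx = nearest_level_index(heights)               # nearest to 0 m
wind_idx = nearest_level_index(heights, target=10.0)     # explicitly the 10 m level
upper_idx = nearest_level_index(heights, target=90.0)    # tie between 80 m and 100 m
```

When 10 m is the lowest level, the default and the explicit 10 m request return the same index, which is consistent with the observation that the nearest-to-zero approach may just work for GFS. A dataset with several near-surface levels would need the explicit target.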
With the latest (as of 2022-08-24), the --surface option works correctly for GFS as well now.
$ fetch-model GFS-1DEG -t forecast -n eastward_wind,northward_wind --surface -s 2022-08-01 -e 2022-08-02 -f
Setting up source catalog
Source catalog generated in 453.9 ms
Generating catalog specific for GFS-1DEG forecast
Specific catalog generated in 21.3 ms
Getting xarray dataset for model data
Created dask-based xarray dataset in 1160.1 ms
Selecting only surface data.
Indexed surface data in 20.2 ms
Subsetting data
Subsetted dataset in 7.4 ms
Writing netCDF data to output/GFS-1DEG_forecast_20220801-20220802.nc. This may take a long time...
Wrote output to disk in 2.1 s
Complete
import netCDF4 as nc4
nc = nc4.Dataset('/home/luke/code/LibGOODS/output/GFS-1DEG_forecast_20220801-20220802.nc')
nc['u-component_of_wind_height_above_ground'].coordinates
--> 'reftime time height_above_ground2 lat lon '
nc['height_above_ground2'][:]
--> masked_array(data=10.,
mask=False,
fill_value=1e+20,
dtype=float32)
nc['height_above_ground2'].units
--> 'm'
We use cf-xarray to find the surface that is nearest to Z=0, for the most part. If the dataset is somewhat to mostly CF compliant, it works very well for this purpose. We've also improved the metadata internally for several offerings within model_catalogs so that tools like cf-xarray work well with the data.
Highlights:
I tried running the GFS example and it fails with this error:
KeyError: "cannot represent labeled-based slice indexer for coordinate 'time3' with a slice over integer positions; the index is unsorted or non-unique"