Closed aaronspring closed 5 years ago
With SubX the data will be remote.
Is the data hosted on an OpenDAP server?
I just took a look at the script above, and it appears to be reading the data remotely via xarray+openDAP.
Would it be useful and feasible to build a catalog for SubX?
Is it feasible? Yes, this is doable today. There was an issue about this on the intake-esm issue tracker: https://github.com/NCAR/intake-esm/issues/175
Here's an example of a catalog pointing to an OpenDAP server: http://haden.ldeo.columbia.edu/catalogs/hyrax_cmip6.json
Great. I will give this a try tomorrow. The JSON file looks like it describes CMIP6 data; hopefully its structure can be partly reused.
Any ideas on running a builder there?
I can't get to the individual nc files. I only get as far as http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.NCEP/.CFSv2/.forecast/.pr/dods
and cannot look any further from there. Does anyone have an idea?
import xarray as xr
url = 'http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.RSMAS/.CCSM4/.hindcast/.zg/dods'
remote_data = xr.open_dataarray(url, chunks={'S': 1, 'L': 1})
source: https://stackoverflow.com/questions/50240123/xarray-mean-of-data-stored-via-opendap
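Since each `dods` URL seems to open as one aggregated dataset, a builder may not need to crawl individual nc files at all; it could just enumerate endpoint URLs. Here is a minimal sketch of that idea, assuming the URL pattern seen in the two links above. The column names, the `build_rows` and `rows_to_csv` helpers, and the short model/variable lists are hypothetical placeholders, and none of the generated URLs are checked for existence.

```python
import csv
import io
from itertools import product

BASE = "http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX"

# (model, subdataset) pairs taken from the two URLs quoted in this thread;
# the full list would have to be filled in by hand.
MODELS = [("NCEP", "CFSv2"), ("RSMAS", "CCSM4")]
CASTS = ["hindcast", "forecast"]
VARIABLES = ["pr", "zg"]
COLUMNS = ["model", "subdataset", "cast", "variable", "path"]  # hypothetical names


def build_rows(models=MODELS, casts=CASTS, variables=VARIABLES):
    """Return one dict per candidate OpenDAP endpoint (existence is not checked)."""
    rows = []
    for (model, subdataset), cast, variable in product(models, casts, variables):
        rows.append({
            "model": model,
            "subdataset": subdataset,
            "cast": cast,
            "variable": variable,
            "path": f"{BASE}/.{model}/.{subdataset}/.{cast}/.{variable}/dods",
        })
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text, one endpoint per line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(build_rows()))
```

A table like this could be the starting point for an intake-esm style catalog, but the JSON descriptor that goes with it is not sketched here.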
The SubX output is already concatenated into a useful form with dims (S, L, M, X, Y). Sure, having the model included there would be nice, but the datasets are very heavy. I guess a simpler intake-xarray
YAML file would also do fine to start with.
plugins:
  source:
    - module: intake_xarray
sources:
  subX:
    description: SubX
    driver: opendap
    metadata:
      url_origin: http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/
    #cache:
    #  - argkey: urlpath
    #    regex: ''
    #    type: file
    parameters:
      model:
        description: model
        type: str
        default: NCEP
        allowed: [CESM, ECCC, EMC, ESRL, GMAO, NCEP, NRL, RSMAS]
      subdataset:
        description: subdataset
        type: str
        default: 30LCESM1
        allowed: [
          30LCESM1, 46LCESM1,  # CESM
          GEM, GEPS5, GEPS6,  # ECCC
          GEFS,  # EMC
          FIMr1p1,  # ESRL
          GEOS_V2p1,  # GMAO
          CFSv2,  # NCEP (from the URL quoted above)
          NESM,  # NRL
          CCSM4,  # RSMAS
        ]
      cast:
        description: hindcast or forecast
        type: str
        allowed: [hindcast, forecast]
      variable:
        description: variable name
        type: str
        default: ts
        allowed: [ts, zg, va, ua, tas, rlut, pr, hfls, hfss, huss, mrso, psl, rad, ROMI, snc, stx, sty, swe, tasmax, tasmin, uas, vas, wap]
    args:
      urlpath: "http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.{{model}}/.{{subdataset}}/.{{cast}}/.{{variable}}/dods"
      chunks: {'S': 1, 'L': 1}
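One thing the flat `allowed` lists cannot express is which subdataset belongs to which model. As far as I can tell each parameter is only checked against its own list, so a mismatched pair such as model NCEP with subdataset 30LCESM1 would pass and then point at a URL that does not exist. Below is a rough sketch of a check that could run before opening anything. The `SUBDATASETS` mapping follows the comments in the list above plus the NCEP/CFSv2 URL quoted earlier in the thread; `subx_url` is a hypothetical helper, not part of intake or intake-xarray.

```python
BASE = "http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX"

# Model -> subdatasets, following the comments in the catalog's allowed list.
# NCEP/CFSv2 is an addition taken from the URL quoted earlier in the thread.
SUBDATASETS = {
    "CESM": ["30LCESM1", "46LCESM1"],
    "ECCC": ["GEM", "GEPS5", "GEPS6"],
    "EMC": ["GEFS"],
    "ESRL": ["FIMr1p1"],
    "GMAO": ["GEOS_V2p1"],
    "NCEP": ["CFSv2"],
    "NRL": ["NESM"],
    "RSMAS": ["CCSM4"],
}
CASTS = ("hindcast", "forecast")


def subx_url(model, subdataset, cast, variable):
    """Build the OpenDAP URL, rejecting model/subdataset pairs that do not belong together."""
    if subdataset not in SUBDATASETS.get(model, ()):
        raise ValueError(f"{subdataset!r} is not a known subdataset of model {model!r}")
    if cast not in CASTS:
        raise ValueError(f"cast must be one of {CASTS}, got {cast!r}")
    # The variable name is passed through unchecked in this sketch.
    return f"{BASE}/.{model}/.{subdataset}/.{cast}/.{variable}/dods"


if __name__ == "__main__":
    print(subx_url("RSMAS", "CCSM4", "hindcast", "zg"))
```

The returned string can be handed to `xr.open_dataarray(url, chunks={'S': 1, 'L': 1})` as in the snippet further up.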
Would it be useful and feasible to build a catalog for SubX?
Or would it be more useful/easy to just build a catalog based on intake-xarray? With SubX the data will be remote.
https://github.com/kpegion/SubX/blob/master/Python/download_data/generate_ts_py_ens_files.ksh