eurec4a / eurec4a-intake

Intake catalogue for EUREC4A field campaign datasets

EUREC4A Intake catalogue

This repository contains an intake catalogue for accessing data from the EUREC4A field campaign. The data are stored at: 1) AERIS, 2) Munich University (via OPeNDAP), 3) NOAA's National Centers for Environmental Information (via OPeNDAP) and 4) IPFS.

Usage

To use the catalogue you will need to install intake, xarray, intake-xarray, zarr, pydap, requests, s3fs and ipfsspec:

pip install "intake<2.0.0" xarray intake-xarray zarr pydap s3fs requests ipfsspec

Or, if you feel courageous (and want the newest updates), you can also install the requirements.txt directly:

pip install -r https://raw.githubusercontent.com/eurec4a/eurec4a-intake/master/requirements.txt

The catalogue (and underlying data) can then be accessed directly from Python:

>>> from intake import open_catalog
>>> cat = open_catalog("https://raw.githubusercontent.com/eurec4a/eurec4a-intake/master/catalog.yml")

You can list the available sources with:

>>> list(cat)
['radiosondes', 'barbados', 'dropsondes', 'halo', 'p3', 'specmacs']

>>> list(cat.radiosondes)
['atalante_meteomodem',
 'atalante_vaisala',
 'bco',
 'meteor',
 'ms_merian',
 'ronbrown']

You can then load a source as a dask-backed xarray.Dataset, which gives you access to all the variables and attributes in the dataset:

>>> ds = cat.radiosondes.ronbrown.to_dask()
>>> ds
<xarray.Dataset>
Dimensions:      (alt: 3100, nv: 2, sounding: 329)
Coordinates:
  * alt          (alt) int16 0 10 20 30 40 50 ... 30950 30960 30970 30980 30990
    flight_time  (sounding, alt) datetime64[ns] dask.array<chunksize=(83, 775), meta=np.ndarray>
    lat          (sounding, alt) float32 dask.array<chunksize=(83, 1550), meta=np.ndarray>
    lon          (sounding, alt) float32 dask.array<chunksize=(83, 1550), meta=np.ndarray>
    sounding_id  (sounding) |S1000 dask.array<chunksize=(165,), meta=np.ndarray>
Dimensions without coordinates: nv, sounding
Data variables:
    N_gps        (sounding, alt) float32 dask.array<chunksize=(83, 1550), meta=np.ndarray>
    N_ptu        (sounding, alt) float32 dask.array<chunksize=(83, 1550), meta=np.ndarray>
    alt_bnds     (alt, nv) int16 dask.array<chunksize=(3100, 2), meta=np.ndarray>
...

You can then slice and access the data as if you had it available locally.
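As a rough illustration of what such slicing might look like, here is a minimal sketch. It uses a small synthetic dataset as a stand-in for `ds` so that it runs without network access. The dimension names `sounding` and `alt` are taken from the output above; the variable name `ta` and all values are hypothetical and not part of the catalogue.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the dataset returned by `.to_dask()`.
# `ta` is a hypothetical variable name; only the dimension names
# (`sounding`, `alt`) follow the printed dataset above.
alt = np.arange(0, 100, 10, dtype="int16")  # 10 levels: 0, 10, ..., 90
ds = xr.Dataset(
    {"ta": (("sounding", "alt"), np.random.rand(4, alt.size).astype("float32"))},
    coords={"alt": alt},
)

# Label-based selection on the `alt` coordinate (both ends are inclusive).
low = ds.sel(alt=slice(0, 40))

# Position-based selection on `sounding`, which has no coordinate values.
first_two = low.isel(sounding=slice(0, 2))

# With a dask-backed dataset, values are only fetched and computed here.
profile_mean = first_two["ta"].mean(dim="sounding").compute()
print(profile_mean.shape)
```

With the real dataset, the same `sel`/`isel` calls should apply unchanged, and selecting a subset before calling `.compute()` keeps the amount of data transferred small.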

Contributing

Please have a look at our contribution guide.