leap-stc / data-management

Collection of code to manually populate the persistent cloud bucket with data
https://catalog.leap.columbia.edu/
Apache License 2.0

Soil moisture #17

Closed alexx-frcs closed 4 months ago

alexx-frcs commented 1 year ago

Dataset Name

CASM: A long-term Consistent Artificial-intelligence based Soil Moisture dataset based on machine learning and remote sensing

Dataset URL

https://zenodo.org/record/7072512#.ZFj3h-zMK3J

Description

The Consistent Artificial Intelligence (AI)-based Soil Moisture (CASM) dataset is a global, consistent, long-term remote sensing soil moisture (SM) dataset created using machine learning. It addresses the lack of data by using ML algorithms to extrapolate the record 13 years back in time.

Size

The dataset consists of files of 1.3 to 2.6 GB each, for a total size of 46.9 GB.

License

Creative Commons Attribution 4.0 International

Data Format

NetCDF

Data Format (other)

No response

Access protocol

HTTP(S)

Source File Organization

There is one file per year.

dimensions(sizes): date(62), lat(511), lon(1298)

variables(dimensions):
    float64 CASM_soil_moisture(date, lat, lon)
    float64 data_uncertainty(date, lat, lon)
    int64 date(date)
    float64 lat(lat)
    float64 lon(lon)
    float64 seasonal_cycle(date, lat, lon)
    float64 structural_uncertainty(date, lat, lon)
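As a rough sanity check, the dimension listing above can be turned into an uncompressed size estimate for one such file. The sketch below assumes 8 bytes per value (float64 and int64) and ignores any on-disk compression or metadata, so it is only an approximation. It comes out at about 1.3 GB, in line with the lower end of the per-file sizes reported in this issue.

```python
import math

# Dimension sizes copied from the listing in this issue.
dims = {"date": 62, "lat": 511, "lon": 1298}

# Variables and their dimensions, also from the listing.
variables = {
    "CASM_soil_moisture": ("date", "lat", "lon"),
    "data_uncertainty": ("date", "lat", "lon"),
    "seasonal_cycle": ("date", "lat", "lon"),
    "structural_uncertainty": ("date", "lat", "lon"),
    "date": ("date",),
    "lat": ("lat",),
    "lon": ("lon",),
}

ITEMSIZE = 8  # bytes per value; float64 and int64 are both 8 bytes


def uncompressed_nbytes(var_dims):
    """Number of bytes for one variable, assuming no compression."""
    return ITEMSIZE * math.prod(dims[d] for d in var_dims)


sizes = {name: uncompressed_nbytes(d) for name, d in variables.items()}
total_bytes = sum(sizes.values())

print(f"per 3D variable: {sizes['CASM_soil_moisture'] / 1e6:.0f} MB")
print(f"whole file:      {total_bytes / 1e9:.2f} GB")
```

Files with more entries along the date dimension would scale proportionally under the same assumptions.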

Example URLs

No response

Authorization

None

Transformation / Processing

No response

Target Format

Zarr

Comments

No response

jbusecke commented 1 year ago

Working on this in #22

jbusecke commented 1 year ago

You can access the dataset here:

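import xarray as xr

# Open the Zarr store lazily from the LEAP persistent bucket.
# Reading gs:// paths requires gcsfs; chunks={} returns dask-backed arrays.
ds = xr.open_dataset('gs://leap-persistent/data-library/casm-595733423-4997696883-1/CASM.zarr', engine='zarr', chunks={})
ds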
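For anyone without access to the bucket, here is a minimal sketch of how the opened dataset might be used. It builds a tiny synthetic stand-in with the variable and dimension names from the source file listing in this issue. The grid values, the random data, and the example coordinates are hypothetical, and the published Zarr store may name or decode its time axis differently.

```python
import numpy as np
import xarray as xr

# Tiny synthetic stand-in for the real store (hypothetical values only).
# Names follow the source file listing: date, lat, lon, CASM_soil_moisture.
rng = np.random.default_rng(0)
dates = np.arange(6)                 # integer date axis, as in the source files
lat = np.linspace(-60.0, 60.0, 5)    # hypothetical grid, not the real one
lon = np.linspace(-180.0, 180.0, 8)  # hypothetical grid, not the real one

ds = xr.Dataset(
    {"CASM_soil_moisture": (("date", "lat", "lon"), rng.random((6, 5, 8)))},
    coords={"date": dates, "lat": lat, "lon": lon},
)

# Average soil moisture over the date dimension, giving a (lat, lon) map.
mean_map = ds["CASM_soil_moisture"].mean("date")

# Time series at the grid cell nearest to an arbitrary example location.
point = ds["CASM_soil_moisture"].sel(lat=40.8, lon=-73.9, method="nearest")

print(mean_map.shape, point.shape)
```

The same `.mean` and `.sel` calls should work on the dataset opened from the bucket, provided the names match; there the computation stays lazy until `.compute()` or `.load()` is called.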

Closing this issue via #22

jbusecke commented 1 year ago

Reopening until this is in the LEAP Data catalog

jbusecke commented 4 months ago

Moved all logic to https://github.com/leap-stc/casm_feedstock as part of the ongoing refactor.