rubisco-sfa / ILAMB-Data

A collection of scripts used to format ILAMB data and community portal to make contributions

Questions about contributing to the ILAMB-Data collection #60

rhaegar325 closed 3 months ago

rhaegar325 commented 4 months ago

Hi, team:

We would like to contribute to the ILAMB-Data collection and would like to know what kinds of datasets are accepted:

  1. Spatial and temporal resolution: is there a minimum or maximum spatial or temporal resolution?
  2. Point vs. gridded datasets: can we contribute both, or only gridded datasets?
  3. Spatial extent: are there any requirements on the overall spatial extent of the dataset? Can it cover any region, or is there a minimum size? We would like to know whether Australia-only data can be included in ILAMB-Data.
  4. Is it possible to add urban-specific datasets to ILAMB-Data?

msteckle commented 3 months ago

Great to hear! See this tutorial for some guidance: format data tutorial. Please also refer to recent scripts to see how we pre-process data, such as these gridded datasets (convert GFW, convert FLUXCOM) or this point dataset (convert Ameriflux).

  1. We spatially resample gridded data to 0.5 degree resolution (EPSG:4326), which for global bounds gives a NetCDF grid of 720 x 360 cells. There are no time constraints, but the applicable time bounds (e.g., 1950 to 2020) must be specified in the NetCDF file you create.
  2. ILAMB can handle point and gridded datasets. ILAMB will sample the model grid at each point for comparison. See the Ameriflux example I linked above.
  3. There is no spatial extent requirement for the benchmark dataset. For example, this page shows that the bias and bias scores are spatially constrained to the extent of the benchmark data: https://www.ilamb.org/CMIP5v6/historical/EcosystemandCarbonCycle/Biomass/Tropical/Tropical.html
  4. Yes, if there is an urban-specific variable from the benchmark dataset that corresponds to an urban-specific variable in the land model you're comparing to, then you can do so.
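
To make points 1 and 2 concrete, here is a minimal sketch of a global 0.5 degree grid and of looking up the cell that contains a point site. This is not ILAMB's own code: the helper names (`cell_centers`, `containing_cell`), the -180 to 180 longitude convention, and the example site are assumptions for illustration only.

```python
import math

RES = 0.5  # grid spacing in degrees, as described in point 1


def cell_centers(start, stop, res=RES):
    """Cell-center coordinates for cells tiling the interval [start, stop)."""
    n = round((stop - start) / res)
    return [start + (i + 0.5) * res for i in range(n)]


# Assumed convention: latitude south to north, longitude -180 to 180.
lat = cell_centers(-90.0, 90.0)    # 360 centers, -89.75 ... 89.75
lon = cell_centers(-180.0, 180.0)  # 720 centers, -179.75 ... 179.75


def containing_cell(point_lat, point_lon, res=RES):
    """Return (row, col) of the grid cell that contains a point."""
    # Clamp the row so that a point at exactly 90 N falls in the last row.
    row = min(int(math.floor((point_lat + 90.0) / res)), len(lat) - 1)
    # Wrap the column so that a point at exactly 180 E maps to column 0.
    col = int(math.floor((point_lon + 180.0) / res)) % len(lon)
    return row, col


# Hypothetical point site near Sydney, Australia.
row, col = containing_cell(-33.87, 151.21)
print(len(lon), "x", len(lat), "cells; site falls in cell centered at",
      lat[row], lon[col])
```

A regional dataset (for example, Australia only) would simply populate a subset of these cells, which is consistent with point 3.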

In general, be sure to format your NetCDF file using the CF Conventions and, if possible, name variables according to the CMIP6 variables grouped by MIP table or the list of all accepted CMIP6 variables.
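
As a rough illustration of specifying time bounds in a CF-style encoding, the sketch below computes monthly `time` values and matching `time_bnds` pairs as days since a reference date, using only the standard library. The reference date (1850-01-01), the monthly frequency, and the function name `monthly_time_bounds` are assumptions for this example, not ILAMB requirements; the 1950 to 2020 range is the one mentioned above. Writing these arrays to a NetCDF file (for instance with xarray or netCDF4) is left out.

```python
from datetime import date

REF = date(1850, 1, 1)  # assumed reference date, chosen for illustration
UNITS = f"days since {REF.isoformat()}"  # CF-style units string for time


def monthly_time_bounds(first_year, last_year):
    """Return (time, time_bnds) in days since REF, one entry per month."""
    time, bounds = [], []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            start = date(year, month, 1)
            # First day of the following month closes the interval.
            if month == 12:
                end = date(year + 1, 1, 1)
            else:
                end = date(year, month + 1, 1)
            lo, hi = (start - REF).days, (end - REF).days
            bounds.append((lo, hi))
            time.append((lo + hi) / 2)  # midpoint of the month
    return time, bounds


time, time_bnds = monthly_time_bounds(1950, 2020)
print(UNITS, "|", len(time), "months | first bounds:", time_bnds[0])
```

In a CF-formatted file, the `time` coordinate would carry the `UNITS` string as its `units` attribute and a `bounds` attribute naming the bounds variable.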

To contribute to this repo:

  1. Create a new issue describing the benchmark dataset you'd be adding, with any links or useful information in the body and an explanation of why it's a useful benchmarking dataset
  2. Fork ILAMB-Data
  3. Create a new folder to work in (we generally name it after the folks/project who made the dataset; name it whatever you like)
  4. Write your convert file inside the folder you created
  5. When finished, submit a pull request for us to review