uaf-arctic-eco-modeling / Input_production

This repository contains scripts to download, process and format input data for terrestrial model simulations.
MIT License

notes for CRU-JRA stuff #3

Open tobeycarman opened 2 months ago

tobeycarman commented 2 months ago

This is not ready for the readme yet, but I need to record it somewhere:

The CRU-JRA data is hosted by the [CEDA](https://archive.ceda.ac.uk) archive:

> The Centre for Environmental Data Analysis (CEDA) runs the UK’s national data centre for atmospheric and earth observation research, hosting Petabytes of data.

Much of the data there, including CRU-JRA, requires an account for access. Follow the instructions on the website to create an account. Once you have one, sign in and navigate through the site to the "Get Data" section. The web interface looks like it has been updated recently. There are several download methods for different datasets in the archive, including direct web download, FTP, and OPeNDAP.

The biggest problem with the CRU-JRA data for us is how it is stored. It is stored in NetCDF files, one file per year, and each file covers the full spatial domain (global). So there does not appear to be an easy way to grab only a spatial subset to work on. It looks like this data is not available over OPeNDAP; if it were, it would be possible to do the spatial subsetting on the server before downloading.
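Since the subsetting has to happen after download, here is a rough sketch of how a regional bounding box could be turned into array index ranges. It assumes a regular global grid with 0.5 degree cells, rows starting at -90 and columns starting at -180; none of that has been checked against the actual files, and the function name and the example box are made up for illustration.

```python
import math


def bbox_to_index_ranges(lat_min, lat_max, lon_min, lon_max, res=0.5):
    """Return ((row_start, row_stop), (col_start, col_stop)) for a lat/lon box.

    ASSUMPTION: regular global grid, first row edge at -90, first column
    edge at -180, cell size `res` degrees. Check the lat/lon coordinate
    variables in the real NetCDF files before relying on this.
    """
    row_start = math.floor((lat_min + 90.0) / res)
    row_stop = math.ceil((lat_max + 90.0) / res)
    col_start = math.floor((lon_min + 180.0) / res)
    col_stop = math.ceil((lon_max + 180.0) / res)
    return (row_start, row_stop), (col_start, col_stop)


def fraction_of_globe(rows, cols, res=0.5):
    """Fraction of global grid cells covered by the given index ranges."""
    n_rows_global = round(180.0 / res)
    n_cols_global = round(360.0 / res)
    n_cells = (rows[1] - rows[0]) * (cols[1] - cols[0])
    return n_cells / (n_rows_global * n_cols_global)


# Hypothetical box, roughly covering Alaska.
rows, cols = bbox_to_index_ranges(50.0, 72.0, -170.0, -130.0)
print(rows, cols, f"{fraction_of_globe(rows, cols):.1%} of global cells")
```

The point is mostly that a regional subset is a small fraction of each global file, so subsetting right after each download and deleting the global file would keep local disk use manageable.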

Each file is roughly 230 MB and there are 122 files in the time series, so one variable is about 122 × 230 ≈ 28,000 MB, or 28 GB. We need 4 variables, so roughly 112 GB in total. Ouch.
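The same back-of-the-envelope estimate as a few lines of Python, so it is easy to redo if the file size or number of variables changes. The numbers are the rough ones from above (not measured), and it uses 1000 MB per GB to match.

```python
# Rough figures from the notes above, not measured values.
MB_PER_FILE = 230         # approximate size of one yearly NetCDF file
FILES_PER_VARIABLE = 122  # one file per year in the time series
N_VARIABLES = 4           # number of variables we need

mb_per_variable = MB_PER_FILE * FILES_PER_VARIABLE
total_mb = mb_per_variable * N_VARIABLES

print(f"{mb_per_variable} MB (~{mb_per_variable / 1000:.1f} GB) per variable")
print(f"{total_mb} MB (~{total_mb / 1000:.1f} GB) for {N_VARIABLES} variables")
```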

The data in the archive is at this path:

badc/cru/data/cru_jra

where several versions are available. It looks like version 2.5 was published in summer 2024. Permalink here: https://catalogue.ceda.ac.uk/uuid/43ce517d74624a5ebf6eec5330cd18d5/

The FTP server address: ftp.ceda.ac.uk

Web path to the `pre` directory for version 2.5: https://dap.ceda.ac.uk/badc/cru/data/cru_jra/cru_jra_2.5/data/pre/
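A minimal sketch of pulling one variable over FTP with Python's standard `ftplib`, given the server address and archive path above. This is untested against the real server. It assumes the FTP directory layout mirrors the web path, that plain username/password login works with the CEDA account, and that the data files match the `*.nc*` pattern; the function names and the pattern are placeholders.

```python
from fnmatch import fnmatch
from ftplib import FTP
from pathlib import Path

FTP_HOST = "ftp.ceda.ac.uk"
ARCHIVE_ROOT = "/badc/cru/data/cru_jra"


def remote_dir(version, variable):
    """Build the archive directory for one version and variable.

    ASSUMPTION: the FTP layout mirrors the web path seen on dap.ceda.ac.uk.
    """
    return f"{ARCHIVE_ROOT}/cru_jra_{version}/data/{variable}"


def download_variable(user, password, version, variable, dest, pattern="*.nc*"):
    """Download every file matching `pattern` for one variable into `dest`.

    ASSUMPTION: `pattern` is a guess at the file naming; list the
    directory first and adjust it. Files already present are skipped.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with FTP(FTP_HOST) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.cwd(remote_dir(version, variable))
        for name in sorted(ftp.nlst()):
            target = dest / name
            if not fnmatch(name, pattern) or target.exists():
                continue
            # Write to a temporary name so an interrupted transfer
            # is not mistaken for a complete file on the next run.
            partial = dest / (name + ".part")
            with open(partial, "wb") as fh:
                ftp.retrbinary(f"RETR {name}", fh.write)
            partial.rename(target)
```

Usage would be something like `download_variable("my_user", "my_password", "2.5", "pre", "data/cru_jra/pre")`, repeated for each of the 4 variables.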

General description of JRA methods: https://dap.ceda.ac.uk/badc/cru/data/cru_jra/cru_jra_1.1/data/CRUJRA_V1.1_Read_me.txt