leap-stc / data-management

Collection of code to manually populate the persistent cloud bucket with data
https://catalog.leap.columbia.edu/
Apache License 2.0

CESM2 CAM6 present day PPE Monthly Average (LWCF and SWCF) #116

Open yiqioyang opened 2 months ago

yiqioyang commented 2 months ago

Dataset Name

CESM2 CAM6 present day PPE Monthly Average (LWCF and SWCF)

Dataset URL

https://www.earthsystemgrid.org/dataset/ucar.cgd.cesm2.cam6.ppe.pd_timeseries.atm.hist.monthly_ave.LWCF/file.html and https://www.earthsystemgrid.org/dataset/ucar.cgd.cesm2.cam6.ppe.pd_timeseries.atm.hist.monthly_ave.SWCF/file.html

Description

I would like to download two sets of NetCDF files, which can be found at https://www.earthsystemgrid.org/dataset/ucar.cgd.cesm2.cam6.ppe.pd_timeseries.atm.hist.monthly_ave.LWCF/file.html and https://www.earthsystemgrid.org/dataset/ucar.cgd.cesm2.cam6.ppe.pd_timeseries.atm.hist.monthly_ave.SWCF/file.html

From each link, I hope to download all 262 files. Once all files on a page are selected, the Python download script is available under "Download Options for Selection" on that page.

Size

2 × 262 × 7.61 MB

License

Unknown

Data Format

NetCDF

Data Format (other)

No response

Access protocol

HTTP(S)

Source File Organization

Each nc file contains one variable (LWCF or SWCF) from one simulation, with time and space (lat and lon) dimensions.

Example URLs

https://tds.ucar.edu/thredds/fileServer/datazone/campaign/cgd/projects/ppe/cam_ppe/rerun_PPE_250/PD/PD_timeseries/PPE_250_ensemble_PD.000/atm/hist/cc_PPE_250_ensemble_PD.000.h0.SWCF.nc?api-token=TGLy7EMWBpimnosAbrfUyvvlyAHRvHUjzrna0Mqz
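For illustration, the example URL above could be turned into a template for the other files. This is only a sketch: the template is inferred from that single URL, `URL_TEMPLATE` and `make_url` are hypothetical names, and it is an assumption that every run and both variables follow the same layout. The `api-token` query string is left out, and whether the server accepts requests without it has not been checked here.

```python
# Template inferred from the one example URL in this issue. It is an
# assumption that all runs and both variables follow the same layout.
# The "?api-token=..." query string is intentionally omitted.
URL_TEMPLATE = (
    "https://tds.ucar.edu/thredds/fileServer/datazone/campaign/cgd/projects/ppe/"
    "cam_ppe/rerun_PPE_250/PD/PD_timeseries/PPE_250_ensemble_PD.{member:03d}/"
    "atm/hist/cc_PPE_250_ensemble_PD.{member:03d}.h0.{variable}.nc"
)


def make_url(member: int, variable: str) -> str:
    """Build the (assumed) download URL for one run index and one variable."""
    if variable not in ("LWCF", "SWCF"):
        raise ValueError(f"unexpected variable: {variable!r}")
    if member < 0:
        raise ValueError(f"run index must be non-negative, got {member}")
    return URL_TEMPLATE.format(member=member, variable=variable)


# Reproduces the example URL (minus the token) for run 000, variable SWCF.
print(make_url(0, "SWCF"))
```

The list of valid run indices is not stated in this issue, so the caller has to supply it.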

Authorization

No; data are fully public

Transformation / Processing

The file name carries useful information. For example, in the file name "cc_PPE_250_ensemble_PD.000.h0.LWCF.nc", the "000" after "PD." is the index of the run (the first run in this case).
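A minimal sketch of how the run index could be read from a file name, assuming every file follows the same layout as the one example given here. The regular expression and the function name `parse_ppe_filename` are illustrative, not part of any existing tool.

```python
import re

# Pattern inferred from the single example file name in this issue;
# other files are assumed (not verified) to follow the same layout.
FILENAME_RE = re.compile(
    r"cc_PPE_250_ensemble_PD\.(?P<member>\d{3})\.h0\.(?P<variable>[A-Za-z0-9_]+)\.nc$"
)


def parse_ppe_filename(name: str) -> tuple[int, str]:
    """Return (run_index, variable) parsed from a file name or URL."""
    # Drop any query string (e.g. "?api-token=...") before matching.
    path = name.split("?", 1)[0]
    match = FILENAME_RE.search(path)
    if match is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    return int(match.group("member")), match.group("variable")


print(parse_ppe_filename("cc_PPE_250_ensemble_PD.000.h0.LWCF.nc"))  # (0, 'LWCF')
```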

If the 262 files are stacked together, it would be great to keep this information in the data (e.g., by adding a new dimension to the xarray object). Otherwise, it is also fine to download them as they are, since I have code to post-process them. Hopefully neither option is too complicated!
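To make the stacking idea concrete, here is a rough sketch using `xarray.concat` with a new dimension that holds the run index. The dimension name `member`, the helper `stack_members`, and the small synthetic datasets are all assumptions for illustration. The real files would be opened from disk or from the bucket instead.

```python
import numpy as np
import xarray as xr


def stack_members(datasets_by_member: dict[int, xr.Dataset], dim: str = "member") -> xr.Dataset:
    """Concatenate per-run datasets along a new dimension labelled by run index."""
    members = sorted(datasets_by_member)
    stacked = xr.concat([datasets_by_member[m] for m in members], dim=dim)
    # Attach the run indices as the coordinate of the new dimension.
    return stacked.assign_coords({dim: members})


def fake_run(seed: int) -> xr.Dataset:
    """Synthetic stand-in for one LWCF file with (time, lat, lon) dimensions."""
    rng = np.random.default_rng(seed)
    return xr.Dataset(
        {"LWCF": (("time", "lat", "lon"), rng.random((2, 3, 4)))},
        coords={
            "time": [0, 1],
            "lat": [-45.0, 0.0, 45.0],
            "lon": [0.0, 90.0, 180.0, 270.0],
        },
    )


# Keys play the role of the index parsed from each file name.
runs = {1: fake_run(1), 0: fake_run(0)}
stacked = stack_members(runs)
print(stacked["LWCF"].sizes)
```

Selecting a single run afterwards is then `stacked.sel(member=0)`, which keeps the link between the data and the index in the original file name.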

Target Format

Zarr

Comments

Please let me know if I wasn't clear or if there are any issues downloading the data.

jbusecke commented 1 month ago

Hi @yiqioyang, thanks for raising this issue. The example URL you provided works for me (which means it is likely going to work for pangeo-forge). Unfortunately I am not able to access the full file list (the top link). But you are the expert on this dataset, so I would suggest we start setting up a feedstock for this dataset.

Could you go to https://github.com/leap-stc/LEAP_template_feedstock and follow the instructions there to build a basic pangeo-forge recipe? Please feel free to tag me on issues there if you get stuck, and we will go from there.