jkingslake closed this issue 3 years ago
Jonny, thanks a lot for sharing this.
This pangeo-datastore repository has become sort of unmaintained. The reason is that we are moving to a new, much more ambitious and comprehensive platform for populating the cloud data library called Pangeo Forge. You can read about it here: https://pangeo-forge.readthedocs.io/
The main difference is that Pangeo Forge will actually build the Zarr dataset in the cloud out of the original sources. This resolves one of the central problems with the old approach: the breaking of the provenance chain from the original data (in your case, from PANGAEA) to the cloud-optimized format.
We would LOVE to create a Pangeo Forge recipe for this dataset. To get the process started, it would be awesome if someone could open up an issue here: https://github.com/pangeo-forge/staged-recipes/issues
In the meantime, there is not much point in adding entries to this catalog. It will be shut down soon.
Just noting something quite interesting. The original data for this are stored in a giant Zip file: https://hs.pangaea.de/model/PISM/Albrecht-etal_2019/parameter-ensemble/Part2_pism_paleo_ensemble_v2.zip.
However, I was able to easily open it and load files directly using fsspec.
import xarray as xr
from fsspec.implementations.zip import ZipFileSystem

# Treat the remote Zip archive as a read-only filesystem
url = "https://hs.pangaea.de/model/PISM/Albrecht-etal_2019/parameter-ensemble/Part2_pism_paleo_ensemble_v2.zip"
fs = ZipFileSystem(url)
fs.ls("datapub")  # -> list the files

# Open one snapshot straight from the archive and load it while the file is still open
with fs.open('datapub/model_data/pism1.0_paleo06_6255/snapshots_-10000.000.nc') as fp:
    ds = xr.open_dataset(fp)
    ds.load()

# Plot the thk variable of the loaded snapshot
ds.thk.plot()
So it may be quite easy to get the recipe going.
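For anyone who wants to try the same access pattern without downloading anything, here is a minimal offline sketch. It assumes that ZipFileSystem also accepts an already-open file-like object in place of a URL, and it builds a tiny in-memory archive whose member names (run_a, run_b, README.txt) and contents are made-up placeholders, not the real PANGAEA files. It only shows how the files inside an archive can be enumerated and read in place, which is roughly the inventory a recipe would need.

```python
import io
import zipfile

from fsspec.implementations.zip import ZipFileSystem

# Build a tiny in-memory archive that loosely mimics the layout above.
# All member names and contents are hypothetical placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr("datapub/model_data/run_a/snapshots_-10000.000.nc", b"placeholder a")
    zf.writestr("datapub/model_data/run_b/snapshots_-10000.000.nc", b"placeholder b")
    zf.writestr("datapub/README.txt", b"placeholder readme")
buffer.seek(0)

# Wrap the file-like object instead of a URL string.
fs = ZipFileSystem(buffer)

# Recursively list every file under datapub, then keep only the .nc members.
all_files = fs.find("datapub")
snapshots = sorted(p for p in all_files if p.endswith(".nc"))

# Read one member directly from the archive, without extracting it to disk.
with fs.open(snapshots[0], "rb") as fp:
    first_bytes = fp.read()

print(snapshots)
print(first_bytes)
```

Against the real archive, the same find-and-filter step would presumably be run on the filesystem created from the PANGAEA URL, with each resulting path passed to xr.open_dataset as in the snippet above.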
Hello,
@talbrecht and I recently put output from his ensemble of ice-sheet model simulations of the Antarctic Ice Sheet over the last 120 ka on Google Cloud Storage (GCS).
Data citation: Albrecht, Torsten (2019): PISM parameter ensemble analysis of Antarctic Ice Sheet glacial cycle simulations. PANGAEA, https://doi.pangaea.de/10.1594/PANGAEA.909728
Here is a notebook showing how to access these data in pangeo.
Would this be an appropriate dataset to include in the Pangeo catalog?
This dataset is already part of an [intake catalog](https://raw.githubusercontent.com/ldeo-glaciology/pangeo-pismpaleo/main/paleopism.yaml), so I assume that it would simply be a case of adding a cryo.yaml here that points to this dataset and potentially others. The data can stay in our Google bucket and would only need to become 'requester pays' if we get a lot of interest, which would be great!
If this is something people would like to see happen (and my assumption about what it entails is correct), I can make a PR.