Closed by raehik 1 year ago
Related: the process of obtaining GCP credentials to use a requester-pays bucket isn't all that straightforward, even for a previous GCP user. Should we provide some notes on what to do there?
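For reference, a minimal sketch of what such notes might cover, assuming access via gcsfs and Application Default Credentials; the project ID and bucket path below are placeholders, not the real ones:

```python
import gcsfs

# Assumes `gcloud auth application-default login` has already been run, and that
# "my-billing-project" is a placeholder for a GCP project that egress can be billed to.
fs = gcsfs.GCSFileSystem(
    project="my-billing-project",
    requester_pays=True,  # needed for requester-pays buckets
)

# Listing the bucket is a quick check that credentials and billing are wired up.
print(fs.ls("gs://example-requester-pays-bucket/"))
```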
The GCP creds notes were added in #60 -- they worked for onboarding @MarionBWeinzierl.
Related: we want to place the processed data (the output of the data processing step) on Hugging Face. That'd be a nice first step. See #74.
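As a rough sketch of what that could look like with huggingface_hub (the repo ID and local folder here are placeholders, not agreed-upon names):

```python
from huggingface_hub import HfApi

# Assumes you are authenticated (e.g. via `huggingface-cli login`).
api = HfApi()
api.create_repo("example-org/example-processed-data", repo_type="dataset", exist_ok=True)
api.upload_folder(
    folder_path="data/processed",  # local output of the data processing step
    repo_id="example-org/example-processed-data",
    repo_type="dataset",
)
```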
Tentatively closed along with #74.
The data is currently hosted on Pangeo. It is not the training data as such, as some pre-processing still needs to be applied. If someone wants to retrain, we need to provide guidance on how to obtain this data and process it. We don't necessarily need to move the data from where it is currently hosted.
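As a sketch of what that guidance could cover, assuming the data is exposed as a zarr store on GCS (the store path and billing project below are placeholders):

```python
import gcsfs
import xarray as xr

# Placeholder store path and billing project; the real location is whatever the
# Pangeo-hosted dataset resolves to.
fs = gcsfs.GCSFileSystem(project="my-billing-project", requester_pays=True)
store = fs.get_mapper("gs://example-bucket/cm2_6/surface")

ds = xr.open_zarr(store)  # lazy: nothing is downloaded until data is accessed
print(ds)

# The repository's own pre-processing (e.g. the step driven by cmip26.py) would
# then be applied to `ds` to produce the actual training data.
```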
The dataset referenced in `cmip26.py` lives in a requester-pays Google Cloud bucket. Nice for the host, but extremely inconvenient for users (especially without an explanation -- gcloud's error message is very confusing). Ideally, we would provide access to example data to use with this software. There are two separate types of data that seem required: