google-research / arco-era5

Recipes for reproducing Analysis-Ready & Cloud Optimized (ARCO) ERA5 datasets.
https://cloud.google.com/storage/docs/public-datasets/era5
Apache License 2.0

Automating updates for ARCO-ERA5 in BigQuery. #57

Closed dabhicusp closed 10 months ago

dabhicusp commented 11 months ago

This script works in four phases:

  1. Download Raw Data: First, it downloads the raw data for the AR and CO corpora using the weather-dl tool.
  2. Data Splitting: Next, it splits the soil and pcp variables so that the split variables can be ingested into the single-level-reanalysis and single-level-surface Zarr files of the CO corpus.
  3. Zarr File Ingestion: After splitting, it ingests the data into the Zarr store in two sub-steps:
    1. Resize the Zarr store to make room for the new data.
    2. Write the actual data into the resized Zarr store.
  4. [WIP] Data Integration into BigQuery: Once all of the steps above are complete, it ingests the data into BigQuery using the weather-mv tool.
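
For reviewers, here is a minimal sketch of how the first three phases fit together, with the grow-then-write pattern of phase 3 made explicit. It is not the code in this PR: `ToyStore`, `download_raw`, `split_variables`, `ingest`, and `run_update` are hypothetical names, the sample values are made up, and an in-memory list stands in for a real Zarr store so the example runs with the standard library only. The weather-dl and weather-mv calls are deliberately not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class ToyStore:
    """In-memory stand-in (assumption) for one Zarr variable with a time axis."""
    values: List[float] = field(default_factory=list)

    def resize(self, new_length: int) -> None:
        # Sub-step 1: grow the store, padding the new slots with NaN.
        if new_length < len(self.values):
            raise ValueError("this sketch only grows the store")
        self.values.extend([float("nan")] * (new_length - len(self.values)))

    def write_region(self, start: int, data: Sequence[float]) -> None:
        # Sub-step 2: fill the pre-allocated region with the actual data.
        if start + len(data) > len(self.values):
            raise ValueError("region exceeds the store; resize first")
        self.values[start:start + len(data)] = list(data)


def download_raw() -> List[Dict[str, float]]:
    # Phase 1 placeholder: the real script uses weather-dl here.
    # The records below are made-up sample values.
    return [{"soil": 0.31, "pcp": 1.2}, {"soil": 0.29, "pcp": 0.0}]


def split_variables(records: List[Dict[str, float]],
                    names: Sequence[str]) -> Dict[str, List[float]]:
    # Phase 2: one time series per variable.
    return {name: [record[name] for record in records] for name in names}


def ingest(store: ToyStore, new_data: Sequence[float]) -> None:
    # Phase 3: resize first, then write into the newly added region.
    start = len(store.values)
    store.resize(start + len(new_data))
    store.write_region(start, new_data)


def run_update(stores: Dict[str, ToyStore]) -> Dict[str, ToyStore]:
    raw = download_raw()
    per_variable = split_variables(raw, ("soil", "pcp"))
    for name, series in per_variable.items():
        ingest(stores[name], series)
    # Phase 4 (BigQuery via weather-mv) is still WIP, so it is omitted here.
    return stores
```

Separating the resize from the write means a failed write leaves NaN placeholders in a store of the right shape, so the write can be retried for the same region without resizing again.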