google-research / arco-era5

Recipes for reproducing Analysis-Ready & Cloud Optimized (ARCO) ERA5 datasets.
https://cloud.google.com/storage/docs/public-datasets/era5
Apache License 2.0

Blocked from accessing data #67

Closed loliverhennigh closed 8 months ago

loliverhennigh commented 9 months ago

Hey,

I was using ARCO ERA5 to generate a training dataset for neural networks. I was pulling the data last night, and after pulling ~200 GB I started getting the error below. I'm not super familiar with Google Cloud Storage, but are there limits on how much data we can pull from ARCO ERA5?

    raise HttpError({"code": status, "message": msg})  # text-like
gcsfs.retry.HttpError: <html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"/><title>Sorry...</title><style> body { font-family: verdana, arial, sans-serif; background-color: #fff; color: #000; }</style></head><body><div><table><tr><td><b><font face=sans-serif size=10><font color=#4285f4>G</font><font color=#ea4335>o</font><font color=#fbbc05>o</font><font color=#4285f4>g</font><font color=#34a853>l</font><font color=#ea4335>e</font></font></b></td><td style="text-align: left; vertical-align: bottom; padding-bottom: 15px; width: 50%"><div style="border-bottom: 1px solid #dfdfdf;">Sorry...</div></td></tr></table></div><div style="margin-left: 4em;"><h1>We're sorry...</h1><p>... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.</p></div><div style="margin-left: 4em;">See <a href="https://support.google.com/websearch/answer/86640">Google Help</a> for more information.<br/><br/></div><div style="text-align: center; border-top: 1px solid #dfdfdf;"><a href="https://www.google.com">Google Home</a></div></body></html>, 429
[2023-12-15 10:08:15,139][gcsfs][ERROR] - _request out of retries on exception: <html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"/><title>Sorry...</title><style> body { font-family: verdana, arial, sans-serif; background-color: #fff; color: #000; }</style></head><body><div><table><tr><td><b><font face=sans-serif size=10><font color=#4285f4>G</font><font color=#ea4335>o</font><font color=#fbbc05>o</font><font color=#4285f4>g</font><font color=#34a853>l</font><font color=#ea4335>e</font></font></b></td><td style="text-align: left; vertical-align: bottom; padding-bottom: 15px; width: 50%"><div style="border-bottom: 1px solid #dfdfdf;">Sorry...</div></td></tr></table></div><div style="margin-left: 4em;"><h1>We're sorry...</h1><p>... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now.</p></div><div style="margin-left: 4em;">See <a href="https://support.google.com/websearch/answer/86640">Google Help</a> for more information.<br/><br/></div><div style="text-align: center; border-top: 1px solid #dfdfdf;"><a href="https://www.google.com">Google Home</a></div></body></html>, 429

Minimal example that reproduces the error for me:

from pathlib import Path

import fsspec
import xarray as xr

# Open the ARCO ERA5 Zarr store from the public GCS bucket
fs = fsspec.filesystem('gs')
arco_filename = 'gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3'
arco_mapper = fs.get_mapper(arco_filename)
arco_era5 = xr.open_zarr(arco_mapper, consolidated=True)

save_dir = Path('./')
var_name = "10m_u_component_of_wind"
zarr_path = save_dir / f"{var_name}.zarr"

# Build the (lazy) write of a single variable to a local Zarr store
delayed_obj = arco_era5[var_name].to_zarr(zarr_path, consolidated=True, compute=False)

# Trigger the download and write; blocks until the save finishes
delayed_obj.compute()

*Fixed typo in example

loliverhennigh commented 8 months ago

This issue no longer seems to occur. I'll reopen it if it comes up again, though.
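
For anyone who runs into the same 429 later: the log above shows gcsfs giving up after its own retries, so one possible workaround is an outer retry with longer exponential backoff around each unit of work. The sketch below is only an illustration, not something confirmed by the maintainers. `is_rate_limited` and `retry_with_backoff` are hypothetical helper names, and the check assumes the exception carries the HTTP status in a `code` attribute (the traceback builds `HttpError` from a dict with a "code" key), falling back to a crude search of the message text. Whether waiting actually clears the block depends on what triggered the throttling, which this thread never established.

```python
import random
import time


def is_rate_limited(exc):
    """Return True if the exception looks like an HTTP 429 response.

    Assumption: the status is exposed as a `code` attribute. If it is not,
    fall back to looking for "429" in the message, which can false-positive.
    """
    code = getattr(exc, "code", None)
    if code is not None:
        return str(code) == "429"
    return "429" in str(exc)


def retry_with_backoff(fn, max_attempts=6, base_delay=1.0, max_delay=120.0,
                       should_retry=is_rate_limited, sleep=time.sleep):
    """Call fn() and retry on rate-limit errors with exponential backoff.

    The wait before retry k is drawn uniformly from
    [0, min(max_delay, base_delay * 2**(k-1))] ("full jitter").
    Non-retryable errors, and the error from the last attempt, are re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts or not should_retry(exc):
                raise
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

With the example from the first comment this could be called as `retry_with_backoff(delayed_obj.compute)`, although retrying the whole computation repeats work that already succeeded, so wrapping smaller pieces (for example one time range at a time) is probably the better fit.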