Status: Closed (rsignell-usgs closed this issue 6 years ago)
I've written a good chunk of NWM data to Zarr on S3 in the bucket rsignell/nwm/test04, with variables in the 60-100GB range:
<xarray.Dataset>
Dimensions: (reference_time: 961, time: 961, x: 4608, y: 3840)
Coordinates:
* reference_time (reference_time) datetime64[ns] 2018-03-02 ...
* time (time) datetime64[ns] 2018-03-02T01:00:00 ...
* x (x) float64 -2.304e+06 -2.303e+06 -2.302e+06 -2.301e+06 ...
* y (y) float64 -1.92e+06 -1.919e+06 -1.918e+06 -1.917e+06 ...
Data variables:
LWDOWN (time, y, x) float64 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
PSFC (time, y, x) float64 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
Q2D (time, y, x) float64 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
RAINRATE (time, y, x) float32 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
SWDOWN (time, y, x) float64 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
T2D (time, y, x) float64 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
U2D (time, y, x) float64 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
V2D (time, y, x) float64 dask.array<shape=(961, 3840, 4608), chunksize=(1, 3840, 4608)>
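For reference, the uncompressed footprint implied by the shapes and dtypes in the repr above can be worked out with a few lines of arithmetic. This is only a sketch: the helper name `nbytes` is hypothetical, and the post does not say how the 60-100GB figure was measured (it may refer to compressed size on S3, which this does not model).

```python
from math import prod

# Shapes copied from the Dataset repr: (time, y, x)
shape = (961, 3840, 4608)
chunk = (1, 3840, 4608)          # one chunk per time step
itemsize = {"float64": 8, "float32": 4}  # bytes per element


def nbytes(dims, dtype):
    """Uncompressed size in bytes of an array with these dims and dtype."""
    return prod(dims) * itemsize[dtype]


for dtype in ("float64", "float32"):
    total_gb = nbytes(shape, dtype) / 1e9
    chunk_mb = nbytes(chunk, dtype) / 1e6
    print(f"{dtype}: {total_gb:.1f} GB per variable, {chunk_mb:.1f} MB per chunk")
# float64: 136.0 GB per variable, 141.6 MB per chunk
# float32: 68.0 GB per variable, 70.8 MB per chunk
```

With a chunk size of (1, 3840, 4608), each variable is stored as 961 chunks, one per time step.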
Here's a plot of the mean temperature (image not reproduced here), computed by this notebook: https://gist.github.com/rsignell-usgs/a55c5d825467e8ce118462e8a39965ad
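A minimal sketch of how a mean like this might be computed from the Zarr store follows. It is not the linked notebook's code. It assumes that `s3fs` and `xarray` are installed, that the bucket allows anonymous reads (`anon=True`), that `T2D` is the temperature variable, and that the mean is taken over `time`. The function names `open_nwm` and `time_mean` are hypothetical.

```python
def open_nwm(path="rsignell/nwm/test04", anon=True):
    """Open the Zarr store lazily; only metadata is read at this point."""
    # Third-party imports are kept inside the function so the module
    # can be loaded even where s3fs/xarray are not installed.
    import s3fs
    import xarray as xr

    fs = s3fs.S3FileSystem(anon=anon)  # assumption: public read access
    store = fs.get_mapper(path)        # key-value view of the bucket prefix
    return xr.open_zarr(store)


def time_mean(ds, name="T2D"):
    """Build the (lazy) mean over time of one variable."""
    # Assumption: the plotted mean is over the time dimension.
    return ds[name].mean(dim="time")
```

Usage would look like `time_mean(open_nwm()).compute()`. The final `.compute()` is what triggers reading all 961 chunks, which is where extra Dask workers would help.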
We could use some more cores! 😜
BTW, the Python script I used to write this is here:
https://gist.github.com/rsignell-usgs/df7b936f28f2212f80872a7f30098680
My AWS credentials to write to this bucket were stored in ~/.aws/config
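For anyone reproducing this, here is a small sketch for checking that a config file in the `~/.aws/config` format has static credentials in its default profile. The sample text uses placeholder values, not real keys, and the helper name `has_static_credentials` is hypothetical. It only checks that the two key names are present and never prints their values.

```python
import configparser

# Placeholder content in the ~/.aws/config format (values are not real).
SAMPLE = """\
[default]
region = us-east-1
aws_access_key_id = AKIAEXAMPLEEXAMPLE
aws_secret_access_key = example-secret-not-real
"""


def has_static_credentials(text, section="default"):
    """Return True if the section defines both access key fields."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        return False
    keys = ("aws_access_key_id", "aws_secret_access_key")
    return all(parser.has_option(section, key) for key in keys)


print(has_static_credentials(SAMPLE))
```

To check a real file, pass in its contents, for example `pathlib.Path("~/.aws/config").expanduser().read_text()`.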
Write 80GB or so of NWS data to Zarr format on S3. This should be sufficient for initial testing and demos.