Open cisaacstern opened 1 year ago
The second production deployment began about ten minutes ago. I'll follow up here with updates.
This failed on another malformed URL issue (really a missing-data issue), which had slipped through the cracks due to an error in the unit tests. I fixed this in #5 and redeployed on merge of that PR. 🤞 Updates to follow.
The third deployment failed with the error reported in #6, which hopefully also fixes it. The build has now been deployed a fourth time; updates to follow.
The fourth deployment proceeded well, caching 2655 of roughly 2900 files before stalling for unclear reasons.
I've just re-deployed from #7.
🎉 Success!
import xarray as xr
p = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/aqua-modis-feedstock/aqua-modis-682286948-6702057605-1/aqua-modis.zarr"
ds = xr.open_dataset(p, engine="zarr", chunks={})
ds.nbytes/1e9 # --> 757.954773176 GB
ds
<xarray.Dataset>
Dimensions: (time: 967, lat: 4320, lon: 8640)
Coordinates:
* lat (lat) float32 89.98 89.94 89.9 89.85 ... -89.9 -89.94 -89.98
* lon (lon) float32 -180.0 -179.9 -179.9 -179.9 ... 179.9 179.9 180.0
* time (time) datetime64[ns] 2002-07-04 2002-07-12 ... 2023-07-20
Data variables:
bbp_443 (time, lat, lon) float64 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
chlor_a (time, lat, lon) float32 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
qual_sst (time, lat, lon) uint8 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
sst (time, lat, lon) float64 dask.array<chunksize=(1, 2160, 4320), meta=np.ndarray>
Attributes: (12/39)
Conventions: CF-1.6 ACDD-1.3
cdm_data_type: grid
creator_email: data@oceancolor.gsfc.nasa.gov
creator_name: NASA/GSFC/OBPG
creator_url: https://oceandata.sci.gsfc.nasa.gov
easternmost_longitude: 180.0
... ...
standard_name_vocabulary: CF Standard Name Table v36
suggested_image_scaling_applied: No
sw_point_latitude: -89.97916412353516
sw_point_longitude: -179.9791717529297
title: MODISA Level-3 Standard Mapped Image
westernmost_longitude: -180.0
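As a sanity check on the size reported by `ds.nbytes`, the figure can be reproduced by hand from the dimensions, dtypes, and chunk shape shown in the repr above. This is only a back-of-the-envelope sketch: the variable names and numbers are copied from the repr, the byte widths are the standard ones for `float64`, `float32`, `uint8`, and `datetime64[ns]`, and the chunk counts assume every variable uses the `(1, 2160, 4320)` chunking shown.

```python
from math import ceil

# Dimension sizes copied from the dataset repr.
dims = {"time": 967, "lat": 4320, "lon": 8640}

# Bytes per element for each data variable (float64=8, float32=4, uint8=1).
var_itemsize = {"bbp_443": 8, "chlor_a": 4, "qual_sst": 1, "sst": 8}

# Bytes per element for each coordinate (float32=4, datetime64[ns]=8).
coord_itemsize = {"lat": 4, "lon": 4, "time": 8}

# Every data variable spans the full (time, lat, lon) grid.
cells = dims["time"] * dims["lat"] * dims["lon"]
data_bytes = cells * sum(var_itemsize.values())
coord_bytes = sum(dims[name] * size for name, size in coord_itemsize.items())
total_bytes = data_bytes + coord_bytes  # 757954773176 bytes

print(f"uncompressed size: {total_bytes / 1e9} GB")

# Chunk bookkeeping for the (1, 2160, 4320) chunks shown in the repr.
chunk_shape = {"time": 1, "lat": 2160, "lon": 4320}
chunk_cells = chunk_shape["time"] * chunk_shape["lat"] * chunk_shape["lon"]
chunks_per_variable = 1
for name, size in dims.items():
    chunks_per_variable *= ceil(size / chunk_shape[name])

for name, size in var_itemsize.items():
    print(f"{name}: {chunk_cells * size / 1e6:.1f} MB per chunk, "
          f"{chunks_per_variable} chunks")
```

The total is the in-memory (uncompressed) size, so the bytes actually stored in the Zarr store may be considerably smaller.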
Opening this issue as a place to track progress of production deployments on GCP Dataflow. So far:
The `make_dates` function had some inaccurate assumptions built into it.
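For illustration only, here is a sketch of how 8-day composite start dates could be generated. It is not the feedstock's actual `make_dates`, whose implementation and faulty assumptions are not shown in this thread. The function name `eight_day_starts` is hypothetical, and the sketch assumes that the 8-day periods restart on January 1 of every year, which is consistent with the `time` coordinate above (2002-07-04, 2002-07-12, and 2023-07-20 all fall on day-of-year 1 + 8k).

```python
from datetime import date, timedelta


def eight_day_starts(first: date, last: date) -> list[date]:
    """Return 8-day period start dates between first and last, inclusive.

    Assumption (not taken from the recipe): periods restart on January 1
    of each year, so the final period of a year is shorter than 8 days.
    """
    starts = []
    for year in range(first.year, last.year + 1):
        current = date(year, 1, 1)
        # Step through the year in 8-day strides, never crossing into the next year.
        while current.year == year:
            if first <= current <= last:
                starts.append(current)
            current += timedelta(days=8)
    return starts


dates = eight_day_starts(date(2002, 7, 4), date(2023, 7, 20))
print(dates[0], dates[1], dates[-1], len(dates))
```

Under this assumption the range yields 969 candidate dates, two more than the 967 time steps in the dataset, so a purely calendar-based generator would still need to account for dates with no file behind them.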