Time the query using the Zarr-encoded data stored on S3.
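
A minimal sketch of what this timing could look like, assuming the store lives at a path like `s3://example-bucket/nwm.zarr` and has a `streamflow` variable indexed by `feature_id` and `time` (the bucket, path, and variable names here are placeholders, not the real layout):

```python
import time

import s3fs
import xarray as xr

fs = s3fs.S3FileSystem(anon=True)
store = s3fs.S3Map(root="example-bucket/nwm.zarr", s3=fs)  # hypothetical path

start = time.perf_counter()
ds = xr.open_zarr(store, consolidated=True)
# Example query: mean streamflow over time for a handful of reaches.
result = ds["streamflow"].sel(feature_id=[101, 102, 103]).mean(dim="time").compute()
elapsed = time.perf_counter() - start
print(f"Zarr query took {elapsed:.1f}s")
```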
(Not sure if we should do this.) Time the query using the naive approach, i.e. download the NetCDF files and do the calculation locally. This will be slow because we would be downloading huge amounts of irrelevant data, but it is probably worth doing as a baseline.
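
A hedged sketch of that baseline, again with a placeholder bucket/prefix and variable names: pull every NetCDF file under the prefix down from S3, relevant or not, then run the same query locally.

```python
import time
from pathlib import Path

import s3fs
import xarray as xr

fs = s3fs.S3FileSystem(anon=True)
local_dir = Path("netcdf_cache")
local_dir.mkdir(exist_ok=True)

start = time.perf_counter()
# Download every file under the (hypothetical) prefix, relevant or not.
for key in fs.ls("example-bucket/nwm-netcdf/"):
    fs.get(key, str(local_dir / Path(key).name))

ds = xr.open_mfdataset(str(local_dir / "*.nc"), combine="by_coords")
result = ds["streamflow"].sel(feature_id=[101, 102, 103]).mean(dim="time").compute()
elapsed = time.perf_counter() - start
print(f"Naive NetCDF baseline took {elapsed:.1f}s")
```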
Rechunk the data and save it on S3. Rechunking the whole dataset would likely require a huge amount of resources, so we should instead rechunk a large but manageable subset; it needs to be large enough to make the benchmarks realistic. This corresponds to https://github.com/azavea/noaa-hydro-data/issues/26.
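
A sketch of the subset-rechunk step using xarray + dask; the subset bounds, chunk sizes, and S3 paths below are all guesses to be tuned against the real dataset:

```python
import s3fs
import xarray as xr

fs = s3fs.S3FileSystem()
src = s3fs.S3Map(root="example-bucket/nwm.zarr", s3=fs)            # hypothetical
dst = s3fs.S3Map(root="example-bucket/nwm-rechunked.zarr", s3=fs)  # hypothetical

ds = xr.open_zarr(src, consolidated=True)

# Take a large-but-manageable subset, e.g. one year of data.
subset = ds.sel(time=slice("1990-01-01", "1990-12-31"))

# Rechunk to favor long time series per reach (sizes are guesses).
subset = subset.chunk({"time": 8760, "feature_id": 1000})

# Drop stale chunk encodings inherited from the source store so to_zarr
# doesn't complain about a chunk-shape mismatch.
for var in subset.variables.values():
    var.encoding.pop("chunks", None)

subset.to_zarr(dst, mode="w", consolidated=True)
```

If we ever do scale up to the full dataset, the `rechunker` library (which streams through an intermediate store with a bounded memory budget) may be a better fit than a single dask rechunk.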
This is a piece of #45.