Open lewfish opened 2 years ago
For sample benchmark #1, it's not clear whether these aggregations should be run for each stream individually or across all streams. It's also not clear whether "daily averages" means a single average over all days in the dataset or a separate average for each individual day. I would also like to know typical values for the number of HUC8s and the length of the date/time range.
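To make the ambiguity concrete, here is a minimal sketch of the different readings using only the Python standard library. The record layout, stream IDs, and values are all made up for illustration and are not taken from the benchmark dataset.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical records: (stream_id, timestamp, streamflow). Values are invented.
records = [
    ("stream_a", datetime(2020, 1, 1, 0), 1.0),
    ("stream_a", datetime(2020, 1, 1, 12), 3.0),
    ("stream_a", datetime(2020, 1, 2, 0), 5.0),
    ("stream_b", datetime(2020, 1, 1, 0), 10.0),
    ("stream_b", datetime(2020, 1, 2, 0), 20.0),
    ("stream_b", datetime(2020, 1, 2, 12), 40.0),
]


def group_mean(records, key):
    """Average the values of all records that share the same grouping key."""
    groups = defaultdict(list)
    for stream_id, timestamp, value in records:
        groups[key(stream_id, timestamp)].append(value)
    return {k: mean(v) for k, v in groups.items()}


# Reading 1: one average per stream per calendar day.
per_stream_per_day = group_mean(records, lambda s, t: (s, t.date()))

# Reading 2: one average per calendar day, pooled across all streams.
per_day_all_streams = group_mean(records, lambda s, t: t.date())

# Reading 3: one average per stream, pooled over every reading in the date range.
per_stream_all_days = group_mean(records, lambda s, t: s)
```

The three results have different shapes (stream x day, day only, stream only), so which one is intended changes both the output size and the amount of data each query has to read.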
While the most likely query pattern for gridded output data will be by HUCs, the performance of the Zarr library relative to the size of the query needs to be benchmarked. To that end, we will perform this benchmarking for several HUC 8, 10, and 12 regions. These will also be recorded as a sample documentation notebook in the project repository.