Preprocessing GRIBs should output files that will ultimately be easy to query and subset, so:
[x] When pivoting, add an integer ID for the model/iteration.
[x] Split the id into date, leadtime, variable, and product_type. Fully convert dates.
[x] Since we are more likely to query by time and variable than by spatial subset, sort by time, variable, product_type, then x and y. This means the longest contiguous runs are for single dates.
This sets things up for efficient Parquet (or similar) partitioning in the future. A small part of me thinks we should actually be converting into a dense format built for efficient spatio-temporal raster queries, like Cloud Optimized GeoTIFF, but that's for way later.
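
A rough sketch of the checklist steps above, using only the standard library. The id layout (`YYYYMMDDHH_leadhours_variable_producttype`), the record tuple shape, and the names `split_id`, `preprocess`, and `Row` are all assumptions made for illustration, not the real pipeline. It also assumes "time" means run date first, then leadtime.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Row(NamedTuple):
    date: datetime       # model run date, fully parsed
    leadtime: timedelta  # forecast lead time
    variable: str
    product_type: str
    x: int
    y: int
    model_id: int        # integer ID for the model/iteration
    value: float


def split_id(raw_id: str) -> tuple[datetime, timedelta, str, str]:
    """Split a HYPOTHETICAL id such as '2023010100_006_t2m_mean'.

    Assumes none of the four parts contains an underscore.
    """
    date_str, lead_str, variable, product_type = raw_id.split("_")
    date = datetime.strptime(date_str, "%Y%m%d%H")
    return date, timedelta(hours=int(lead_str)), variable, product_type


def preprocess(records):
    """records: iterable of (raw_id, model_name, x, y, value) tuples (assumed shape)."""
    model_ids: dict[str, int] = {}
    rows = []
    for raw_id, model_name, x, y, value in records:
        # Integer ID per model/iteration, numbered in order of first appearance.
        model_id = model_ids.setdefault(model_name, len(model_ids))
        date, lead, variable, product_type = split_id(raw_id)
        rows.append(Row(date, lead, variable, product_type, x, y, model_id, value))
    # Time first, then variable and product_type, with x and y varying fastest.
    rows.sort(key=lambda r: (r.date, r.leadtime, r.variable, r.product_type, r.x, r.y))
    return rows, model_ids


sample = [
    ("2023010200_006_t2m_mean", "ens01", 1, 0, 271.5),
    ("2023010100_006_t2m_mean", "ens01", 0, 0, 270.0),
    ("2023010100_006_prcp_mean", "ens02", 0, 0, 1.2),
]
rows, model_ids = preprocess(sample)
```

With this ordering, all rows for a single date sit next to each other, so a later partition-by-date step would only need to cut the sorted output at date boundaries.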