Closed by Mikejmnez 1 year ago
I will do this over the weekend as it will take a couple of days and I want to avoid issues when reading the files.
This is currently underway; see PR #326. It should be done later tonight. For now, velocity data is not available, since that data is being copied into that zarr store. I will close this issue with a follow-up PR that removes two entries from the catalog (scalars and forcing) and restores the velocity entry, which will then contain all field variables. The entry might get a different name.
Description
Currently, reading the LLC4320 sample dataset requires too many zarr stores, each associated with a Catalog entry. Initially we only had two zarr stores (one with S, Temp and Eta, and another with the GRID). Then I added the velocities in a separate zarr store, and lastly another separate zarr store with all forcing variables. Because reading from the catalog involves a for loop that iterates over all possible entries, the fewer zarr stores there are, the faster we can read the dataset via the Catalog.
Currently, on SciServer it takes about 2 minutes to execute the following snippet:
I think we can cut some time (maybe 30 seconds?) by having fewer files and thus fewer catalog entries.
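To illustrate why fewer entries should help, here is a rough sketch of the per-entry loop described above. It is not the snippet I timed and not the actual catalog-reading code: `open_all`, `fake_open_store`, and `timed` are hypothetical helpers, the `"fields"` entry name is made up, and the 0.01 s sleep is an arbitrary stand-in for the fixed cost of opening one zarr store.

```python
import time


def open_all(entries, open_store):
    """Open every catalog entry in turn and collect the results."""
    datasets = {}
    for name in entries:  # one open call (and its fixed overhead) per entry
        datasets[name] = open_store(name)
    return datasets


def fake_open_store(name, overhead=0.01):
    """Hypothetical stand-in for opening a zarr store.

    Sleeps to mimic a fixed per-store cost; the value is arbitrary.
    """
    time.sleep(overhead)
    return {"store": name}


def timed(entries):
    """Return the wall-clock seconds spent opening all entries."""
    start = time.perf_counter()
    open_all(entries, fake_open_store)
    return time.perf_counter() - start


# Current layout: four stores, four catalog entries.
four = ["grid", "scalars", "velocity", "forcing"]
# Consolidated layout (entry name "fields" is an assumption).
two = ["grid", "fields"]

print(f"4 entries: {timed(four):.3f} s")
print(f"2 entries: {timed(two):.3f} s")
```

If the per-store overhead really is roughly constant, the time spent in the loop scales with the number of entries, so halving the entries should roughly halve that part of the read time. How much of the 2 minutes is loop overhead versus actual data access is something I have not measured.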