After digging into the package and getting a sense of the structure of the data, I wonder whether we could store the PM2.5 data in a flat matrix.
Currently, we store the data downloaded from bluesky as an array of matrices, where each matrix is one 'slice' of the array. It may be more efficient to store and access the data if we flatten each matrix into a vector, so that each column of the resulting matrix is one slice.
Right now our dimensions are, for example, `50x42x71`. If we flatten each slice of the array, we get a single matrix with dimensions `2100x71`. This could make some analyses simpler, since every slice becomes an ordinary column that standard matrix operations can work on directly.
One caveat: to plot a slice, we'd need to convert its vector back to matrix form with `matrix(X, nrow = 50, ncol = 42)`. Just a suggestion, but it could make working with the data easier to understand.
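To make the idea concrete, here is a minimal sketch in base R. The array `pm25` is random stand-in data with the dimensions quoted above, and all variable names are made up for illustration; none of this is the package's actual code.

```r
# Hypothetical stand-in for the PM2.5 array: a 50 x 42 grid with 71 slices.
n_row <- 50
n_col <- 42
n_slices <- 71
pm25 <- array(runif(n_row * n_col * n_slices), dim = c(n_row, n_col, n_slices))

# Flatten: each 50 x 42 slice becomes one column of length 2100.
# R stores arrays in column-major order, so resetting dim() is enough.
pm25_flat <- pm25
dim(pm25_flat) <- c(n_row * n_col, n_slices)

# Per-slice summaries become plain column operations.
slice_means <- colMeans(pm25_flat)

# Restore slice k to matrix form, e.g. for plotting.
k <- 10
slice_k <- matrix(pm25_flat[, k], nrow = n_row, ncol = n_col)

# The round trip should reproduce the original slice exactly.
stopifnot(identical(slice_k, pm25[, , k]))
```

Because only the `dim` attribute changes, the flattened matrix holds the same values in the same order, and `dim(pm25_flat) <- c(n_row, n_col, n_slices)` would turn it back into the 3-D array.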
Jon says it's better if our arrays reflect the NetCDF structure of the data. Everyone thinks of this data as 3-D or 4-D, and we don't want to change that.