Closed trey-stafford closed 1 year ago
Datasets from this project (https://nsidc.org/grimp) include really nice ice velocity raster data. Including these, or replacing current layers, is primarily a question of file-size options and ease of working with these data. (Though that team has been pretty attentive to data stuff recently, so I would hope it is pretty accessible/usable.) This GrIMP option is preferable to the CCI data option that Trey links to.
Looks like GRIMP has a few dataset options at different temporal resolutions:
Our current CCI velocity layers are for 2019-2020.
What would it look like to include the GRIMP annual mosaics re: changes in project file size? And perhaps 1 year (2022?) with quarterly or monthly mosaics?
At the full 200m resolution, the annual mosaics from 2014-2021 take up ~1.3G of disk space. The package size with these layers is 4.1G.
To clarify, the above refers only to the magnitude of velocity. There are also data available for the `vx` and `vy` components, as well as error estimates (`ex` and `ey`). Moreover, there are shapefiles which indicate the data source used to derive the velocity estimates.
I have not looked into adding those yet, but we have added error estimates for other datasets that include them. I would guess those fields would probably add a similar amount of disk usage.
We could consider downsampling to save some space (native resolution is 200m)
> At the full 200m resolution, the annual mosaics from 2014-2021 take up ~1.3G of disk space

Each, or total? If the latter, divided by 8 years, that's below 200MB for just 2021.
We could also vectorize the data to combine the magnitude, vx, and vy components into one layer, but then we have to worry about symbology a lot more.
> Each, or total? If the latter, divided by 8 years, that's below 200MB for just 2021.

Yes, in total.
> We could also vectorize the data to combine the magnitude, vx, and vy components into one layer, but then we have to worry about symbology a lot more.
Hmm, contours might make sense for visualization but they would look different for each component. The alternative would be polygons or points representing the center of each cell. Not really very useful for analysis (well, it could be depending on application), although I suppose a user could generate a raster from such a point dataset.
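For reference, deriving a single direction layer from the components is straightforward arithmetic; a minimal NumPy sketch with made-up values (the real inputs would be the GrIMP `vx`/`vy` rasters):

```python
import numpy as np

# Toy vx/vy component values (m/y); real data would come from the GrIMP rasters.
vx = np.array([3.0, 0.0], dtype=np.float32)
vy = np.array([4.0, -2.0], dtype=np.float32)

speed = np.hypot(vx, vy)                    # velocity magnitude (m/y)
direction = np.degrees(np.arctan2(vy, vx))  # flow direction, degrees CCW from +x axis

# speed -> [5.0, 2.0]
```

A point/arrow layer symbolized by `direction` and scaled by `speed` is essentially what the wind-vector layers already do.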
Another option would be to scale the data and store it as integers like we do for Arctic DEM (not currently in the core package).
The source data is Float32, with values that look like e.g., `2.8341431617736816` (m/y). Unclear from preliminary review of the user guide what the actual precision of the data is (how many digits are significant?).
Just to try out the idea, I rounded to 2 decimals and converted to an integer with a scale factor of 0.01 (e.g., `2.8341431617736816` is rounded to `2.83` and stored as the integer `283`, with metadata that lets QGIS know that it should interpret `283` as `2.83`).
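The round-trip amounts to something like this sketch (pure NumPy; in the real files the scale factor would be stored as GDAL band metadata rather than handled in code):

```python
import numpy as np

SCALE = 0.01  # stored integer * SCALE = velocity in m/y

def pack(values, scale=SCALE):
    # Round to 2 decimals and store as Int32. (Int16 would be smaller but
    # overflows above 327.67 m/y at this scale, which fast outlet glaciers exceed.)
    return np.round(np.asarray(values) / scale).astype(np.int32)

def unpack(ints, scale=SCALE):
    # What QGIS effectively does on read when the scale metadata is set.
    return ints.astype(np.float32) * scale

packed = pack([2.8341431617736816])  # -> array([283], dtype=int32)
restored = unpack(packed)            # -> approximately [2.83]
```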
This reduces the size of the `vv` layers from ~1.3G to ~565M (~82M/layer), and the package size from 4.1G to 3.4G, vs storing the full-resolution Float32 data.
According to the user guide (emphasis mine):
> While the data are posted at 200 m, the true resolution varies between a few hundred meters to 1.5 km. **Posting represents the spacing between samples and should not be confused with the resolution at which the data were collected.** Many small glaciers are resolved outside the main ice sheet, but for narrow (<1 km) glaciers, the velocity represents an average of both moving ice and stationary rock. As a result, while the glacier may be visible in the map, the actual speed may be underestimated. For smaller glaciers, interpolation produces artifacts where the interpolated value is derived from nearby rock, causing apparent stationary regions in the middle of otherwise active flow. The data have been screened to remove most of these artifacts, but should be used with caution.
So I tried downsampling to 1.5km resolution. Combined with the scaling described above, the total data volume is reduced to 16M (~2.2M/layer). The zip package size is ~2.9G.
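In practice the resampling would be done with something like `gdalwarp -tr 1500 1500 -r average in.tif out.tif`; the averaging itself is equivalent to this NumPy sketch (using an integer block factor for simplicity, even though 200m to 1.5km is really a 7.5x change):

```python
import numpy as np

def block_mean(arr, f):
    """Downsample a 2-D raster by averaging f x f blocks (edge remainders cropped)."""
    h, w = (arr.shape[0] // f) * f, (arr.shape[1] // f) * f
    return arr[:h, :w].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

a = np.arange(16, dtype=np.float32).reshape(4, 4)
b = block_mean(a, 2)  # b[0, 0] is the mean of the top-left 2x2 block: (0+1+4+5)/4 = 2.5
```

Given the user guide's note that the true resolution is up to 1.5km anyway, averaging like this mostly discards oversampling rather than real detail.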
> the total data volume is reduced to 16M (~2.2M/layer)
:pinching_hand:
With quarterly `vv` mosaics added, the total GrIMP data volume increases to ~23M (again, with scaling & downsampling to 1.5km). Total package size stays around 2.9G.
Putting this on hold until we can chat about how to proceed as a team on Monday. Key items to address:

- Just add 2021 from this data source to get people using GrIMP. Add x & y components with magnitude. With rounding/scaling.
- Consider something like wind vectors: one magnitude map (raster) & one direction map (vector).
- Let's keep thinking about adding the error components.
We have a preprocessing script that creates the wind vectors from RACMO data. We would need to adapt this to run as a command step in order to use it for the GrIMP velocity data. I think we could approach that either by adapting the current code to take CLI args and executing it as a normal `CommandStep`, or by developing some other support for running Python functions on data.
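The CLI-args route could look roughly like this. This is a hypothetical sketch: the flag names and the way `CommandStep` would invoke it are assumptions, not the actual preprocessing code.

```python
import argparse

def build_parser():
    # Hypothetical CLI wrapper around the existing preprocessing function,
    # so the script could be invoked as a normal CommandStep.
    # Flag names are illustrative only.
    p = argparse.ArgumentParser(
        description="Build velocity vectors from component rasters"
    )
    p.add_argument("--vx", required=True, help="path to x-component raster")
    p.add_argument("--vy", required=True, help="path to y-component raster")
    p.add_argument("--out", required=True, help="path for the output vector layer")
    return p

args = build_parser().parse_args(
    ["--vx", "vx.tif", "--vy", "vy.tif", "--out", "vectors.gpkg"]
)
# ...then hand args.vx / args.vy / args.out to the existing preprocessing function.
```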
For now, I am adding the annual `vv`, `vx`, and `vy` layers for 2021 as rasters; they contribute ~235M.
> think we could approach that by either adapting the current code to take CLI args and execute it as a normal CommandStep or develop some other support for running python functions on data.
Feels like we shouldn't try to fit this in before v3 :)
A dataset that includes data from 2018-2021. This has a longer temporal range than our current layer (2019-2020) and a higher spatial resolution (100m instead of 250m), but coverage is not complete over Greenland.
Is there another, more up-to-date dataset that's worth considering?