nsidc / qgreenland

Source code for generating the QGreenland package hosted at https://qgreenland.org/
https://qgreenland.readthedocs.io

update ice velocity data #663

Closed trey-stafford closed 1 year ago

trey-stafford commented 1 year ago

A dataset that includes data from 2018-2021. It has a longer temporal range than our current layer (2019-2020) and a higher spatial resolution (100m instead of 250m), but its coverage of Greenland is not complete.

Is there another, more up-to-date dataset that's worth considering?

twilamoon-science commented 1 year ago

Datasets from this project (https://nsidc.org/grimp) include really nice ice velocity raster data. Including these or replacing the current layers might primarily be a question of file size and ease of working with these data. (Though that team has been pretty attentive to data stuff recently, so I would hope it is pretty accessible/usable.) This GRIMP option is preferable to the CCI data option that Trey links to.

trey-stafford commented 1 year ago

Looks like GRIMP has a few dataset options at different temporal resolutions:

Our current CCI velocity layers are for 2019-2020.

twilamoon-science commented 1 year ago

What would it look like to include the GRIMP annual mosaics re: changes in project file size? And perhaps 1 year (2022?) with quarterly or monthly mosaics?

trey-stafford commented 1 year ago

At the full 200m resolution, the annual mosaics from 2014-2021 take up ~1.3G of disk space. The package size with these layers is 4.1G.

trey-stafford commented 1 year ago

To clarify, the above refers only to the magnitude of velocity. There are also data available for the vx and vy components, as well as error estimates (ex and ey). Moreover, there are shapefiles that indicate the data source used to derive the velocity estimates.

I have not looked into adding those yet, but we have added error estimates for other datasets that include them. I would guess those fields would probably add a similar amount of disk usage.

We could consider downsampling to save some space (native resolution is 200m).

MattF-NSIDC commented 1 year ago

At the full 200m resolution, the annual mosaics from 2014-2021 take up ~1.3G of disk space

Each, or total? If the latter, divided by 8 years, that's below 200MB for just 2021.

We could also vectorize the data to combine the magnitude, vx, and vy components into one layer, but then we have to worry about symbology a lot more.

trey-stafford commented 1 year ago

Each, or total? If the latter, divided by 8 years, that's below 200MB for just 2021.

Yes, in total.

We could also vectorize the data to combine the magnitude, vx, and vy components into one layer, but then we have to worry about symbology a lot more.

Hmm, contours might make sense for visualization, but they would look different for each component. The alternative would be polygons or points representing the center of each cell. That's not very useful for analysis (well, it could be, depending on the application), although I suppose a user could generate a raster from such a point dataset.
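For reference, generating such a cell-center point layer could look roughly like the sketch below. This is a hypothetical illustration only (not QGreenland pipeline code), assuming rasterio, shapely, and geopandas are available, that the three rasters share a grid, and that nodata is NaN; the file names are placeholders.

```python
# Hypothetical sketch: one point feature per grid cell, carrying vv/vx/vy as
# attributes. Assumes the vv/vx/vy rasters are on the same grid and that
# missing data are NaN (otherwise, mask on the nodata value instead).
import numpy as np
import rasterio
import geopandas as gpd
from rasterio.transform import xy
from shapely.geometry import Point

with rasterio.open("vv.tif") as vv_src, \
     rasterio.open("vx.tif") as vx_src, \
     rasterio.open("vy.tif") as vy_src:
    vv = vv_src.read(1)
    vx = vx_src.read(1)
    vy = vy_src.read(1)
    transform = vv_src.transform
    crs = vv_src.crs

# Keep only cells with valid velocity values.
rows, cols = np.where(np.isfinite(vv))
xs, ys = xy(transform, rows, cols)  # cell-center coordinates

gdf = gpd.GeoDataFrame(
    {"vv": vv[rows, cols], "vx": vx[rows, cols], "vy": vy[rows, cols]},
    geometry=[Point(x, y) for x, y in zip(xs, ys)],
    crs=crs,
)
gdf.to_file("velocity_points.gpkg", driver="GPKG")
```

At the full 200m posting this would produce a very large number of points, so it would probably only be practical on a heavily downsampled grid.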

trey-stafford commented 1 year ago

Another option would be to scale the data and store it as integers like we do for Arctic DEM (not currently in the core package).

The source data are Float32, with values like 2.8341431617736816 (m/yr). It's unclear from a preliminary review of the user guide what the actual precision of the data is (how many digits are significant?).

Just to try out the idea, I rounded to 2 decimals and converted to an integer with a scale factor of 0.01 (e.g., 2.8341431617736816 is rounded to 2.83 and stored as the integer 283 with metadata that lets QGIS know that it should interpret 283 as 2.83).

This reduces the size of the vv layers from ~1.3G to ~565M (~82M/layer) and the package size from 4.1G to 3.4G vs storing the full-resolution Float32 data.
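For concreteness, the rounding/scaling transformation looks roughly like the sketch below. It's a sketch only, assuming rasterio (and that its `scales` attribute is honored on write); the paths and fill value are placeholders rather than the exact code used.

```python
# Rough sketch of the rounding/scaling idea: round Float32 velocity magnitudes
# to 2 decimals and store them as Int32 with a 0.01 scale factor.
import numpy as np
import rasterio

SCALE = 0.01   # 283 stored on disk means 2.83 m/yr
FILL = -9999   # integer nodata sentinel (placeholder)

with rasterio.open("vv_2021.tif") as src:  # Float32 source (placeholder path)
    data = src.read(1)
    profile = src.profile

# Round to the nearest 0.01 m/yr and store the result as integers,
# e.g. 2.8341431617736816 -> 283. Assumes missing data are NaN.
scaled = np.where(np.isfinite(data), np.round(data / SCALE), FILL).astype("int32")

profile.update(dtype="int32", nodata=FILL)
with rasterio.open("vv_2021_int.tif", "w", **profile) as dst:
    dst.write(scaled, 1)
    # Record the scale factor so GDAL/QGIS interpret 283 as 2.83.
    dst.scales = (SCALE,)
```

Since Int32 and Float32 are both 4 bytes per pixel, most of the savings presumably come from the compressed GeoTIFF encoding rounded integers much more compactly than full-precision floats.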

trey-stafford commented 1 year ago

According to the user guide (emphasis mine):

While the data are posted at 200 m, the true resolution varies between a few hundred meters to 1.5 km. Posting represents the spacing between samples and should not be confused with the resolution at which the data were collected. Many small glaciers are resolved outside the main ice sheet, but for narrow (<1 km) glaciers, the velocity represents an average of both moving ice and stationary rock. As a result, while the glacier may be visible in the map, the actual speed may be underestimated. For smaller glaciers, interpolation produces artifacts where the interpolated value is derived from nearby rock, causing apparent stationary regions in the middle of otherwise active flow. The data have been screened to remove most of these artifacts, but should be used with caution.

So I tried downsampling to 1.5km resolution. Combined with the scaling described above, the total data volume is reduced to 16M (~2.2M/layer). The zip package size is ~2.9G.
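The downsampling step could be something like the following, using GDAL's Python bindings with average resampling. The 1500m target follows the user guide's "true resolution" note; the paths and creation options are assumptions for illustration, not the exact command used.

```python
# Possible downsampling step: resample the (already scaled) raster to a
# 1500 m grid using average resampling. Paths are placeholders.
from osgeo import gdal

gdal.UseExceptions()

ds = gdal.Warp(
    "vv_2021_1500m.tif",
    "vv_2021_int.tif",
    xRes=1500,
    yRes=1500,
    resampleAlg="average",
    creationOptions=["COMPRESS=DEFLATE"],
)
ds = None  # close/flush the output dataset
```

Order matters a little: averaging the Float32 data first and scaling to integers afterwards avoids compounding rounding error, though at 0.01 m/yr precision the difference is probably negligible. The equivalent gdalwarp CLI invocation would presumably work just as well as a command step in the pipeline.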

trey-stafford commented 1 year ago

#598 is also related. It links to a repo with a QGIS project file that references GrIMP internet-hosted data, but that doesn't seem practical for use in QGreenland. See: https://github.com/nsidc/qgreenland/issues/598#issuecomment-1684359924

MattF-NSIDC commented 1 year ago

the total data volume is reduced to 16M (~2.2M / layer)

:pinching_hand:

trey-stafford commented 1 year ago

With quarterly vv mosaics added, the total GrIMP data volume increases to ~23M (again, with scaling & downsampling to 1.5km). Total package size stays around 2.9G.

trey-stafford commented 1 year ago

Putting this on hold until we can chat about how to proceed as a team on Monday. Key items to address:

trey-stafford commented 1 year ago

Just add 2021 from this data source to get people using GrIMP. Add the x & y components along with magnitude, with rounding/scaling applied.

Consider something like wind vectors. One magnitude map (raster) & one vector direction map (vector).

Let's keep thinking about adding the error components.
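For the wind-vector-style idea, the direction component could be derived from vx/vy with arctan2, something like the rough sketch below. This is not the existing RACMO preprocessing script, just an illustration with placeholder paths.

```python
# Sketch: derive a flow-direction grid from the vx/vy components.
# Assumes the two rasters share a grid; paths are placeholders.
import numpy as np
import rasterio

with rasterio.open("vx_2021.tif") as vx_src, rasterio.open("vy_2021.tif") as vy_src:
    vx = vx_src.read(1)
    vy = vy_src.read(1)
    profile = vx_src.profile

# Direction of flow in degrees, counter-clockwise from the grid's +x axis.
direction = np.degrees(np.arctan2(vy, vx)).astype("float32")

with rasterio.open("flow_direction_2021.tif", "w", **profile) as dst:
    dst.write(direction, 1)
```

For an arrow layer like the RACMO wind vectors, the direction (and magnitude) would then probably be sampled onto a coarsened point grid so the arrows stay legible.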

trey-stafford commented 1 year ago

We have a preprocessing script that creates the wind vectors from RACMO data. We would need to adapt this to be run as a command step to utilize it for the GrIMP velocity data. I think we could approach that either by adapting the current code to take CLI args and executing it as a normal CommandStep, or by developing some other support for running Python functions on data.

For now, I am adding the annual vv, vx, and vy layers for 2021 as rasters and they contribute ~235M.
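For the record, the "adapt the current code to take CLI args" option could look roughly like the wrapper below, so the pipeline could invoke it like any other command. The module layout and the make_velocity_vectors function here are hypothetical, not existing QGreenland code.

```python
# Hypothetical CLI wrapper: expose a processing function as a command so it
# could be run like any other command step in the pipeline.
import argparse
from pathlib import Path


def make_velocity_vectors(vx_path: Path, vy_path: Path, output_path: Path) -> None:
    """Placeholder for the adapted wind-vector-style preprocessing logic."""
    raise NotImplementedError


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Create a velocity vector layer from vx/vy rasters.",
    )
    parser.add_argument("vx", type=Path, help="Path to the vx raster")
    parser.add_argument("vy", type=Path, help="Path to the vy raster")
    parser.add_argument("output", type=Path, help="Path for the output vector layer")
    args = parser.parse_args()
    make_velocity_vectors(args.vx, args.vy, args.output)


if __name__ == "__main__":
    main()
```

A CommandStep could then presumably call this script the same way it calls the GDAL utilities.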

MattF-NSIDC commented 1 year ago

think we could approach that by either adapting the current code to take CLI args and execute it as a normal CommandStep or develop some other support for running python functions on data.

Feels like we shouldn't try to fit this in before v3 :)