NREL / nsrdb

This repository contains all of the methods for the NSRDB data processing pipeline.
https://nrel.github.io/nsrdb/
BSD 3-Clause "New" or "Revised" License
5 stars 0 forks source link

"Shark's teeth" artifacts in composite albedo data #43

Closed mikebannis closed 2 years ago

mikebannis commented 2 years ago

Bug Description The current system for mapping IMS snow data to the MODIS snow free albedo to create a composite albedo has a flaw that sometimes results in the erroneous assignment of "snowy" or "dry" along snow/no-snow boundaries. These artifacts resemble triangles or shark's teeth and are most noticeable above roughly 60N latitude.

A nearest-neighbor approach is used to find the IMS data that is closest to a given MODIS raster cell. A consequence of the polar projection is each raster cells has completely unique lat and longs, requiring a large KD tree to map IMS data to the MODIS data, which is stored as a more common WGS-84 based raster format. Making a KD tree from the full IMS data set is impractical from both a memory and time perspective. Fortunately, for the purposes of making the composite albedo, we don't care which IMS point is nearest but only if it's snowy or dry. Further, there are typically large snowy and dry areas in the IMS data. We only need to know if a modis point is in a snowy or dry area. The existence of large snowy and dry areas can be used to dramatically reduce the number of IMS points needed to classify any MODIS cell as snowy or dry.

The IMS snow data is saved in a polar projection as shown below from day 190 of 1998. The north pole is in the center of the image, with the Himalayas towards the top in yellow, and the Alaskan and Canadian pacific coast towards the bottom left. Yellow cells are snow, orange is sea ice, and blue is snow and ice free. For the NSRDB, snow and sea ice are considered equivalent. image

Below is the same image zoomed and centered on Greenland. image

The scipy.ndimage.morphology.binary_dilation() function is used to find the boundaries of the snow and dry regions. Example output is shown below, blue indicates the boundary. image

The edge layer indicates the boundary between snowy and dry, but does not provide enough information to classify an arbitrary location as snowy or dry. To include more information, the edge is buffered using the binary_dilation() function. This expands the edge from one pixel wide to three as shown below, yellow indicates the boundary pixels (please ignore the change in color scheme). image

In theory, the buffered boundary should contain enough information to accurately classify any location as snowy or dry. Viewing the composite data shows a number of "shark's teeth" at higher latitudes as shown in the image below of Greenland. White indicates snow from the IMS data. Black or gray areas are considered dry and show the higher resolution data from MODIS. Examples of shark's teeth are circled. image

This occurs due to the mismatch between the two projections. When "unwrapped", it becomes apparent that the IMS cell density near the north pole is much less than that near the equator. The image below shows the composite albedo overlayed with the IMS points, with shape and color indicating snow, sea ice, or dry. Note the low point density near the north pole. image

The image below shows the selected IMS points in the polar projection after the buffer. Yellow are snowy points, maroon are dry points, and blue points are considered unnecessary for building the KD tree. Looking at the selected points in a polar projection is appears there is enough information to define arbitrary point as snowy or dry. image

The below image shows the selected points unwrapped to WGS 84 and overlaid on the composite albedo. Note that at locations of the shark's teeth there appear to be missing yellow IMS snow dots, effectively "exposing" the pink dry dots, and creating the shark's teeth. While the selected points for the KD tree appears adequate in the polar projection, when converted to WGS 84 it is apparent that more points are needed to accurately define the snow/no-snow boundary. This is generally only a probably at higher latitudes. At lower latitudes the density and general orientation of points is more consistent between the two projections. image

mikebannis commented 2 years ago

Solution

An obvious solution is to use more points to build the KD tree by increasing the number of iterations in the binary_dilation() step. Spot tests showed that 5 iterations appear to be adequate to remove the shark's teeth when viewed visually in QGIS. A per cell comparison using GDAL also showed no difference in the composite albedo between 5 and 6 iterations (for 24km IMS), indicating that 5 iterations is adequate. It was decided to use 6 iterations in the NSRDB code to be conservative. Memory usage and run times are acceptable for 90GB eagle nodes. Below are example albedo results for 24km, 4km, and 1km IMS data. The original albedo with 1 iteration is shown first, with 6 iterations shown underneath. All images are centered on Greenland.

24km - 1999/190 image image

4km - 2006/180 image image

1km - 2017/180 image image

1km - 2017/180 (zoomed) image image

grantbuster commented 2 years ago

closed in #39