Doodleverse / holodoodler

HoloDoodler; semi-interactive image segmentation implemented using HoloViews
3 stars 1 forks source link

Making Pixel Values of GeoTIFF Images Compatible with HoloDoodler and doodler-engine's Segmentation Model #6

Closed venuswku closed 1 year ago

venuswku commented 1 year ago

I encountered the following warning and error while testing HoloDoodler with GeoTIFF files.

WARNING:param.RGBPlot01329: Clipping input data to the valid range for RGB data ([0..1] for floats or [0..255] for integers).

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

dbuscombe-usgs commented 1 year ago

Thanks for the detailed issue

Perhaps imagery should always be scaled to 8-bit (the way you implemented that seesm fine) and its dtype should be updated using .astype(np.uint8)

Basically, what you have done, but without the option for imagery to be 16-bit, 32-bit or 64-bit

If that happens, does the error message go away?

venuswku commented 1 year ago

Yes, the error message goes away if I only test HoloDoodler with imagery that are 8-bit. It works for another_geotiff.tif though (mentioned above), which is 16-bits, so I dug a little deeper out of curiosity.

Turns out that data type or bit size was not the cause of the value error! It was the image array that was passed into sklearn's validation function.

In _compute_segmentation() from .\doodler\components.py, self.input_image.array is passed into doodler-engine's segmentation() function:

self._segmentation = segmentation(
    img=self.input_image.array,
    mask=self._mask_doodles,
    crf_theta_slider_value=self.settings.as_dict()['crf_theta'],
    crf_mu_slider_value = self.settings.as_dict()['crf_mu'],
    rf_downsample_value = self.settings.as_dict()['rf_downsample_value'],
    crf_downsample_factor = self.settings.as_dict()['crf_downsample_factor'],
    n_sigmas = self.settings.as_dict()['n_sigmas'],
    multichannel = self.settings.as_dict()['multichannel'],
    intensity = self.settings.as_dict()['intensity'],
    edges = self.settings.as_dict()['edges'],
    texture = self.settings.as_dict()['texture'],
    sigma_min = self.settings.as_dict()['sigma_min'],
    sigma_max = self.settings.as_dict()['sigma_max']
)

This image array is eventually passed as an argument into _assert_all_finite() in sklearn\utils\validation.py, where the value error is raised:

def _assert_all_finite(X, allow_nan=False, msg_dtype=None):
    """Like assert_all_finite, but only for ndarray."""
    # validation is also imported in extmath
    from .extmath import _safe_accumulator_op

    if _get_config()["assume_finite"]:
        return
    X = np.asanyarray(X)
    # First try an O(n) time, O(1) space solution for the common case that
    # everything is finite; fall back to O(n) space np.isfinite to prevent
    # false positives from overflow in sum method. The sum is also calculated
    # safely to reduce dtype induced overflows.
    is_float = X.dtype.kind in "fc"
    if is_float and (np.isfinite(_safe_accumulator_op(np.sum, X))):
        pass
    elif is_float:
        msg_err = "Input contains {} or a value too large for {!r}."
        if (
            allow_nan
            and np.isinf(X).any()
            or not allow_nan
            and not np.isfinite(X).all()
        ):
            type_err = "infinity" if allow_nan else "NaN, infinity"
            raise ValueError(
                msg_err.format(
                    type_err, msg_dtype if msg_dtype is not None else X.dtype
                )
            )
    # for object dtype data, we only check for NaNs (GH-13254)
    elif X.dtype == np.dtype("object") and not allow_nan:
        if _object_dtype_isnan(X).any():
            raise ValueError("Input contains NaN")

I looked for where self.array was assigned in the InputImage class and realized that the scaled pixels array was never used for segmentation. The original image array (without scaling) for 2019-06-18-18-57-05_L8_KLAMATH_ms.tif contains some negative infinity values, causing the value error to be raised. self.array was already assigned to the original image array before I scaled the pixels:

@param.depends('location', watch=True)
def _load_image(self):
    if not self.location:
        self._plot = self._pane.object = hv.RGB(data=[])
        return
    self.array = array = self.read_from_fs(self.location)     # self.array was assigned here!

    # this is where we want to split the image array used for doodling
    # and the n-band array for segmentation
    if np.ndim(array) <=2:
        array = np.dstack((array,array,array))

    h, w, nbands = array.shape
    if nbands > 3:
        img = array[:, :, 0:3].copy()
    else:
        img = array.copy()

    # Make sure image array is within the range
    # [0, 255] for integers or [0, 1] for floats.
    if np.issubdtype(img.dtype, np.integer) and not (np.all(img >= 0) and np.all(img <= 255)):
        img = (img / np.amax(img) * 255).astype(np.uint8)
    elif np.issubdtype(img.dtype, np.floating) and not (np.all(img >= 0) and np.all(img <= 1)):
        # Infinity can only be represented as a float as of right now,
        # so we don't need the following two lines for scaling integers.
        img[img == float("-inf")] = float(0)
        img[img == float("inf")] = float(1)
        img = img / np.amax(img)

    # Preserve the aspect ratio
    self.img_bounds = (0, 0, w, h)
    self._plot = self._pane.object = hv.RGB(
        img, bounds=self.img_bounds
    ).opts(aspect=(w / h))

So I just simply assigned self.array after the pixel values were scaled. I even removed the .astype(np.uint8) code and segmentation still works for another_geotiff.tif. Looks like I didn't need to convert the pixel values into 8-bit integers for the image to be segmented. Should I still convert the scaled pixel values into np.uint8 just in case?

@param.depends('location', watch=True)
def _load_image(self):
    if not self.location:
        self._plot = self._pane.object = hv.RGB(data=[])
        return
    array = self.read_from_fs(self.location)

    # Create image with 3 bands for images with 2 or less bands.
    if np.ndim(array) <= 2:
        array = np.dstack((array,array,array))

    # Split the image array used for doodling and the n-band array for segmentation.
    h, w, nbands = array.shape
    if nbands > 3:
        img = array[:, :, 0:3].copy()
    else:
        img = array.copy()

    # Make sure image array is within the range
    # [0, 255] for integers or [0, 1] for floats.
    if np.issubdtype(img.dtype, np.integer) and not (np.all(img >= 0) and np.all(img <= 255)):
        # Convert the division results back to integers because division creates float results.
        img = (np.rint(img / np.amax(img) * 255)).astype(int)
    elif np.issubdtype(img.dtype, np.floating) and not (np.all(img >= 0) and np.all(img <= 1)):
        # Infinity can only be represented as a float as of right now,
        # so we don't need the following two lines for scaling integers.
        img[img == float("-inf")] = float(0)
        img[img == float("inf")] = float(1)
        img = (img / np.amax(img))

    # Set self.array after its pixel values have been scaled to the expected range.
    self.array = img

    # Preserve the aspect ratio.
    self.img_bounds = (0, 0, w, h)
    self._plot = self._pane.object = hv.RGB(
        img, bounds=self.img_bounds
    ).opts(aspect=(w / h))

Segmenting 2019-06-18-18-57-05_L8_KLAMATH_ms.tif (float32) works with no value errors now! float32_input_on_leafmap Here are my results: 20221024-141517.zip. Sorry, the segmentation is a bit off this time too because the image is so dark. I just realized the part where I marked water is actually land now that I see it on a map. Here are the image outputs in leafmap:

dbuscombe-usgs commented 1 year ago

Looks great! Thanks for digging further!

Should I still convert the scaled pixel values into np.uint8 just in case?

Perhaps doodle a 16-bit and a 32-bit input image, and if they work well, no need to assert a dtype of np.uint8

venuswku commented 1 year ago

I tested with 2019-06-18-18-57-05_L8_KLAMATH_ms.tif (float32) from above, 2021-10-29-18-57-40_L8_KLAMATH_ms.tif (float32), and another_geotiff.tif (uint16) from above. All 8-bit and 32-bit GeoTIFFs could be doodled and labeled so far, but I would like to test with some more images! How do I get 16-bit imagery? I read that Landsat 8-9 are 16-bit, but I'm struggling to download those satellite images.

Edit: Thanks to Dan's suggestion, I'm looking into converting the 32-bit images into 16-bit images with gdal or rasterio. Once I can confirm that my fix can scale 16-bit images, then my pull request can be merged.

venuswku commented 1 year ago

So I tried converting 32-bit (float) GeoTIFFs into 16-bit (unsigned integer) GeoTIFFs with GDAL's Python API:

from osgeo import gdal

input_file = "examples/images/2019-12-27-18-57-29_L8_KLAMATH_ms.tif"
input_dataset = gdal.Open(input_file)
total_bands = input_dataset.RasterCount
driver = gdal.GetDriverByName("GTiff")
output_dataset = driver.Create(
    "examples/images/16BIT_2019-12-27-18-57-29_L8_KLAMATH_ms.tif",
    xsize = input_dataset.RasterXSize,
    ysize = input_dataset.RasterYSize,
    bands = total_bands,
    eType = gdal.GDT_UInt16
)
output_dataset.SetProjection(input_dataset.GetProjection())
output_dataset.SetGeoTransform(input_dataset.GetGeoTransform())
for i in range(total_bands):
    input_band = input_dataset.GetRasterBand(i+1)
    arr = input_band.ReadAsArray()
    output_band = output_dataset.GetRasterBand(i+1)
    output_band.WriteArray(arr)
output_band.FlushCache()
input_dataset = None
output_dataset = None

and GDAL's command line:

gdal_translate -ot UInt16 -if GTiff -of GTiff "C:/Users/Venus/holodoodler/examples/images/2019-12-27-18-57-29_L8_KLAMATH_ms.tif" "C:/Users/Venus/holodoodler/examples/images/16BIT_2019-12-27-18-57-29_L8_KLAMATH_ms.tif"

but both ways outputted completely dark images (example output image), which are hard to segment. I remember reading a StackOverflow post saying that the image might be very dark if the pixel values are too big so that might be why the outputted images are hard to see. I'm not too familiar with GDAL though so please correct me if I used it wrong!

Instead of converting float32 GeoTIFFs, I slightly modified a GDAL script to create my own segmentable uint16 and float64 GeoTIFFs. An example script is shown below:

from osgeo import gdal, osr

# Initialize the Image Size
image_size = (500,500)
# Choose some Geographic Transform (Around Lake Tahoe)
lat = [39,38.5]
lon = [-120,-119.5]
# Create Each Channel
r_pixels = np.zeros((image_size), dtype=np.uint16)
g_pixels = np.zeros((image_size), dtype=np.uint16)
b_pixels = np.zeros((image_size), dtype=np.uint16)
# Set the Pixel Data (Create some boxes)
for x in range(0,image_size[0]):
    for y in range(0,image_size[1]):
        if x < image_size[0]/2 and y < image_size[1]/2:
            r_pixels[y,x] = 3000
        elif x >= image_size[0]/2 and y < image_size[1]/2:
            g_pixels[y,x] = 2150
        elif x < image_size[0]/2 and y >= image_size[1]/2:
            b_pixels[y,x] = 3894
        else:
            r_pixels[y,x] = 9038
            g_pixels[y,x] = 5843
            b_pixels[y,x] = 2954
# Set Geotransform
nx = image_size[0]
ny = image_size[1]
xmin, ymin, xmax, ymax = [min(lon), min(lat), max(lon), max(lat)]
xres = (xmax - xmin) / float(nx)
yres = (ymax - ymin) / float(ny)
geotransform = (xmin, xres, 0, ymax, 0, -yres)
# Create the 3-band Raster File
dst_ds = gdal.GetDriverByName('GTiff').Create('examples/images/custom_16_bit.tif', ny, nx, 3, gdal.GDT_UInt16)
dst_ds.SetGeoTransform(geotransform)    # specify coords
srs = osr.SpatialReference()            # establish encoding
srs.ImportFromEPSG(3857)                # WGS84 lat/long
dst_ds.SetProjection(srs.ExportToWkt()) # export coords to file
dst_ds.GetRasterBand(1).WriteArray(r_pixels)   # write r-band to the raster
dst_ds.GetRasterBand(2).WriteArray(g_pixels)   # write g-band to the raster
dst_ds.GetRasterBand(3).WriteArray(b_pixels)   # write b-band to the raster
dst_ds.FlushCache()                     # write to disk
dst_ds = None

Compared to dark images, the outputted image of the above script is easier to segment: custom_16_bit_geotiff I was able to doodle and segment the outputted images from this script. This means my scaling fix works for images in the most common numerical data types (uint8, uint16, float32, float64). My PR can be merged!

dbuscombe-usgs commented 1 year ago

Excellent stuff!