wincowgerDEV / OpenSpecy-package

Analyze, Process, Identify, and Share, Raman and (FT)IR Spectra
http://wincowger.com/OpenSpecy-package/
Creative Commons Attribution 4.0 International
23 stars 11 forks source link

[Feature]: Smoothing Maps #165

Closed wincowgerDEV closed 5 months ago

wincowgerDEV commented 7 months ago

Guidelines

Description

I have been thinking about better ways to do map analysis recently. I think the way we are identifying particles with the signal/noise is pretty unique. I have been concerned though that noise isn't being well removed and some small particles are being totally lost due to them having just a few strong pixels. I think that smoothing of the spectra spatially and/or smoothing of the signal/noise spatially could help with this.

I found this package that has a matrix smoother we could use: https://search.r-project.org/CRAN/refmans/oce/html/matrixSmooth.html

Source code is in cpp: https://github.com/dankelley/oce/blob/develop/src/matrix_smooth.cpp

Problem

Spectra are spatially highly variable.

Proposed Solution

Spatial spectral smoothing or signal/noise smoothing spatially.

Alternatives Considered

current status quo is raw feature analysis.

wincowgerDEV commented 7 months ago

Some other options than pure mean are gaussian smooth: https://search.r-project.org/CRAN/refmans/soundgen/html/gaussianSmooth2D.html

This mmand package is pretty universal, could replace imagr: https://github.com/jonclayden/mmand?tab=readme-ov-file#skeletonisation https://rdrr.io/cran/mmand/man/gaussianSmooth.html https://github.com/jonclayden/mmand

wincowgerDEV commented 7 months ago

Put a little work into the mmand option, the gaussian smooth strategy looks the best so far.

#' @rdname def_features
#'
#' @importFrom imager label as.cimg
#' @importFrom data.table as.data.table setDT rbindlist data.table
#' @export
def_features.OpenSpecy <- function(x, features, ...) {
    if(is.logical(features)) {
        if(all(features) | all(!features))
            stop("features cannot be all TRUE or FALSE because that would indicate ",
                 "that there are no distinct features", call. = F)

        features_df <- .def_features(x, features)
    } else if(is.character(features)) {
        if(length(unique(features)) == 1)
            stop("features cannot all have a single name because that would ",
                 "indicate that there are no distinct features", call. = F)

        features_df <- rbindlist(lapply(unique(features),
                                        function(y) .def_features(x, features == y))
        )
    } else {
        stop("features needs to be a character or logical vector", call. = F)
    }

    obj <- x
    x <- y <- feature_id <- NULL # workaround for data.table non-standard
    # evaluation
    md <- features_df[setDT(obj$metadata), on = c("x", "y")]
    md[, feature_id := ifelse(is.na(feature_id), "-88", feature_id)]
    md[, "centroid_x" := mean(x), by = "feature_id"]
    md[, "centroid_y" := mean(y), by = "feature_id"]

    obj$metadata <- md

    return(obj)
}

#' @importFrom grDevices chull
#' @importFrom stats dist
.def_features <- function(x, binary, name = NULL) {
    # Label connected components in the binary image
    binary_matrix <- matrix(binary, ncol = max(x$metadata$x) + 1, byrow = T)
    labeled_image <- imager::label(imager::as.cimg(binary_matrix),
                                   high_connectivity = T)

    # Create a dataframe with feature IDs for each true pixel
    feature_points_dt <- data.table(x = x$metadata$x,
                                    y = x$metadata$y,
                                    feature_id = ifelse(binary_matrix,
                                                        labeled_image, -88) |>
                                        t() |> as.vector() |> as.character())

    # Apply the logic to clean components
    cleaned_components <- ifelse(binary_matrix, labeled_image, -88)

    # Calculate the convex hull for each feature
    convex_hulls <- lapply(
        split(
            as.data.frame(which(cleaned_components >= 0, arr.ind = TRUE)),
            cleaned_components[cleaned_components >= 0]
        ),
        function(coords) {coords[unique(chull(coords[,2], coords[,1])),]
        })

    # Calculate area, Feret max, and feature IDs for each feature
    features_dt <- rbindlist(lapply(seq_along(convex_hulls), function(i) {
        hull <- convex_hulls[[i]]
        id <- names(convex_hulls)[i]
        if(nrow(hull) == 1)
            return(data.table(feature_id = id,
                              area = 1,
                              perimeter = 4,
                              feret_min = 1,
                              feret_max = 1)
            )

        # Calculate Feret dimensions
        dist_matrix <- as.matrix(dist(hull))
        feret_max <- max(dist_matrix) + 1

        perimeter <- 0
        cols = 1:nrow(hull)
        rows = c(2:nrow(hull), 1)
        for (j in 1:length(cols)) {
            # Fetch the distance from the distance matrix
            perimeter <- perimeter + dist_matrix[rows[j], cols[j]]
        }

        # Area
        area <- sum(cleaned_components == as.integer(id))

        feret_min = area/feret_max #Can probably calculate this better.

        data.table(feature_id = id,
                   area = area,
                   perimeter = perimeter,
                   feret_min = feret_min,
                   feret_max = feret_max
        )
    }), fill = T)

    # Join with the coordinates from the binary image
    if (!is.null(name)) {
        features_dt$feature_id <- paste0(name, "_", features_dt$feature_id)
    }

    feature_points_dt[features_dt, on = "feature_id"]
}

map <- read_extdata("CA_tiny_map.zip") |> read_any()

signal <- sig_noise(map, metric = "sig_times_noise")

heatmap_spec(map, z = signal)

signal_matrix <- matrix(signal, ncol = max(map$metadata$x) + 1, byrow = T)

smooth_signal_matrix <- gaussianSmooth(signal_matrix, sigma = c(1,1))
display(signal_matrix)
display(smooth_signal_matrix)
k <- shapeKernel(c(3,3), type="box")
display(medianFilter(signal_matrix, k))

smooth_thresholded <- threshold(smooth_signal_matrix, method = "kmeans")
display(smooth_thresholded, col=rainbow(max(signal_components,na.rm=TRUE)))

k <- shapeKernel(c(3,3), type="box")
smooth_components <- components(smooth_thresholded, k)

display(smooth_components, col=rainbow(max(smooth_components,na.rm=TRUE)))

library(loder)
library(mmand)
B <- readPng(system.file("images", "B.png", package="mmand"))
k <- shapeKernel(c(3,3), type="diamond")

display(B)
display(skeletonise(B,k,method="lantuejoul"), col="red", add=TRUE)
wincowgerDEV commented 5 months ago

Available in 75411f12f685fe4a62e26cfbb239bb6eca7bba9d for both 2d and 3d