ailich / GLCMTextures

This R package calculates the most common gray-level co-occurrence matrix (GLCM) texture metrics used for spatial analysis on raster data.
https://ailich.github.io/GLCMTextures/
GNU General Public License v3.0
12 stars 4 forks source link

Question about the possibility of adding Sum Average texture #26

Closed filrat2 closed 9 months ago

filrat2 commented 9 months ago

Hello, thank you for your hard work on this package. Would it be possible to add more of the textures suggested by Haralick et al.? In my case, I need a Sum Average texture, but no R package currently offers this feature.

ailich commented 9 months ago

Hi @filrat2. I only implemented measures from Dr. Mryka Hall-Beyer’s GLCM texture tutorial since it allowed me to check if my calculations were correct. The notation in the original Haralick paper is a bit different and I'm not quite sure how px+y(k) and px+y(i) fit in. Would you be able to provide R code showing what "Sum Average" does?

For example R code just use a single test matrix like below. If we get the correct formula, I should be able to set up the iteration across the raster fairly easily. Below I have an example for GLCM Mean.

Screenshot 2023-11-13 164525

library(GLCMTextures)
#> Loading required package: terra
#> Warning: package 'terra' was built under R version 4.3.1
#> terra 1.7.47
n_levels<- 4
test_matrix<- matrix(data=c(2,0,1,3,0,0,0,3,2), nrow = 3, ncol=3)
test_matrix
#>      [,1] [,2] [,3]
#> [1,]    2    3    0
#> [2,]    0    0    3
#> [3,]    1    0    2

P_ij<- make_glcm(test_matrix, n_levels = n_levels, shift = c(1,0)) # GLCM with horizontal neighbors

#Calculate GLCM Mean With Loop
glcm_mean1<- 0 #initialize at zero
for (i in 0:(n_levels-1)) {
  for (j in 0:(n_levels-1)) {
    glcm_mean1<- glcm_mean1 + (i*P_ij[i+1,j+1])
  }
}
glcm_mean1
#> [1] 1.166667

#Calculate GLCM Mean Without Loop
i_mat<- matrix(data= 0:(nrow(P_ij)-1), nrow= nrow(P_ij), ncol=ncol(P_ij), byrow=FALSE)
glcm_mean2<- sum(i_mat*P_ij)
glcm_mean2
#> [1] 1.166667

Created on 2023-11-13 with reprex v2.0.2

filrat2 commented 9 months ago

I'm not good enough to show how Sum Average should work. However, textures were already implemented in R, but these packages have not been supported in years (I found radiomics and wvtools). It is still possible to see the code of those packages on GitHub. For example,  you can check radiomics SumAverage algorithm here.

#' @describeIn glcm_features Sum Average
#' 
glcm_sumAverage <- function(glcm){
  sum <- 0
  for(i in 1:(2*nrow(glcm))){
    #bit of a sticky situation with the extra 1s....
    sum <- sum + (i+1)*glcm_xplusy(glcm, i-1)
  }
  return(sum)
}

This texture was also used in open-source GIS programs like GRASS GIS and Orfeo ToolBox. This software's source code is also available on GitHub. Here you can check the source code of the GRASS GIS r.texture algorithm.

/* Sum Average */
float f6_savg(void)
{
    int i, j, k;
    float savg = 0;
    float *P = Pxpys;

    /*
       for (i = 0; i < 2 * Ng - 1; i++)
       savg += (i + 2) * Pxpys[i];
     */

    for (i = 0; i < Ng; i++) {
        for (j = 0; j < Ng; j++) {
            k = i + j;
            savg += (tone[i] + tone[j]) * P[k];
        }
    }

    return savg;
}

Hopefully it will be helpful for you.

ailich commented 9 months ago

I think I understand Sum Average now, but I want to confirm that this follows your understanding of what the measure should do. I conducted these steps.

  1. Tabulate the GLCM and normalize to probabilities
  2. From the GLCM tabulate how frequently the grey levels of the focal and neighboring pixel sum to a certain value (k_prob)
  3. Multiply these probabilities (k_prob) by the corresponding sum (k) and then sum the results

@filrat2, does this seem correct to you? I consulted the original paper and this link for determining this procedure.

library(GLCMTextures)
#> Loading required package: terra
#> Warning: package 'terra' was built under R version 4.3.1
#> terra 1.7.47

n_levels<- 4
test_matrix<- matrix(data=c(2,0,1,3,0,0,0,3,2), nrow = 3, ncol=3)
test_matrix
#>      [,1] [,2] [,3]
#> [1,]    2    3    0
#> [2,]    0    0    3
#> [3,]    1    0    2

P_ij<- make_glcm(test_matrix, n_levels = n_levels, shift = c(1,0)) # GLCM with horizontal neighbors

# i = the row-wise index of the GLCM (equal to gray level of reference cell)
# j = column indices of the GLCM matrix (equal to gray level of neighboring cell)
# k = i+j
# k_prob is the probability that i and j sum to that value of k

k_prob<- rep(0, 2*n_levels-1) #Initialize at zero
names(k_prob)<- 0:(2*(n_levels-1))
for (R in 1:nrow(P_ij)) {
  for (C in 1:ncol(P_ij)) {
    i<- R-1
    j<- C-1
    k<- i+j
    k_prob[k+1]<- k_prob[k+1] + P_ij[R,C] #add probability to corresponding spot in k_prob
  } # i and j start at 0, but R is 1 indexed so have +/- 1 in certain spots
}

k_prob
#>         0         1         2         3         4         5         6 
#> 0.1666667 0.1666667 0.1666667 0.3333333 0.0000000 0.1666667 0.0000000
k<- 0:((n_levels-1)*2)

SumAverage<- sum(k*k_prob)
SumAverage
#> [1] 2.333333

Created on 2023-11-15 with reprex v2.0.2

filrat2 commented 9 months ago

To be honest, it looks good and logical, but I can't assess if this works according to the method proposed by Haralick in his paper. It is hard to say because we do not have any examples of using this formula with input and output data to compare results (something like examples from Dr. Mryka Hall-Beyer’s tutorial). I don't have the knowledge to assess the work of your code. I know only examples of applications of this texture (from papers) and software where those methods have already been implemented.

ailich commented 9 months ago

@filrat2, I compared with radiomics and I get a different result for sum average, but I believe it is due to a bug in their code. My calculations match the result of their function glcm_xplusy. Looking at the internals of their function to calculate sum average, I notices they loop through a vector of the incorrect length and multiply by 1 greater than the grey value which seem incorrect. I With some minor tweaks to their code, the result matches mine. I will do a bit more to verify my calculations and can hopefully add the measure to the package soon.

library(GLCMTextures)

n_levels<- 4
test_matrix<- matrix(data=c(2,0,1,3,0,0,0,3,2), nrow = 3, ncol=3)
test_matrix

P_ij<- make_glcm(test_matrix, n_levels = n_levels, shift = c(1,0)) # GLCM with horizontal neighbors
k_prob<- rep(0, 2*n_levels-1) #Initialize at zero
names(k_prob)<- 0:(2*(n_levels-1))
for (R in 1:nrow(P_ij)) {
  for (C in 1:ncol(P_ij)) {
    i<- R-1
    j<- C-1
    k<- i+j
    k_prob[k+1]<- k_prob[k+1] + P_ij[R,C] #add probability to corresponding spot in k_prob
  } # i and j start at 0, but R is 1 indexed so have +/- 1 in certain spots
}
k_vals<- 0:((n_levels-1)*2)
SumAverage<- sum(k_vals*k_prob)
SumAverage

#With radiomics
source("radiomics.R") #Could not install package so put functions in script
P_ij2<- glcm(test_matrix, angle = 0, n_grey = 4, normalize = TRUE)
P_ij==P_ij2
# 0    1    2    3
# 0 TRUE TRUE TRUE TRUE
# 1 TRUE TRUE TRUE TRUE
# 2 TRUE TRUE TRUE TRUE
# 3 TRUE TRUE TRUE TRUE

calc_features(object = P_ij2, features = "glcm_sumAverage")
# glcm_sumAverage
# 1        4.333333

k_prob2<- rep(NA_real_, times = 2*n_levels-1)
names(k_prob2)<- 0:(2*(n_levels-1))
for (k in 0:(length(k_prob2)-1)) {
  k_prob2[k+1]<- glcm_xplusy(P_ij2, k=k)
}

k_prob==k_prob2
# 0    1    2    3    4    5    6 
# TRUE TRUE TRUE TRUE TRUE TRUE TRUE 
# My code matches results of glcm_xplusy

#radiomics Sum Average Code Line by Line
sum <- 0
for(i in 1:(2*nrow(P_ij2))){
  #bit of a sticky situation with the extra 1s....
  sum <- sum + (i+1)*glcm_xplusy(P_ij2, i-1)
}
sum #Does not match my result
# 4.333333

#They iterate through 1:(2*nrow(P_ij2))
length(1:(2*nrow(P_ij2)))
# 8
#This should only be a vector of length 7 not 8 since the maximum sum of gray values 0-3 is 6 and therefore k can range from 0 to 6.
# It seems like it should be (i-1) in both spots sice the gray levels start at value 0.

# My edit of radiomics
sum <- 0
for(i in 1:(2*nrow(P_ij2)-1)){
  #bit of a sticky situation with the extra 1s....
  sum <- sum + (i-1)*glcm_xplusy(P_ij2, i-1)
}
sum # Matches my result
# 2.333333
filrat2 commented 9 months ago

I'm very grateful for your work.

It is quite interesting, because a few days ago I found another package called ForestTools, which uses algorithms from radiomics. So if the radiomics code has a bug, the ForestTools package in function glcm also has it.

When you add this measure to the package, will it be possible to compute this texture for each pixel in a spatial raster like a satellite image? I mean Calculate Texture Metrics Raster Surfaces for Sentinel-1 or Sentinel-2 data.

Nowosad commented 9 months ago

It is quite interesting, because a few days ago I found another package called ForestTools, which uses algorithms from radiomics. So if the radiomics code has a bug, the ForestTools package in function glcm also has it.

Maybe it would be good to inform that package authors (CC: @andrew-plowright)

ailich commented 9 months ago

@Nowosad, I'll get on that!

ailich commented 9 months ago

@filrat2, yup I plan to implement it to give a value for each raster pixel once we confirm the code is correct. Them I just need to translate that R code to C++ code to add it in with the rest of the measures.

andrew-plowright commented 9 months ago

Thanks @ailich and @filrat2 for looking into this.

Given that the radiomics package is no longer maintained, I had spoken with its author and asked if I could move the C++ code that he developed into ForestTools to ensure that it could live on even if radiomics was taken down from CRAN.

ForestTools itself is meant to implement a specific use of GLCM, which is the classification of trees. I think it makes more sense to have the foundational GLCM computations stored in a different package, which seems to be what you are doing.

One of ForestTools' functionalities is to compute GLCMs for each segment of a segmented image. Instead of discretizing each segment individually, the entire image is discretized beforehand.

Are you intending on having the discretization/quantization process separate from the GLCM computation?

ailich commented 9 months ago

@andrew-plowright, the quantize_raster can be used to discretize the entire image to a set number of gray levels based on bins of equal range based on the min and max values, or based on quantiles which will create the range for the breaks so that an approximately equal number of cells are assigned to each gray level. quantize_raster can be done separately or if you specify a quantization method in glcm_textures it will do that for you before calculating the textures.

ailich commented 9 months ago

@andrew-plowright, do you happen to know if that is indeed an error in the radiomics code for GLCM Sum Average?

ailich commented 9 months ago

@filrat2, I added GLCM Sum Average to the glcm_textures function so it can now be used across raster data. I believe the way I have it done is correct for data quantized starting at zero (which is slightly different than the original Haralick formulas that use gray levels starting at 1), and is different from the radiomics code which I believe has a mistake in it.

filrat2 commented 8 months ago

@ailich Thank you so much for the work you do. I've already tested this function, and everything looks to be in order. I also used a parallel function from another issue, and it works too.

ailich commented 8 months ago

@filrat2, awesome! I'm putting the finishing touches on the parallel stuff, but the code in that issue should work for most cases. Hopefully soon that'll be implemented directly into the package.