r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry application
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0
582 stars 130 forks source link

Vertical stratification of the point cloud #768

Open spono opened 1 month ago

spono commented 1 month ago

Ciao JR,

I was thinking that a possible small (?) enhancement could be to add an option to segment_trees for a further “refinement” of the segmentation (e.g. opt_refine).

The idea could be to apply a height threshold to the single tree-cluster, splitting it into over- and understory according to the binned density of points along the tree height. In literature, such an approach is reported in “different flavours” from Barilotti et al., 2007 (minimum peak on a polynomial of 6th order) , Yun et al., 2022 (minimum peak on a polynomial of 9th order) and Hamraz et al.,2016 (minimum peaks on the second derivative of the smoothed histogram).

I guess the simplest solution could be drafted like this:

if (grepl("barilotti", tolower(opt_refine)))
{
    dh <- hist( z, breaks=seq(0,ceiling(max(z)),1), plot=FALSE)
    md = data.frame( x = seq(1,ceiling(max(z)),1), y = dh$counts )
    limo = lm(y ~ poly(x, 6), data=md)
    h.thr = as.numeric(which.min(limo$fitted.values))

    # if the threshold is lower than the apex and bigger than the minimum height for 
    if (h.thr < max(z) & max(z[z<h.thr]) > hmin)
    { 
      las@data[Classification != 2L & Z < h.thr, Classification := 4L ] # assign to medium vegetation
      las@data[Classification == 4, treeID := NA_integer_ ]
    }
}

I thought its place could be in the segment_trees algorithm (line 39 ?) but, looking a bit better at the code, it would need a looping through treeIDs and probably this might happen within metrics_template...I don't know. Actually the process can be achieved through a specific function to be run on a catalog but I think an integration might favour a reduction of the processing steps and data copies.

Another thing to decide, could be if assigning a treeID to the newborn understorey cluster:

  1. assignment by default (all clusters are considered new trees...but it would surely lead to over segmentation and requires a second step of ID assignment for the understory);
  2. leaving to the user the freedom to repeat the segmentation process over a CHM created from the new medium vegetation points. Even tough it seems to me a bit more complicated to deal with, with the available tools.

Maybe, a conditional assignment could be the best compromise (i.e. if the zmax of the understorey cluster is above hmin=2, then can be considered another tree, otherwise is just not defined understorey). As a side-outcome, the pointcloud gets also classified for further possible processing.

I know you want only published methods in lidR and the ones above probably look as half-methods. Nevertheless, I was willing to open the discussion on it before even trying to start coding something that can go beyod my skills.

Jean-Romain commented 1 month ago

Why not. Would you be able to write a minimal reproducible example so I can understand better what we are talking about.

According to your source code I understand that for each tree we are looking at the vertical point density. At the lowest density point (somewhere between the ground and the crown I guess) we have a cutoff. Everything below the cutoff is assigned the class medium classification and the tree ID is removed.

I understand that this idea has been applied by multiple authors and I guess the way to find the height cutoff differs slightly between authors.

spono commented 1 month ago

exactly: a mature stand with understory is often connected to a bi-/multi-modal distribution of points, across the height profile. All of the methods above pass through this step of detecting the local minimum to help separate the lowest points from the single tree cluster in order to have "cleaner" statistics and, in case, improve the segmentation of dominated trees.

The most basic method is the one from Barilotti, drafted below...but using it within a sliding window like the one in your multi-scale prediction (I read about it but now I don't find anymore the reference page...) it should not be too difficult to get to the one proposed by Yun (I guess).

library(lidR)

LASfile <- system.file("extdata", "MixedConifer.laz", package="lidR")
las <- readLAS(LASfile, select = "xyzc", filter = "-inside 481250 3812980 481300 3813030")
chm = rasterize_canopy( las, res = 0.5, p2r(subcircle=0.2))

ttops = structure(list(treeID = 1:29, Z = c(16, 26.95, 23.58, 19.75, 
18.89, 16.25, 19.96, 23.7, 24.61, 23.28, 24.23, 10.23, 28.09, 
22.46, 26.59, 23.71, 23.64, 25.95, 16.63, 24.14, 4.24, 25.17, 
23.91, 24.64, 23.43, 27.57, 24.08, 24.37, 25.24), geometry = structure(list(
    structure(c(481294.68, 3813010.76, 16), class = c("XYZ", 
    "POINT", "sfg")), structure(c(481281.89, 3813003.24, 26.95
    ), class = c("XYZ", "POINT", "sfg")), structure(c(481278.38, 
    3813002.49, 23.58), class = c("XYZ", "POINT", "sfg")), structure(c(481265.29, 
    3812996.07, 19.75), class = c("XYZ", "POINT", "sfg")), structure(c(481261.81, 
    3812999.35, 18.89), class = c("XYZ", "POINT", "sfg")), structure(c(481266.91, 
    3813003.83, 16.25), class = c("XYZ", "POINT", "sfg")), structure(c(481273.54, 
    3813000.5, 19.96), class = c("XYZ", "POINT", "sfg")), structure(c(481275.78, 
    3813008.99, 23.7), class = c("XYZ", "POINT", "sfg")), structure(c(481278.25, 
    3813010.64, 24.61), class = c("XYZ", "POINT", "sfg")), structure(c(481284.93, 
    3812997.9, 23.28), class = c("XYZ", "POINT", "sfg")), structure(c(481289.43, 
    3813008.25, 24.23), class = c("XYZ", "POINT", "sfg")), structure(c(481281.2, 
    3812980.06, 10.23), class = c("XYZ", "POINT", "sfg")), structure(c(481281.5, 
    3812988.74, 28.09), class = c("XYZ", "POINT", "sfg")), structure(c(481281.67, 
    3813010.64, 22.46), class = c("XYZ", "POINT", "sfg")), structure(c(481274.48, 
    3812981.78, 26.59), class = c("XYZ", "POINT", "sfg")), structure(c(481270.48, 
    3812994.69, 23.71), class = c("XYZ", "POINT", "sfg")), structure(c(481269.29, 
    3812980.39, 23.64), class = c("XYZ", "POINT", "sfg")), structure(c(481267.52, 
    3812988.86, 25.95), class = c("XYZ", "POINT", "sfg")), structure(c(481266.56, 
    3813000.92, 16.63), class = c("XYZ", "POINT", "sfg")), structure(c(481264.94, 
    3813008.66, 24.14), class = c("XYZ", "POINT", "sfg")), structure(c(481260.81, 
    3812983.56, 4.24), class = c("XYZ", "POINT", "sfg")), structure(c(481287.67, 
    3812986.83, 25.17), class = c("XYZ", "POINT", "sfg")), structure(c(481288.38, 
    3812990.66, 23.91), class = c("XYZ", "POINT", "sfg")), structure(c(481288.64, 
    3812996.93, 24.64), class = c("XYZ", "POINT", "sfg")), structure(c(481291.51, 
    3812990.03, 23.43), class = c("XYZ", "POINT", "sfg")), structure(c(481296.37, 
    3812983.31, 27.57), class = c("XYZ", "POINT", "sfg")), structure(c(481293.65, 
    3812998.81, 24.08), class = c("XYZ", "POINT", "sfg")), structure(c(481297.78, 
    3812992.22, 24.37), class = c("XYZ", "POINT", "sfg")), structure(c(481299.65, 
    3812996.57, 25.24), class = c("XYZ", "POINT", "sfg"))), n_empty = 0L, precision = 0, crs = structure(list(
    input = "EPSG:26912", wkt = "PROJCRS[\"NAD83 / UTM zone 12N\",\n    BASEGEOGCRS[\"NAD83\",\n        DATUM[\"North American Datum 1983\",\n            ELLIPSOID[\"GRS 1980\",6378137,298.257222101,\n                LENGTHUNIT[\"metre\",1]]],\n        PRIMEM[\"Greenwich\",0,\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        ID[\"EPSG\",4269]],\n    CONVERSION[\"UTM zone 12N\",\n        METHOD[\"Transverse Mercator\",\n            ID[\"EPSG\",9807]],\n        PARAMETER[\"Latitude of natural origin\",0,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8801]],\n        PARAMETER[\"Longitude of natural origin\",-111,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8802]],\n        PARAMETER[\"Scale factor at natural origin\",0.9996,\n            SCALEUNIT[\"unity\",1],\n            ID[\"EPSG\",8805]],\n        PARAMETER[\"False easting\",500000,\n            LENGTHUNIT[\"metre\",1],\n            ID[\"EPSG\",8806]],\n        PARAMETER[\"False northing\",0,\n            LENGTHUNIT[\"metre\",1],\n            ID[\"EPSG\",8807]]],\n    CS[Cartesian,2],\n        AXIS[\"(E)\",east,\n            ORDER[1],\n            LENGTHUNIT[\"metre\",1]],\n        AXIS[\"(N)\",north,\n            ORDER[2],\n            LENGTHUNIT[\"metre\",1]],\n    USAGE[\n        SCOPE[\"Engineering survey, topographic mapping.\"],\n        AREA[\"North America - between 114°W and 108°W - onshore and offshore. Canada - Alberta; Northwest Territories; Nunavut; Saskatchewan. United States (USA) - Arizona; Colorado; Idaho; Montana; New Mexico; Utah; Wyoming.\"],\n        BBOX[31.33,-114,84,-108]],\n    ID[\"EPSG\",26912]]"), class = "crs"), bbox = structure(c(xmin = 481260.81, 
ymin = 3812980.06, xmax = 481299.65, ymax = 3813010.76), class = "bbox"), class = c("sfc_POINT", 
"sfc"))), row.names = c(NA, 29L), sf_column = "geometry", agr = structure(c(treeID = 1L, 
Z = 1L), class = "factor", levels = c("constant", "aggregate", 
"identity")), class = c("sf", "data.frame"))

las = segment_trees( las, dalponte2016(chm, ttops))

ss = filter_poi(las, treeID == 28)
plot(ss, color="Classification")

hmin = 2
z = ss@data$Z

dh <- hist( z, breaks=seq(0,ceiling(max(z)),1), plot=FALSE)
md = data.frame( x = seq(1,ceiling(max(z)),1), y = dh$counts )
limo = lm(y ~ poly(x, 6), data=md)
h.thr = as.numeric(which.min(limo$fitted.values))

# if the threshold is lower than the apex and bigger than the minimum height for tree detection
if( h.thr < max(z) & max(z[z<h.thr]) > hmin){ 
  ss@data[Classification != 2L & Z < h.thr, treeID := NA_integer_] # set understorie's treeID to NA
}

The last step could be improved assigning a classification code, as follows:

if( h.thr < max(z) & max(z[z<h.thr]) > hmin){ 
  ss@data[Classification != 2L & Z > h.thr, Classification := 5L ] # assign to high vegetation
  ss@data[Classification != 2L & Z <= h.thr, Classification := 4L ] # assign to medium vegetation
  ss@data[Classification == 4, treeID := NA_integer_ ]
}
plot(ss, color="Classification")

let's say that classifying the points helps also in the case:

  1. the cloud is already classified (not the case in the example above)
  2. the user wants to create an OHM of the understory.
Jean-Romain commented 1 month ago

What is an OHM?

What would be the API for such feature?

segment_tree(las, algo1, refined = algo2)

or

refine_tree_segmentation(las, algo2)
spono commented 1 month ago

What is an OHM?

sorry, just a typo

What would be the API for such feature?

I guess the first one would be better in order to reduce the processing steps of a basic analysis and related data copies.