r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry application
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0
582 stars 130 forks source link

Improve eigen_metrics: Return coeff as well as eigenvalues. #720

Closed TedwardErker closed 10 months ago

TedwardErker commented 10 months ago

The eigen_metrics function is a true wonder for its speed and usefulness. One of its limitations is that it does not return the coefficient matrix which would give information about orientation/direction. The various shape metrics (LAS::filter_shape) use the coefficient matrix to output whether a point matches the shape or not, this does allow for finding vertical and horizontal orientations. One downside of the shape_metrics approach is that the user must pass parameters to the function defining the shape ("th"). It is not always clear what value(s) of "th" is(are) optimal. Users might also be interested in other orientations (e.g. branches that have 45 degree angles, hedgerows that run north-south, calculating what pitch a roof has for estimating its ability to shed snow etc).

An alternative approach to the shape metrics is to output the coefficient matrix and then use a supervised classification approach to learn the relationships between the eigenvalues and coefficient matrix and the shapes. Rather than a user passing the threshold value, "th", the user could train an algorithm to identify that threshold, or simply use the coefficient matrix as features in a classification problem.

What I'm envisioning is something like this:

eigen_w_coeff <- function(x, y, z) 
{
   xyz <- cbind(x, y, z)
   o <- lidR:::fast_eigen_values(xyz)
   as.list(unlist(o))
}

pm <- point_metrics(las, ~eigen_w_coeff(X, Y, Z), k = 15, r = 8, xyz = T)
lpm <- LAS(pm)

The coefficients output contain valuable information about direction. This uses the C++ decomposition, but it doesn't run on multiple CPUs, so it is considerably slower than the eigen_metrics approach.

Suggested enhancement: modify the eigen_metrics function to output coefficients as well as eigenvalues.

I don't know C++ and so don't feel that a pull request from me would be very high quality. I think the primary modification would need to be in the function on line 1501 of LAS.cpp, LAS::eigen_decomposition. Rather than just returning the eigenvalues, this should also return the 9 coefficients. I don't have great suggestions for the names.

I see the primary application of this in separating out built features from the point cloud. For example, vertical poles, roofs, power lines etc. Here are a couple examples of the coefficient's being plotted:

roof_lines

wall_roof

There is a lot of information here.

The absolute value of coeff7 give the verticality of the points

lpm@data$vert <- abs(lpm$coeff7)

verticality

Find points in a plane

lpm@data$planar <- (lpm$eigen2 - lpm$eigen3) / lpm$eigen1

planar

Find points in a vertical plane:

lpm@data$vertplane <- lpm$vert * lpm$planar

vertplane

The exact value of "vertplane" that corresponds to a wall can then be tested or estimated.

In my opinion, this can give a bit more flexibility to users. And when fed into a supervised classification algorithm, I think the coefficients can be very useful. It's already possible to get these with the lidR::fast_eigen_values() function, but it would be great to get the full speed up by modifying eigen_metrics to return the coefficients.

Thanks for developing this excellent package.

Jean-Romain commented 10 months ago

I did it quickly in 3b7d705. It is available in the devel branch.

LASfile <- system.file("extdata", "Topography.laz", package="lidR")
las = readLAS(LASfile, filter = "-thin_with_grid 2")
M <- point_eigenvalues(las, k = 25, coeffs = T)

Please check that the values have meaningful names and are in a meaningful order. Compare to your function. I did it quickly lying on the couch, so a second review might worth it.

TedwardErker commented 10 months ago

Thanks for the fast response and change. I had trouble testing right away this because of issues installing the package from github on my mac. Once I got it running appears to work!

The columns get output in a different order that they do in the function I have above, just something to note. The values are identical though.

I think the column names as you have them to indicate row and column is good. Maybe eventually they could be changed to indicate the dimension (X, Y, Z) and the eigenvector (1, 2, 3) so they would be coeff_X1, coeff_X2, .... coeff_Z3. I'm not knowledgable enough to recommend this with confidence or know which is which for sure.

On the test las file I had, the time with the point_metrics and "eigen_w_coeff" function from above was 209 seconds (running on 1 thread). point_eigenvalues with coeff = T took 8.3 seconds (running on 8 threads). damn. That is an insane speed up.

Jean-Romain commented 10 months ago

Ok you used as.list but I ordered them manually so order might be different indeed. It could be rather coeffXX, coeffXY, coeffYZand so on in this case. Well I'll investigate to find a convention for names.