r-lidar / lasR

Fast and Pipable Airborne LiDAR Data Tools
https://r-lidar.github.io/lasR/
GNU General Public License v3.0
7 stars 1 forks source link

geometry_features() with filter keep_ground() does not work #45

Closed wiesehahn closed 7 months ago

wiesehahn commented 7 months ago

I tried to calc some geometric features and experienced that this works without prior filtering and also when dropping ground points, but keeping just ground points it does not work. R simply processes forever.

library(lasR)

f <- system.file("extdata", "MixedConifer.las", package="lasR")

geometrics <- function(pipeline, data){  
  out <- exec(pipeline, on = data)  
  return(out)
}

# works
pipeline = geometry_features(k = 15 , features = "lps") + write_las()
ans <- geometrics(pipeline, f)

# does not work
pipeline = delete_points(filter = keep_ground()) + geometry_features(k = 15 , features = "lps") + write_las()
ans <- geometrics(pipeline, f)

# works
pipeline = delete_points(filter = drop_ground()) + geometry_features(k = 15 , features = "lps") + write_las()
ans <- geometrics(pipeline, f)
wiesehahn commented 7 months ago

while it works when dropping ground for the testfile above, I just experienced that this also does not work for one of my files...

Jean-Romain commented 7 months ago

It would be much more efficient to do (and it works)

pipeline = reader_las(filter = keep_ground()) + geometry_features(k = 15 , features = "lps") + write_las()
ans <- geometrics(pipeline, f)

Yet, your use case is valid. I confirm the problem.

wiesehahn commented 7 months ago

Ah yes, because the reader is able to read the header and then only these parts of the data?

Would it, just in theory, be possible to evaluate the pipeline on execution and then automatically apply an appropriate filter to read only necessary data?

Jean-Romain commented 7 months ago

The reader_las stage reads what needs to be read. If you want your entire pipeline to apply exclusively on ground points, then you can only read the ground points. delete_points can be used to remove points at some point of the pipeline but I'm expecting it to have little use actually because most stage have already a filter integrated to process some points of interest (which is not the case of geometry_features).

Btw with this bug you found a big issue with geometry_feature. It processes all the points (not only the non-deleted ground points, but also all the deleted ones). In addition it takes a lot of time at finding the neighbors because it is not aware of all the points that were removed so it constantly finds deleted points in the local neighborhood and cannot easily find k non-deleted neighbors. Big issue. Need some redesign.

Jean-Romain commented 7 months ago

I made some modifications

  1. I fixed the bug. geometry_features was looping on all the points (including those removed in the previous stage) to compute the SVD decomposition. In addition it was looking for the knn in a local neighborhood as if the points were not removed. Consequently it was finding a lot of points flagged as 'deleted' (in lasR the removed point are flagged but stay in memory for performance reasons) and it was not able to find the legit ground points fast. This is fixed. It now loops and compute the SVD only on the retained points and the knn search is optimized to avoid this pitfall.

  2. In addition I modified the delete_points stage such as it can also reallocate the memory layout. For performance reasons the points are flagged as deleted but not removed. Flagged points are simply skipped by all stages. If a lot of points are removed (like in your case where 85% of the points are removed) it worth it to remap the memory layout and actually remove the points. delete_point can now do that and release memory if the amount of memory to release is significant.

  3. Fixed a similar issue of processing deleted points that affected at least local_maximum and probably rasterize and transform_with

PS: please open an issue for feature request about what we talked about few days ago. Something about local maximum and local metrics. Your help is invaluable for the development of lasR. I own you that if it can help you. Please describe accurately what you are expecting and how you think it should work.

wiesehahn commented 6 months ago

delete_points can be used to remove points at some point of the pipeline but I'm expecting it to have little use actually because most stage have already a filter integrated to process some points of interest (which is not the case of geometry_features).

Is there a reason the filter is not implemented for geometry_features?

Lets say I want to iteratively classify points on linearity, remove points with low likelyhood and classify again based on remaining points (something like pipeline = geometry_features(k = 15, r = 3, features = "l") + geometry_features(k = 15, r = 3, features = "l", filter = "-keep_linearity_above 0.5")) I would do this with an intermediate + delete_points(filter = "-keep_linearity_above 0.5") +, correct? What is the correct way here to filter on this added attribute? -keep_attribute_above 0 0.5 to access the first [0] added attibute?

Jean-Romain commented 6 months ago

Is there a reason the filter is not implemented for geometry_features?

Excellent question. I forgot to talk about that in my previous message. Yes there is a reason. Imagine you do

reader_las() + geometry_features(k = 15 , features = "lps", filter = keep_ground()) + write_las()

You want to read all the points but apply the geometry computation only on ground points. Thus, ground points will gain valid geometric values and other points will gain NAs. So far so good. Now, for somebody else reading the file, how can this person understand the meaning of these values. Are the geometry features computed for ground points only with all points? Are the geometry features computed for ground points only with ground points only? I think it is confusing. That why this stage does not have a filter and thus no NAs, no problem.

Now, your use case is interesting and reaches the limits of the possibilities. delete_points() with -keep_attribute_above 0 0.5 should work if you are sure that linearity is the extrabyte 0. You can't be sure because lps is not ordered in the order of the flags. The order is hard coded, the flag only enable or disable the feature. Also the second call to geometry_feature() will create a second attribute, not override the original one.

I'm opened to suggestions.

wiesehahn commented 6 months ago

Yeah you are right, this might be confusing

Now, your use case is interesting and reaches the limits of the possibilities. delete_points() with -keep_attribute_above 0 0.5 should work if you are sure that linearity is the extrabyte 0. You can't be sure because lps is not ordered in the order of the flags. The order is hard coded, the flag only enable or disable the feature. Also the second call to geometry_feature() will create a second attribute, not override the original one.

In my use case above the only feature is l so for this case -keep_attribute_above 0 0.5 should work, correct? Otherwise, where can I see the order attributes are written?

The names of the extrabytes attributes (if recorded) are coeff00, coeff01, coeff02 and so on, lambda1, lambda2, lambda3, anisotropy, planarity, sphericity, linearity, omnivariance, curvature, eigensum, angle, normalX, normalY, normalZ.

Is it this?

Jean-Romain commented 6 months ago

In my use case above the only feature is l so for this case -keep_attribute_above 0 0.5 should work, correct?

Absolutely

Otherwise, where can I see the order attributes are written?

Excellent question. I'm checking in the source code... and yes this is correct. I added some notes in the documentation to make it clear.