r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry application
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0
596 stars 131 forks source link

classify_noise() potential bug? #408

Closed adrienmichez closed 3 years ago

adrienmichez commented 3 years ago

Following your advise on gis.stackexchange.com, I created this issue.

I'm experiencing a strange behavior with classify_noise() with one of my files (work fine with most of them). The original file can be found here:

library(lidR)

setwd('D:/RENNES/DATA_GIS/TMP/')
las_1 <- readALSLAScatalog('./157668.5_243478.3.las')
opt_output_files(las_1) <- paste0("./", "{XCENTER}_{YCENTER}")
opt_chunk_size(las_1) <- 250
opt_chunk_buffer(las_1) <- 20

# launch denoising process
las2 <- classify_noise(las_1 , algorithm = ivf())

# las2 and las1 has different number of points!
las_1
# class       : LAScatalog (v1.4 format 6)
# extent      : 157007.9, 158329.1, 242883.8, 244072.9 (xmin, xmax, ymin, ymax)
# coord. ref. : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
# area        : 1.57 km²
# points      : 129.41million points
# density     : 82.4 points/m²
# num. files  : 1 

las2
# class       : LAScatalog (v1.4 format 6)
# extent      : 157007.9, 158000, 242883.8, 244072.9 (xmin, xmax, ymin, ymax)
# coord. ref. : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
# area        : 0.94 km²
# points      : 72.35million points
# density     : 77.4 points/m²
# num. files  : 22 

I found the that classify_noise() was weirdly removing part of catalog... A significant part of the original point cloud is missing. Did I miss something? For other lasfile, it' s working as expected. The original las file pass the las_check() without reported issues:

las_check(las_1)
# Checking headers consistency
# - Checking file version consistency... ✓
# - Checking scale consistency... ✓
# - Checking offset consistency... ✓
# - Checking point type consistency... ✓
# - Checking VLR consistency... ✓
# - Checking CRS consistency... ✓
# Checking the headers
# - Checking scale factor validity... ✓
# - Checking Point Data Format ID validity... ✓
# Checking preprocessing already done 
# - Checking negative outliers... ✓
# - Checking normalization... no
# Checking the geometry
# - Checking overlapping tiles... ✓
# - Checking point indexation... yes

My versions of lidR and rlas are up to date:

packageVersion("rlas")
#  [1] ‘1.3.9’
packageVersion("lidR")
# [1] ‘3.1.1’

The compressed file is here. After compression of the orginal las file and loading the compressed file, classify_noise() seems to work just as expected. I guess there is an explanation but I really don't see it!

las_1 <- readLAS('./157668.5_243478.3.las')
writeLAS(las_1, './157668.5_243478.3.laz')
laz_1 <- readALSLAScatalog('./157668.5_243478.3.laz')
opt_output_files(laz_1) <- paste0("./", "{XCENTER}_{YCENTER}_2")
opt_chunk_size(laz_1) <- 250
opt_chunk_buffer(laz_1) <- 20
laz2 <- classify_noise(laz_1 , algorithm = ivf())
laz2
# class       : LAScatalog (v1.4 format 6)
# extent      : 157007.9, 158329.1, 242883.8, 244072.9 (xmin, xmax, ymin, ymax)
# coord. ref. : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
# area        : 1.57 km²
# points      : 129.41million points
# density     : 82.4 points/m²
# num. files  : 36 
adrienmichez commented 3 years ago

The link to the files will expire on 23/02/2021

Jean-Romain commented 3 years ago

This is the fourth time I try to download your file in two days and it always fails before the end. Please try to reproduce yourself following what I would like to test

  1. Just retry your example in a clean session to ensure you did not make a mistake somewhere.
  2. If you can reproduce your issue, then remove the lax file and retry with a new lax file generated by lasindex from LAStools
  3. If you can still reproduce after (2), delete the lax file and retry.

This is what I would try on my side if I could.

adrienmichez commented 3 years ago
  1. still not working
  2. same behavior with lax file from lasindex or lidR:::catalog_laxindex()
  3. working fine without lax.

The original las file is the result of clip_rectangle() made over a catalog of 4 laz files from the US LiDAR DB:

The coordinates of the rectangle are (CRS from the laz files):

st_bbox(selected_reach)
#    xmin     ymin     xmax     ymax 
# 157007.9 242883.8 158329.1 244072.9
Jean-Romain commented 3 years ago

So my guess was correct and we have a guilty. It means that something went wrong with the spatial index and the queries. I should be able to reproduce with your laz file. My internet connection at home is not stable enough and I'm not able to download the las file that is too big. I hope I will have more chance with the laz.

Jean-Romain commented 3 years ago

So it failed again and again and again even with a wire connection. I can't download the file and I can't do anything

adrienmichez commented 3 years ago

I cannot reproduce the bug with the laz file neither with thinned las. But I did reproduce the bug while starting from the four original las files and creating a new one using clip_rectangle.

Have you tried to download the original 4 laz files from US ftp? It will be probably much more stable using dedicated software like filezila ou uget?

Jean-Romain commented 3 years ago

No, I'm working at home. I don't have an illimited internet connection. I already spent GB of data trying to download your files. This is why I wanted you to send compressed file at the beginning. I'll retry later when I'll be sure to do not exceed my monthly internet plan.

adrienmichez commented 3 years ago

Ok... Sorry for you internet plan.

I really don't understand why when I compressed or thin the file the problem is no longer occuring

Jean-Romain commented 3 years ago

Because it is no longer associated to the lax file I guess.

adrienmichez commented 3 years ago

No... Because even if I add a lax file to the thinned las, classify_noise is still working as expected.

I know it looks like the error comes from me but I can repeat it again and again...

Jean-Romain commented 3 years ago

Please tell me everything you tried.

Jean-Romain commented 3 years ago

So, I went to my university to download your las + lax files and try myself. I'm sorry to tell you that it worked well. In your example the problem seems that it stops before the end after 22/36 chunks. Sadly I cannot reproduce. It works perfectly as expected on my side.

library(lidR)

setwd('~/issue 408/')
las_1 <- readALSLAScatalog('./157668.5_243478.3.las')
opt_output_files(las_1) <- paste0("./", "{XCENTER}_{YCENTER}")
opt_chunk_size(las_1) <- 250
opt_chunk_buffer(las_1) <- 20

# launch denoising process
las_2 <- classify_noise(las_1 , algorithm = ivf())

las_1
#> class       : LAScatalog (v1.4 format 6)
#> extent      : 157007.9, 158329.1, 242883.8, 244072.9 (xmin, xmax, ymin, ymax)
#> coord. ref. : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
#> area        : 1.57 km²
#> points      : 129.41million points
#> density     : 82.4 points/m²
#> num. files  : 1 

las_2
#> class       : LAScatalog (v1.4 format 6)
#> extent      : 157007.9, 158329.1, 242883.8, 244072.9 (xmin, xmax, ymin, ymax)
#> coord. ref. : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
#> area        : 1.57 km²
#> points      : 129.41million points
#> density     : 82.4 points/m²
#> num. files  : 36
adrienmichez commented 3 years ago

Thanks for your efforts...

I re-downloaded the same las you just used, in a brand clean new folder, new R session... And get still the very same behavior!!

Looking at the plot, some tiles are just empty:

image

So frustrating...

Jean-Romain commented 3 years ago

Ok we will try to investigate together. I cannot reproduce so I'll give you some code to run and you will show me the output. This way I will try to spot more accurately where the error could be. But first which version of lidR and rlas are you using.

adrienmichez commented 3 years ago

Thanks!

> packageVersion("rlas")
[1] ‘1.3.9’
> 
> packageVersion("lidR")
[1] ‘3.1.1’
Jean-Romain commented 3 years ago

Ok try that. Please report your results exactly as I did. I'm expecting you to get 0 points in both cases. Maybe you will have something with clip_roi() but it would be very surprising. No matter what your results are, I won't be able to understand the problem but at least we will be able to reproduce without running any long and complex tasks

library(lidR)

setwd('~/issue 408/')
las_1 <- readALSLAScatalog('./157668.5_243478.3.las')
opt_output_files(las_1) <- paste0("./", "{XCENTER}_{YCENTER}")
opt_chunk_size(las_1) <- 250
opt_chunk_buffer(las_1) <- 20

cls = lidR:::catalog_makechunks(las_1)
cl = cls[[23]]
cl
#> class   : LAScluster
#> name    : noname 
#> center  : 158125 , 243625 
#> extent  : 158000 , 158250 , 243500 , 243750 (xmin, xmax, ymin, ymax)
#> extent+ : 157980 , 158270 , 243480 , 243770 (xmin, xmax, ymin, ymax)
#> size    : 290 x 290 
#> files   : 157668.5_243478.3.las 
#> filter  : -inside 157980 243480 158270 243770

las = readLAS(cl)
las
#> class        : LAS (v1.4 format 6)
#> memory       : 745.6 Mb 
#> extent       : 157980, 158270, 243480, 243770 (xmin, xmax, ymin, ymax)
#> coord. ref.  : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
#> area         : 84094.06 m²
#> points       : 8.14 million points
#> density      : 96.84 points/m²

plot(extent(las), add = T)

las = clip_rectangle(las_1, 157980 , 243480, 158270, 243770)
las
#> class       : LAScatalog (v1.4 format 6)
#> extent      : 157980, 158270, 243480, 243770 (xmin, xmax, ymin, ymax)
#> coord. ref. : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
#> area        : 84094.2 m²
#> points      : 8.14 million points
#> density     : 96.8 points/m²
#> num. files  : 1

adrienmichez commented 3 years ago

The tile cls[[23]] is indeed empty...

I also copy-pasted the plot (not sure it could help you)

image

setwd('C:/TMP/lidR_bug/transfer_85074_files_52cf1058/')
las_1 <- readALSLAScatalog('./157668.5_243478.3.las')
opt_output_files(las_1) <- paste0("./", "{XCENTER}_{YCENTER}")
opt_chunk_size(las_1) <- 250
opt_chunk_buffer(las_1) <- 20

cls = lidR:::catalog_makechunks(las_1)
cl = cls[[23]]
cl
# class   : LAScluster
# name    : noname 
# center  : 158125 , 243625 
# extent  : 158000 , 158250 , 243500 , 243750 (xmin, xmax, ymin, ymax)
# extent+ : 157980 , 158270 , 243480 , 243770 (xmin, xmax, ymin, ymax)
# size    : 290 x 290 
# files   : 157668.5_243478.3.las 
# filter  : -inside 157980 243480 158270 243770  

las = readLAS(cl)
las
# class        : LAS (v1.4 format 6)
# memory       : 18.8 Kb 
# extent       : 0, 0, 0, 0 (xmin, xmax, ymin, ymax)
# coord. ref.  : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
# area         : 0 m²
# points       : 0  points
# density      : 0 points/m²

plot(extent(las), add = T)
# Error in h(simpleError(msg, call)) : 
#   erreur d'évaluation de l'argument 'x' lors de la sélection d'une méthode pour la fonction 'plot' : invalid extent: xmin >= xmax
# De plus : Warning messages:
# 1: In min(x@data$X) : aucun argument trouvé pour min ; Inf est renvoyé
# 2: In max(x@data$X) : aucun argument pour max ; -Inf est renvoyé
# 3: In min(x@data$Y) : aucun argument trouvé pour min ; Inf est renvoyé
# 4: In max(x@data$Y) : aucun argument pour max ; -Inf est renvoyé

las = clip_rectangle(las_1, 157980 , 243480, 158270, 243770)
las
# class       : LAScatalog (v1.4 format 6)
# extent      : 157980, 158000, 243480, 243500 (xmin, xmax, ymin, ymax)
# coord. ref. : NAD83(2011) / Puerto Rico and Virgin Is. + PRVD02 height - Geoid12B (metre) (with axis order normalized for visualization) (with axis order normalized for visualization) 
# area        : 399.6001 m²
# points      : 30.6thousand points
# density     : 76.6 points/m²
# num. files  : 1 
Jean-Romain commented 3 years ago

That's really bad. readLAS(cl) returns 0 points which was expected but clip_rectangle() returns 30000 points in an area of 400 m² instead of 80 millions in an area of 84000 m² while I was expecting 0 because both are actually querying the same region the same way...

Please tell me your operating system and your version of R.

Jean-Romain commented 3 years ago

Also show me

plot(las_1)
points(lidR:::coordinates(las), cex = 0.2)

where las is the output of clip_rectangle() to see where those points actually are.

adrienmichez commented 3 years ago

I plotted (failed) the las object (i.e. the one which results from clip_rectangle() in your script).

I am on win10 with R 4.02

plot(las)
points(lidR:::coordinates(las), cex = 0.2)
# Error in x[, 2] : indice hors limites

R.version
# platform       x86_64-w64-mingw32          
# arch           x86_64                      
# os             mingw32                     
# system         x86_64, mingw32             
# status                                     
# major          4                           
# minor          0.2                         
# year           2020                        
# month          06                          
# day            22                          
# svn rev        78730                       
# language       R                           
# version.string R version 4.0.2 (2020-06-22)
# nickname       Taking Off Again   
Jean-Romain commented 3 years ago

not plot(las) but plot(las_1) (the catalog). Then points(lidR:::coordinates(las)) (the point cloud)

adrienmichez commented 3 years ago
library(lidR)
setwd('C:/TMP/lidR_bug/transfer_85074_files_52cf1058/')
las_1 <- readALSLAScatalog('./157668.5_243478.3.las')
las = clip_rectangle(las_1, 157980 , 243480, 158270, 243770)
plot(las_1)
points(lidR:::coordinates(las), cex = 0.2)

image

Jean-Romain commented 3 years ago

So I finally reproduced in a virtual box under windows. I closing this because it is not related to lidR. Moving to https://github.com/Jean-Romain/rlas/issues/50 . Please perform the test mentioned with LAStools

adrienmichez commented 3 years ago

I'm relief that you found a guilty!