r-lidar / rlas

R package to read and write las and laz files used to store LiDAR data
GNU General Public License v3.0

Partially read corrupted file without fail #66

Closed jfbourdon closed 7 months ago

jfbourdon commented 7 months ago

Reading a corrupted file with rlas::read.las() does not fail: a variable is created with partial data even though an error message is displayed (not raised as with stop(), but printed, more like message("ERROR: ...")). I'm under the impression that the message is simply passed through from LASlib, since I see the exact same message when I use lasinfo from LAStools to extract information from the same file. Running lidR::las_check() on the loaded las object doesn't raise any flag even though the file was not fully read.

Message displayed in R:

ERROR: 'end-of-file during chunk with index 21' after 1071430 of 7113804 points for 'C:\Temp\20_3625411f06_dc.laz'

Message displayed with lasinfo from LAStools:

LASzip compression (version 3.4r1 c2 50000): POINT10 2 GPSTIME11 2
reporting minimum and maximum for all LAS point record entries ...
ERROR: 'end-of-file during chunk with index 21' after 1071430 of 7113804 points for 'C:\Temp\20_3625411f06_dc.laz'

Unfortunately, I can't provide my corrupted file because my OS refuses to read it completely from the disk, so I can't upload or copy it... I might have a bad sector on the disk where the file is located. However, if the message I see in R really comes from LASlib, triggering a stop() whenever the string ERROR: is found would do the trick. If necessary, adding a parameter like "force=FALSE/TRUE" to rlas::read.las() would give the option to force a partial read without failure, as happens now.
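The proposal above (fail when a forwarded message starts with ERROR:, unless the caller asks for a forced read) could look roughly like the following. This is only a sketch under assumptions: `is_error_message` and `handle_reader_message` are hypothetical names, not part of rlas or LASlib, and a C++ exception stands in for R's stop() so that the example compiles without R headers.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>

// Hypothetical helper (not part of rlas or LASlib): true when a message
// forwarded from the reader starts with the literal prefix "ERROR:".
bool is_error_message(const char* msg)
{
    assert(msg != nullptr);  // callers must pass a valid C string
    return std::strncmp(msg, "ERROR:", 6) == 0;
}

// Hypothetical policy function. With force == false, an "ERROR:" message
// aborts the read by throwing (standing in for stop()). With force == true,
// the error is tolerated so the caller can keep the partially read points.
// Returns true only when the message was an error that was tolerated.
bool handle_reader_message(const char* msg, bool force)
{
    if (!is_error_message(msg)) return false;   // informational or warning text
    if (!force) throw std::runtime_error(msg);  // refuse the partial read
    return true;                                // partial read accepted
}
```

With force set to false, the message from the report above would abort the read; with force set to true, the function returns true and the caller keeps the points read so far, which is the current behaviour.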

Final side note on the impact of this in lidR. If I create a LAScatalog with the corrupted file and try to clip a region with lidR::clip_roi(), for example, I get the same error message multiple times and then R crashes completely.

Jean-Romain commented 7 months ago

You are completely right.

LASlib contains hundreds of lines of code like fprintf(stderr, "....") in which warnings and errors are undifferentiated. In R, using fprintf with stdout/stderr is not accepted; we must use Rprintf and REprintf to print messages. To actually throw errors and warnings, however, we must use Rf_error and Rf_warning. To make LASlib R compliant, I automatically replaced everything with REprintf, and it is impractical to scan every occurrence to see whether it is a real warning or a real error. And even where I can spot errors, I cannot simply replace them with Rf_error() automatically, because Rf_error exits the code immediately, while the code in LASlib usually returns false so that the caller can catch the error and release memory.

Long story short, if I raise errors with a stop() equivalent, I will create a memory leak. To handle them properly, I must treat every single case manually, one by one, and change the code in depth.
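To illustrate the constraint described above, here is a minimal sketch of one way to keep the "return false and release memory" flow while still ending in a real error: the failure is recorded where the library would otherwise print, and the error is raised only once, at the top level, after cleanup has run. Everything here is an assumption for illustration (`ErrorState`, `read_chunk`, `read_file` and the `live_buffers` counter are hypothetical and are not LASlib or rlas code), and a C++ exception stands in for Rf_error() so that the example runs without R headers.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical error slot, filled at the point where the library would
// otherwise only print a message.
struct ErrorState {
    bool failed = false;
    std::string message;
};

// Stand-in for a LASlib-style routine: on failure it records the message,
// still releases what it allocated, and returns false instead of exiting.
bool read_chunk(bool corrupt, ErrorState& err, int& live_buffers)
{
    int* buffer = new int[1024];
    ++live_buffers;              // track allocations to make leaks visible
    bool ok = true;
    if (corrupt) {
        err.failed = true;
        err.message = "ERROR: end-of-file during chunk";
        ok = false;
    }
    delete[] buffer;             // cleanup runs on both paths
    --live_buffers;
    return ok;
}

// Top-level wrapper: the error is raised only after the callee has returned
// and cleaned up. In an R package this is where Rf_error() could be called.
void read_file(bool corrupt, int& live_buffers)
{
    ErrorState err;
    if (!read_chunk(corrupt, err, live_buffers)) {
        assert(err.failed);      // a false return must come with a recorded error
        throw std::runtime_error(err.message);
    }
}
```

Raising the error directly inside read_chunk, before the delete[], would skip the cleanup and leave live_buffers at 1, which is the leak described above. The cost of this approach is the one already mentioned: every failing call site has to be visited by hand.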

> Final side note on the impact of this in lidR. If I create a LAScatalog with the corrupted file and try to clip a region with lidR::clip_roi(), for example, I get the same error message multiple times and then R crashes completely.

That's a serious problem. Generating a corrupted file should not be that hard. I'll give it a try.

jfbourdon commented 7 months ago

Of course it couldn't be an easy fix... I'll work around that then. Thanks for the explanation.

Jean-Romain commented 4 months ago

I guess this fixed the issue: https://github.com/LAStools/LAStools/pull/175/

Jean-Romain commented 3 months ago

This commit may fix this issue