gadomski / las-rs

Read and write ASPRS las files, Rust edition.
MIT License
73 stars 32 forks source link

Las1.1 support is failing, forcing the header to 1.2 seems to work? #61

Closed TerjeWiigMathisen closed 4 months ago

TerjeWiigMathisen commented 1 year ago

This should not be merged! It is just a proof of concept that when reading a las 1.1 file fails, forcing the version field to 1.2 does work. I have spent a couple of days trying to trace into the libraries to discover where the panic on 1.1 is happening but have not been able to do so, sorry!

gadomski commented 1 year ago

Thanks, I've converted to draft and will take a look when I get a chance. Appreciate it!

gadomski commented 1 year ago

One thing to check is that the point format that you're reading is supported by las 1.1, which only allows point formats 0 and 1. One quick way to check is with PDAL:

$ pdal info tests/data/autzen.las --metadata | jq .metadata.dataformat_id  
1

If you can provide an example file that fails, that could also help debugging.

TerjeWiigMathisen commented 1 year ago

I have been using LAStools for all LiDAR stuff for about 10 years, it gived the following info for one of the failing files Las 1.1, point format 1: C:\o\mapant>lasinfo -v h:\hoydedata\59\data\32-1-472-150-76.laz reading 'h:\hoydedata\59\data\32-1-472-150-76.laz' with 5658 points lasinfo (220107) report for 'h:\hoydedata\59\data\32-1-472-150-76.laz' reporting all LAS header entries: file signature: 'LASF' file source ID: 0 global_encoding: 1 project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000 version major.minor: 1.1 system identifier: 'LAStools (c) by rapidlasso GmbH' generating software: 'lasclip (180303) commercial' file creation day/year: 42/2014 header size: 227 offset to point data: 528 number var. length records: 3 point data format: 1 point data record length: 28 number of point records: 5658 number of points by return: 3519 1682 410 47 0 scale factor x y z: 0.01 0.01 0.01 offset x y z: 0 0 0 min x y z: 326400.01 6724172.52 -0.37 max x y z: 327199.99 6724199.99 202.74 variable length header record 1 of 3: reserved 43707 user ID 'LASF_Projection' record ID 34735 length after header 64 description 'by LAStools of rapidlasso GmbH' GeoKeyDirectoryTag version 1.1.0 number of keys 7 key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected key 3072 tiff_tag_location 0 count 1 value_offset 25832 - ProjectedCSTypeGeoKey: ETRS89 / UTM 32N key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter key 3073 tiff_tag_location 34737 count 44 value_offset 0 - PCSCitationGeoKey: UTM sone 32, basert pσ EUREF89 (ETRS89/UTM) key 4096 tiff_tag_location 0 count 1 value_offset 5941 - VerticalCSTypeGeoKey: Norway Normal Null 2000 key 4099 tiff_tag_location 0 count 1 value_offset 9001 - VerticalUnitsGeoKey: Linear_Meter key 4097 tiff_tag_location 34737 count 7 value_offset 44 - VerticalCitationGeoKey: NN2000 variable length header record 2 of 3: reserved 43707 user ID 'LASF_Projection' record ID 34736 length after header 24 description 'Double Param Array' GeoDoubleParamsTag (number of doubles 3) 6.37814e+006 6.35675e+006 298.257 variable length header record 3 of 3: reserved 43707 user ID 'LASF_Projection' record ID 34737 length after header 51 description 'by LAStools of rapidlasso GmbH' GeoAsciiParamsTag (number of characters 51) UTM sone 32, basert p EUREF89 (ETRS89/UTM)|NN2000| LASzip compression (version 3.2r0 c2 50000): POINT10 2 GPSTIME11 2 reporting minimum and maximum for all LAS point record entries ... X 32640001 32719999 Y 672417252 672419999 Z -37 20274 intensity 0 2483 return_number 1 4 number_of_returns 1 4 edge_of_flight_line 0 0 scan_direction_flag 0 1 classification 1 9 scan_angle_rank -16 13 user_data 0 0 point_source_ID 9077 9081 gps_time 28967300.701887 28969704.919054 number of first returns: 3519 number of intermediate returns: 461 number of last returns: 3523 number of single returns: 1845 overview over number of returns of given pulse: 1845 2529 1095 189 0 0 0 histogram of classification of points: 3648 unclassified (1) 1461 ground (2) 30 noise (7) 519 water (9)

I'll upload a copy of this file to my server later tonight (Oslo time)

Terje PS. I also intend to see if I can extend the utm module to use the Danish formulas (developed for Greenland I think) which also works in northern Norway, far from the center line, but only when needed i.e. far north or far from the center line.

On Fri, Aug 25, 2023 at 2:46 PM Pete Gadomski @.***> wrote:

One thing to check is that the point format that you're reading is supported by las 1.1, which only allows point formats 0 and 1. One quick way to check is with PDAL https://pdal.io:

$ pdal info tests/data/autzen.las --metadata | jq .metadata.dataformat_id 1

If you can provide an example file that fails, that could also help debugging.

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-1693303929, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJUTVITNE7GBUETJU23XXCNDDANCNFSM6AAAAAA36LE2UI . You are receiving this because you authored the thread.Message ID: @.***>

TerjeWiigMathisen commented 1 year ago

The file was so tiny that I'm just including it!

On Fri, Aug 25, 2023 at 4:46 PM Terje Mathisen @.***> wrote:

I have been using LAStools for all LiDAR stuff for about 10 years, it gived the following info for one of the failing files Las 1.1, point format 1: C:\o\mapant>lasinfo -v h:\hoydedata\59\data\32-1-472-150-76.laz reading 'h:\hoydedata\59\data\32-1-472-150-76.laz' with 5658 points lasinfo (220107) report for 'h:\hoydedata\59\data\32-1-472-150-76.laz' reporting all LAS header entries: file signature: 'LASF' file source ID: 0 global_encoding: 1 project ID GUID data 1-4: 00000000-0000-0000-0000-000000000000 version major.minor: 1.1 system identifier: 'LAStools (c) by rapidlasso GmbH' generating software: 'lasclip (180303) commercial' file creation day/year: 42/2014 header size: 227 offset to point data: 528 number var. length records: 3 point data format: 1 point data record length: 28 number of point records: 5658 number of points by return: 3519 1682 410 47 0 scale factor x y z: 0.01 0.01 0.01 offset x y z: 0 0 0 min x y z: 326400.01 6724172.52 -0.37 max x y z: 327199.99 6724199.99 202.74 variable length header record 1 of 3: reserved 43707 user ID 'LASF_Projection' record ID 34735 length after header 64 description 'by LAStools of rapidlasso GmbH' GeoKeyDirectoryTag version 1.1.0 number of keys 7 key 1024 tiff_tag_location 0 count 1 value_offset 1 - GTModelTypeGeoKey: ModelTypeProjected key 3072 tiff_tag_location 0 count 1 value_offset 25832 - ProjectedCSTypeGeoKey: ETRS89 / UTM 32N key 3076 tiff_tag_location 0 count 1 value_offset 9001 - ProjLinearUnitsGeoKey: Linear_Meter key 3073 tiff_tag_location 34737 count 44 value_offset 0 - PCSCitationGeoKey: UTM sone 32, basert pσ EUREF89 (ETRS89/UTM) key 4096 tiff_tag_location 0 count 1 value_offset 5941 - VerticalCSTypeGeoKey: Norway Normal Null 2000 key 4099 tiff_tag_location 0 count 1 value_offset 9001 - VerticalUnitsGeoKey: Linear_Meter key 4097 tiff_tag_location 34737 count 7 value_offset 44 - VerticalCitationGeoKey: NN2000 variable length header record 2 of 3: reserved 43707 user ID 'LASF_Projection' record ID 34736 length after header 24 description 'Double Param Array' GeoDoubleParamsTag (number of doubles 3) 6.37814e+006 6.35675e+006 298.257 variable length header record 3 of 3: reserved 43707 user ID 'LASF_Projection' record ID 34737 length after header 51 description 'by LAStools of rapidlasso GmbH' GeoAsciiParamsTag (number of characters 51) UTM sone 32, basert p EUREF89 (ETRS89/UTM)|NN2000| LASzip compression (version 3.2r0 c2 50000): POINT10 2 GPSTIME11 2 reporting minimum and maximum for all LAS point record entries ... X 32640001 32719999 Y 672417252 672419999 Z -37 20274 intensity 0 2483 return_number 1 4 number_of_returns 1 4 edge_of_flight_line 0 0 scan_direction_flag 0 1 classification 1 9 scan_angle_rank -16 13 user_data 0 0 point_source_ID 9077 9081 gps_time 28967300.701887 28969704.919054 number of first returns: 3519 number of intermediate returns: 461 number of last returns: 3523 number of single returns: 1845 overview over number of returns of given pulse: 1845 2529 1095 189 0 0 0 histogram of classification of points: 3648 unclassified (1) 1461 ground (2) 30 noise (7) 519 water (9)

I'll upload a copy of this file to my server later tonight (Oslo time)

Terje PS. I also intend to see if I can extend the utm module to use the Danish formulas (developed for Greenland I think) which also works in northern Norway, far from the center line, but only when needed i.e. far north or far from the center line.

On Fri, Aug 25, 2023 at 2:46 PM Pete Gadomski @.***> wrote:

One thing to check is that the point format that you're reading is supported by las 1.1, which only allows point formats 0 and 1. One quick way to check is with PDAL https://pdal.io:

$ pdal info tests/data/autzen.las --metadata | jq .metadata.dataformat_id 1

If you can provide an example file that fails, that could also help debugging.

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-1693303929, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJUTVITNE7GBUETJU23XXCNDDANCNFSM6AAAAAA36LE2UI . You are receiving this because you authored the thread.Message ID: @.***>

gadomski commented 1 year ago

The file was so tiny that I'm just including it!

Hm, I don't see it attached anywhere, might have to provide a link or send some other way.

TerjeWiigMathisen commented 1 year ago

gmail claims that it was included in my outgoing mail, but you might have some security barrier somewhere?

Anyway, here it is: https://tmsw.no/o/32-1-472-150-76.laz

Regards, Terje

On Sat, Aug 26, 2023 at 4:11 PM Pete Gadomski @.***> wrote:

The file was so tiny that I'm just including it!

Hm, I don't see it attached anywhere, might have to provide a link or send some other way.

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-1694349730, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJUQOR5MPORA3TTJM5DXXH7YZANCNFSM6AAAAAA36LE2UI . You are receiving this because you authored the thread.Message ID: @.***>

gadomski commented 1 year ago

Thanks for providing the file! I think it got stripped by Github (I don't think you can attach files via email to Github comments).

It appears as though the file you provided has a GPS Time Type field set to standard, which is not supported by LAS 1.1:

$ cargo run --features laz --example count ~/Downloads/32-1-472-150-76.laz 
   Compiling las v0.8.1 (/Users/gadomski/Code/gadomski/las-rs)
    Finished dev [unoptimized + debuginfo] target(s) in 1.78s
     Running `target/debug/examples/count /Users/gadomski/Downloads/32-1-472-150-76.laz`
thread 'main' panicked at 'Unable to open reader: Feature { version: Version { major: 1, minor: 1 }, feature: "GpsStandardTime" }', examples/count.rs:12:46
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Here's the relevant info from the las 1.2 standard:

GPS Absolute Time (as well as GPS Week Time) – LAS 1.0 and LAS 1.1 specified GPS “Week Time” only. This meant that GPS time stamps “rolled over” at midnight on Saturday. This makes processing of LIDAR flight lines that span the time reset difficult. LAS 1.2 allows both GPS Week Time and Absolute GPS Time (POSIX) stamps to be used.

So las-rs is behaving correctly, if overly strictly and with poor error messages. I've opened two issues to track those problems:

Note that las-rs is unsupported+unfunded, so I can't really give a timeline when these things will be fixed -- it'll have to happen on my own time. In the meantime, your workaround of upgrading the version seems reasonable.

As an aside, you mentioned that you're using lastools -- I would recommend checking out PDAL, as it is actively maintained and comes with a lot of batteries included. It reads permissively and writes strictly, so you could use it to "fix" your files, e.g.:

pdal translate 32-1-472-150-76.laz 32-1-472-150-76.fixed.laz

Hope that helps!

TerjeWiigMathisen commented 1 year ago

Thanks a lot for responding, I did look for anything like that since I did get the helpful error message but I was unable to locate exactly where the sanity checking happens. :-(

BTW, feel free to use my personal gmail if you have anything which github won't like!

I have found another instance where it seems like las-rs is more intolerant than needed, i.e. the opposite of pdal (or LAStools): Norwegian national LiDAR coverage contains a lot of different stuff, in a 10 year old project I found a file which panics due to the contents in generating_software: The LAS specs document this field as 16 bytes in the text even though it is of course 32 as shown in the tables. It also states that any bytes past the end of the string (up to 16) must be filled with nulls, and las-rs is seems to be checking this, so it panics on a file where the string is 30 bytes long, followed by just a single null and some random garbage in the 32'nd byte.

No other LiDAR software I have seen cares about what happens to lie after the first null!

Do you want a copy of this one as well?

Regards, Terje

On Mon, Aug 28, 2023 at 2:20 PM Pete Gadomski @.***> wrote:

Thanks for providing the file! I think it got stripped by Github (I don't think you can attach files via email to Github comments).

It appears as though the file you provided has a GPS Time Type field set to standard, which is not supported by LAS 1.1:

$ cargo run --features laz --example count ~/Downloads/32-1-472-150-76.laz Compiling las v0.8.1 (/Users/gadomski/Code/gadomski/las-rs) Finished dev [unoptimized + debuginfo] target(s) in 1.78s Running target/debug/examples/count /Users/gadomski/Downloads/32-1-472-150-76.laz thread 'main' panicked at 'Unable to open reader: Feature { version: Version { major: 1, minor: 1 }, feature: "GpsStandardTime" }', examples/count.rs:12:46 note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

Here's the relevant info from the las 1.2 standard https://www.asprs.org/a/society/committees/standards/asprs_las_format_v12.pdf :

GPS Absolute Time (as well as GPS Week Time) – LAS 1.0 and LAS 1.1 specified GPS “Week Time” only. This meant that GPS time stamps “rolled over” at midnight on Saturday. This makes processing of LIDAR flight lines that span the time reset difficult. LAS 1.2 allows both GPS Week Time and Absolute GPS Time (POSIX) stamps to be used.

So las-rs is behaving correctly, if overly strictly and with poor error messages. I've opened two issues to track those problems:

Note that las-rs is unsupported+unfunded, so I can't really give a timeline when these things will be fixed -- it'll have to happen on my own time. In the meantime, your workaround of upgrading the version seems reasonable.

As an aside, you mentioned that you're using lastools -- I would recommend checking out PDAL https://pdal.io, as it is actively maintained and comes with a lot of batteries included. It reads permissively and writes strictly, so you could use it to "fix" your files, e.g.:

pdal translate 32-1-472-150-76.laz 32-1-472-150-76.fixed.laz

Hope that helps!

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-1695598618, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJSAXHA6GDJSLYQTPFDXXSEIJANCNFSM6AAAAAA36LE2UI . You are receiving this because you authored the thread.Message ID: @.***>

gadomski commented 1 year ago

No other LiDAR software I have seen cares about what happens to lie after the first null!

Yup, that's probably true ... I wrote las-rs to be strict to the spec, and I didn't really have a real-world user in mind :-). But since folks are using it, it would make sense to update the library to read more permissively.

Do you want a copy of this one as well?

Yes please!

TerjeWiigMathisen commented 1 year ago

https://tmsw.no/o/33-1-501-284-40.laz

On Mon, Aug 28, 2023 at 9:56 PM Pete Gadomski @.***> wrote:

No other LiDAR software I have seen cares about what happens to lie after the first null!

Yup, that's probably true ... I wrote las-rs to be strict to the spec, and I didn't really have a real-world user in mind :-). But since folks are using it, it would make sense to update the library to read more permissively.

Do you want a copy of this one as well?

Yes please!

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-1696312877, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJSWL7UPEF6VQM77D3LXXTZXBANCNFSM6AAAAAA36LE2UI . You are receiving this because you authored the thread.Message ID: @.***>

TerjeWiigMathisen commented 1 year ago

BTW, are you willing to point me to where in the source the validation happens? Knowing where to start I'm confident I could fix these so that meaningful data is let through, even with warnings. :-)

On Mon, Aug 28, 2023 at 10:04 PM Terje Mathisen @.***> wrote:

https://tmsw.no/o/33-1-501-284-40.laz

On Mon, Aug 28, 2023 at 9:56 PM Pete Gadomski @.***> wrote:

No other LiDAR software I have seen cares about what happens to lie after the first null!

Yup, that's probably true ... I wrote las-rs to be strict to the spec, and I didn't really have a real-world user in mind :-). But since folks are using it, it would make sense to update the library to read more permissively.

Do you want a copy of this one as well?

Yes please!

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-1696312877, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJSWL7UPEF6VQM77D3LXXTZXBANCNFSM6AAAAAA36LE2UI . You are receiving this because you authored the thread.Message ID: @.***>

gadomski commented 1 year ago

The mapping from each version to the features it supports is here: https://github.com/gadomski/las-rs/blob/bea51e5a28b2f15e1ffe8503d94c591d827deace/src/feature.rs#L75-L88

The checks happen in Builder::into_header: https://github.com/gadomski/las-rs/blob/bea51e5a28b2f15e1ffe8503d94c591d827deace/src/header/builder.rs#L176-L198

If we were going to make things permissive, we might have a boolean attribute on Builder called upgrade_version_to_support_header or something (that's a terrible name, I haven't thought much about this). Then, we would replace those self.version.verify_support_for calls with a new self.verify_support_for_or_upgrade, which would check the upgrade boolean and either return an Err or upgrade the version, w/ a warning. There might be some strange edge cases to handle, but that's how I picture things going more-or-less.

TerjeWiigMathisen commented 1 year ago

Thanks, I¨ll take a look tomorrow (it is 23:06 here in Oslo)!

On Mon, Aug 28, 2023 at 10:59 PM Pete Gadomski @.***> wrote:

The mapping from each version to the features it supports is here: https://github.com/gadomski/las-rs/blob/bea51e5a28b2f15e1ffe8503d94c591d827deace/src/feature.rs#L75-L88

The checks happen in Builder::into_header:

https://github.com/gadomski/las-rs/blob/bea51e5a28b2f15e1ffe8503d94c591d827deace/src/header/builder.rs#L176-L198

If we were going to make things permissive, we might have a boolean attribute on Builder called upgrade_version_to_support_header or something (that's a terrible name, I haven't thought much about this). Then, we would replace those self.version.verify_support_for calls with a new self.verify_support_for_or_upgrade, which would check the upgrade boolean and either return an Err or upgrade the version, w/ a warning. There might be some strange edge cases to handle, but that's how I picture things going more-or-less.

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-1696403108, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJRGBSXFLJBS6AF7CNTXXUBBVANCNFSM6AAAAAA36LE2UI . You are receiving this because you authored the thread.Message ID: @.***>

gadomski commented 4 months ago

@TerjeWiigMathisen I attempted to reproduce the "bytes after null" error for https://tmsw.no/o/33-1-501-284-40.laz, and couldn't:

$ cargo run -F laz --example count ~/Downloads/33-1-501-284-40.laz 
Number of points: 962652

Can you provide a reproducible example of the error? Thanks!

TerjeWiigMathisen commented 4 months ago

I'll try to reproduce it this weekend, no available time today.

BTW, have you noticed that PROJ has been ported to Rust?

proj4rs seems like a directport/translation of the original C:

In the transverse mercator code they use the

On Thu, May 30, 2024 at 6:54 PM Pete Gadomski @.***> wrote:

@TerjeWiigMathisen https://github.com/TerjeWiigMathisen I attempted to reproduce the "bytes after null" error for https://tmsw.no/o/33-1-501-284-40.laz, and couldn't:

$ cargo run -F laz --example count ~/Downloads/33-1-501-284-40.laz Number of points: 962652

Can you provide a reproducible example of the error? Thanks!

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-2140250966, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJXJDSVMHQXCTSV7NGDZE5KS3AVCNFSM6AAAAAA36LE2UKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBQGI2TAOJWGY . You are receiving this because you were mentioned.Message ID: @.***>

TerjeWiigMathisen commented 4 months ago

Sorry, hit send too son!

The code is at https://github.com/3liz/proj4rs and the transverse mercator code (https://github.com/3liz/proj4rs/blob/main/src/projections/tmerc.rs) uses the Poder/Ensager algorithm which I believe was originally developed to allow Denmark to map their Greenland territory. It works very well, giving positions within a mm (compared to the official Norwegian Mapping Authority) for locations in the for north-eastern corner of Norway.

On Fri, May 31, 2024 at 8:55 AM Terje Mathisen @.***> wrote:

I'll try to reproduce it this weekend, no available time today.

BTW, have you noticed that PROJ has been ported to Rust?

proj4rs seems like a directport/translation of the original C:

In the transverse mercator code they use the

On Thu, May 30, 2024 at 6:54 PM Pete Gadomski @.***> wrote:

@TerjeWiigMathisen https://github.com/TerjeWiigMathisen I attempted to reproduce the "bytes after null" error for https://tmsw.no/o/33-1-501-284-40.laz, and couldn't:

$ cargo run -F laz --example count ~/Downloads/33-1-501-284-40.laz Number of points: 962652

Can you provide a reproducible example of the error? Thanks!

— Reply to this email directly, view it on GitHub https://github.com/gadomski/las-rs/pull/61#issuecomment-2140250966, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAOAJXJDSVMHQXCTSV7NGDZE5KS3AVCNFSM6AAAAAA36LE2UKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBQGI2TAOJWGY . You are receiving this because you were mentioned.Message ID: @.***>