r-lidar / lasR

Fast and Pipable Airborne LiDAR Data Tools
https://r-lidar.github.io/lasR/
GNU General Public License v3.0
43 stars 0 forks source link

processing times multiply if using `write_vpc()` with `set_crs()` #51

Closed wiesehahn closed 2 weeks ago

wiesehahn commented 2 weeks ago

When assigning a CRS via set_crs() and then write_vpc() I get quite long computation times although the documentation states that

This stage does not reproject the data. It assigns a CRS.

And hence I expected pipelines to be more or less equally fast with or without this stage.

(see initial mention in https://github.com/r-lidar/lasR/discussions/32)

I tested this on multiple files and it occurs on data of multiple different campaigns (although not with testdata).

Processing time seems to be dependent on file size, is this expected (I guess just the header is read and thus I would not expect big differences)?

It does not occur on testdata

f <- system.file("extdata", "MixedConifer.las", package = "lasR")
fileheader <- lidR::readLASheader(f)
lidR::st_crs(fileheader)
#> Coordinate Reference System:
#>   User input: EPSG:26912 
#>   wkt:
#> PROJCRS["NAD83 / UTM zone 12N",
#>     BASEGEOGCRS["NAD83",
#>         DATUM["North American Datum 1983",
#>             ELLIPSOID["GRS 1980",6378137,298.257222101,
#>                 LENGTHUNIT["metre",1]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",4269]],
#>     CONVERSION["UTM zone 12N",
#>         METHOD["Transverse Mercator",
#>             ID["EPSG",9807]],
#>         PARAMETER["Latitude of natural origin",0,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8801]],
#>         PARAMETER["Longitude of natural origin",-111,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8802]],
#>         PARAMETER["Scale factor at natural origin",0.9996,
#>             SCALEUNIT["unity",1],
#>             ID["EPSG",8805]],
#>         PARAMETER["False easting",500000,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8806]],
#>         PARAMETER["False northing",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8807]]],
#>     CS[Cartesian,2],
#>         AXIS["(E)",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1]],
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1]],
#>     USAGE[
#>         SCOPE["Engineering survey, topographic mapping."],
#>         AREA["North America - between 114°W and 108°W - onshore and offshore. Canada - Alberta; Northwest Territories; Nunavut; Saskatchewan. United States (USA) - Arizona; Colorado; Idaho; Montana; New Mexico; Utah; Wyoming."],
#>         BBOX[31.33,-114,84,-108]],
#>     ID["EPSG",26912]]
ans <- benchmark(lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 0.09346604 secs
ans <- benchmark(lasR::set_crs(25832) + lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 0.110028 secs

E.g. for a file with no CRS assigned (136MB)

f <- "/lasfilez_557000_5754000_laz.laz"
fileheader <- lidR::readLASheader(f)
lidR::st_crs(fileheader)
#> Coordinate Reference System: NA
ans <- benchmark(lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Error: in 'write_vpc' while processing the catalog: Invalid CRS. Cannot write a VPC file
ans <- benchmark(lasR::set_crs(25832) + lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 25.08738 secs

another file with no CRS assigned (774MB)

f <- "/1km_592_5942.laz"
fileheader <- lidR::readLASheader(f)
lidR::st_crs(fileheader)
#> Coordinate Reference System: NA
ans <- benchmark(lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Error: in 'write_vpc' while processing the catalog: Invalid CRS. Cannot write a VPC file
ans <- benchmark(lasR::set_crs(25832) + lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 2.150083 mins

and for the same file when setting CRS before and writing to file

f <- "/1km_592_5942.laz"
las <- lidR::readLAS(f)
lidR::st_crs(las)
#> Coordinate Reference System: NA
lidR::st_crs(las) <- 25832
f <- lidR::writeLAS(las, tempfile(fileext = ".laz"))
fileheader <- lidR::readLASheader(f)
lidR::st_crs(fileheader)
#> Coordinate Reference System:
#>   User input: EPSG:25832 
#>   wkt:
#> PROJCRS["ETRS89 / UTM zone 32N",
#>     BASEGEOGCRS["ETRS89",
#>         ENSEMBLE["European Terrestrial Reference System 1989 ensemble",
#>             MEMBER["European Terrestrial Reference Frame 1989"],
#>             MEMBER["European Terrestrial Reference Frame 1990"],
#>             MEMBER["European Terrestrial Reference Frame 1991"],
#>             MEMBER["European Terrestrial Reference Frame 1992"],
#>             MEMBER["European Terrestrial Reference Frame 1993"],
#>             MEMBER["European Terrestrial Reference Frame 1994"],
#>             MEMBER["European Terrestrial Reference Frame 1996"],
#>             MEMBER["European Terrestrial Reference Frame 1997"],
#>             MEMBER["European Terrestrial Reference Frame 2000"],
#>             MEMBER["European Terrestrial Reference Frame 2005"],
#>             MEMBER["European Terrestrial Reference Frame 2014"],
#>             ELLIPSOID["GRS 1980",6378137,298.257222101,
#>                 LENGTHUNIT["metre",1]],
#>             ENSEMBLEACCURACY[0.1]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",4258]],
#>     CONVERSION["UTM zone 32N",
#>         METHOD["Transverse Mercator",
#>             ID["EPSG",9807]],
#>         PARAMETER["Latitude of natural origin",0,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8801]],
#>         PARAMETER["Longitude of natural origin",9,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8802]],
#>         PARAMETER["Scale factor at natural origin",0.9996,
#>             SCALEUNIT["unity",1],
#>             ID["EPSG",8805]],
#>         PARAMETER["False easting",500000,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8806]],
#>         PARAMETER["False northing",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8807]]],
#>     CS[Cartesian,2],
#>         AXIS["(E)",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1]],
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1]],
#>     USAGE[
#>         SCOPE["Engineering survey, topographic mapping."],
#>         AREA["Europe between 6°E and 12°E: Austria; Belgium; Denmark - onshore and offshore; Germany - onshore and offshore; Norway including - onshore and offshore; Spain - offshore."],
#>         BBOX[38.76,6,84.33,12.01]],
#>     ID["EPSG",25832]]
ans <- benchmark(lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 0.1311841 secs
ans <- benchmark(lasR::set_crs(25832) + lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 1.221565 mins

file with CRS assigned (410MB)

f <- "/2024_01.1_solling_32_547_5728.laz"
fileheader <- lidR::readLASheader(f)
lidR::st_crs(fileheader)
#> Coordinate Reference System:
#>   User input: PROJCRS["ETRS89 / UTM zone 32N",BASEGEOGCRS["ETRS89",ENSEMBLE["European Terrestrial Reference System 1989 ensemble", MEMBER["European Terrestrial Reference Frame 1989", ID["EPSG",1178]], MEMBER["European Terrestrial Reference Frame 1990", ID["EPSG",1179]], MEMBER["European Terrestrial Reference Frame 1991", ID["EPSG",1180]], MEMBER["European Terrestrial Reference Frame 1992", ID["EPSG",1181]], MEMBER["European Terrestrial Reference Frame 1993", ID["EPSG",1182]], MEMBER["European Terrestrial Reference Frame 1994", ID["EPSG",1183]], MEMBER["European Terrestrial Reference Frame 1996", ID["EPSG",1184]], MEMBER["European Terrestrial Reference Frame 1997", ID["EPSG",1185]], MEMBER["European Terrestrial Reference Frame 2000", ID["EPSG",1186]], MEMBER["European Terrestrial Reference Frame 2005", ID["EPSG",1204]], MEMBER["European Terrestrial Reference Frame 2014", ID["EPSG",1206]], ELLIPSOID["GRS 1980",6378137,298.2572221,LENGTHUNIT["metre",1,ID["EPSG",9001]],ID["EPSG",7019]], ENSEMBLEACCURACY[0.1],ID["EPSG",6258]],ID["EPSG",4258]],CONVERSION["UTM zone 32N",METHOD["Transverse Mercator",ID["EPSG",9807]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433,ID["EPSG",9102]]],PARAMETER["Longitude of natural origin",9,ANGLEUNIT["degree",0.0174532925199433,ID["EPSG",9102]]],PARAMETER["Scale factor at natural origin",0.9996,SCALEUNIT["unity",1,ID["EPSG",9201]]],PARAMETER["False easting",500000,LENGTHUNIT["metre",1,ID["EPSG",9001]]],PARAMETER["False northing",0,LENGTHUNIT["metre",1,ID["EPSG",9001]]],ID["EPSG",16032]],CS[Cartesian,2,ID["EPSG",4400]],AXIS["Easting (E)",east],AXIS["Northing (N)",north],LENGTHUNIT["metre",1,ID["EPSG",9001]],ID["EPSG",25832]] 
#>   wkt:
#> PROJCRS["ETRS89 / UTM zone 32N",
#>     BASEGEOGCRS["ETRS89",
#>         ENSEMBLE["European Terrestrial Reference System 1989 ensemble",
#>             MEMBER["European Terrestrial Reference Frame 1989"],
#>             MEMBER["European Terrestrial Reference Frame 1990"],
#>             MEMBER["European Terrestrial Reference Frame 1991"],
#>             MEMBER["European Terrestrial Reference Frame 1992"],
#>             MEMBER["European Terrestrial Reference Frame 1993"],
#>             MEMBER["European Terrestrial Reference Frame 1994"],
#>             MEMBER["European Terrestrial Reference Frame 1996"],
#>             MEMBER["European Terrestrial Reference Frame 1997"],
#>             MEMBER["European Terrestrial Reference Frame 2000"],
#>             MEMBER["European Terrestrial Reference Frame 2005"],
#>             MEMBER["European Terrestrial Reference Frame 2014"],
#>             ELLIPSOID["GRS 1980",6378137,298.2572221,
#>                 LENGTHUNIT["metre",1]],
#>             ENSEMBLEACCURACY[0.1]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",4258]],
#>     CONVERSION["UTM zone 32N",
#>         METHOD["Transverse Mercator",
#>             ID["EPSG",9807]],
#>         PARAMETER["Latitude of natural origin",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         PARAMETER["Longitude of natural origin",9,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         PARAMETER["Scale factor at natural origin",0.9996,
#>             SCALEUNIT["unity",1]],
#>         PARAMETER["False easting",500000,
#>             LENGTHUNIT["metre",1]],
#>         PARAMETER["False northing",0,
#>             LENGTHUNIT["metre",1]]],
#>     CS[Cartesian,2],
#>         AXIS["(E)",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1]],
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1]],
#>     ID["EPSG",25832]]
ans <- benchmark(lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 0.101542 secs
ans <- benchmark(lasR::set_crs(25832) + lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 1.195718 mins

file with CRS assigned (575MB)

f <- "/03_laz_ground_noise/2023_06_solling_32_533_5727.laz"
fileheader <- lidR::readLASheader(f)
lidR::st_crs(fileheader)
#> Coordinate Reference System:
#>   User input: COMPD_CS["ETRS89 / ETRS89 / UTM 32N + DHHN2016",PROJCS["ETRS89 / ETRS89 / UTM 32N",GEOGCS["ETRS89",DATUM["European_Terrestrial_Reference_System_1989",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6258"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4258"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",9],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","25832"]],VERT_CS["DHHN2016",VERT_DATUM["Deutsches Haupthoehennetz 2016",2005,AUTHORITY["EPSG","7837"]],UNIT["metre",1.0,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP],AUTHORITY["EPSG","7837"]]] 
#>   wkt:
#> COMPOUNDCRS["ETRS89 / ETRS89 / UTM 32N + DHHN2016",
#>     PROJCRS["ETRS89 / ETRS89 / UTM 32N",
#>         BASEGEOGCRS["ETRS89",
#>             DATUM["European Terrestrial Reference System 1989",
#>                 ELLIPSOID["GRS 1980",6378137,298.257222101,
#>                     LENGTHUNIT["metre",1]]],
#>             PRIMEM["Greenwich",0,
#>                 ANGLEUNIT["degree",0.0174532925199433]],
#>             ID["EPSG",4258]],
#>         CONVERSION["UTM zone 32N",
#>             METHOD["Transverse Mercator",
#>                 ID["EPSG",9807]],
#>             PARAMETER["Latitude of natural origin",0,
#>                 ANGLEUNIT["degree",0.0174532925199433],
#>                 ID["EPSG",8801]],
#>             PARAMETER["Longitude of natural origin",9,
#>                 ANGLEUNIT["degree",0.0174532925199433],
#>                 ID["EPSG",8802]],
#>             PARAMETER["Scale factor at natural origin",0.9996,
#>                 SCALEUNIT["unity",1],
#>                 ID["EPSG",8805]],
#>             PARAMETER["False easting",500000,
#>                 LENGTHUNIT["metre",1],
#>                 ID["EPSG",8806]],
#>             PARAMETER["False northing",0,
#>                 LENGTHUNIT["metre",1],
#>                 ID["EPSG",8807]]],
#>         CS[Cartesian,2],
#>             AXIS["easting",east,
#>                 ORDER[1],
#>                 LENGTHUNIT["metre",1]],
#>             AXIS["northing",north,
#>                 ORDER[2],
#>                 LENGTHUNIT["metre",1]],
#>         ID["EPSG",25832]],
#>     VERTCRS["DHHN2016",
#>         VDATUM["Deutsches Haupthoehennetz 2016"],
#>         CS[vertical,1],
#>             AXIS["gravity-related height",up,
#>                 LENGTHUNIT["metre",1]],
#>         ID["EPSG",7837]]]
ans <- benchmark(lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 0.1659181 secs
ans <- benchmark(lasR::set_crs(25832) + lasR::write_vpc(tempfile(fileext = ".vpc")), data = f)
#> Time difference of 1.675962 mins

same file as above but with summarise instead of write_vpc in the pipeline the time difference is smaller (its faster with set_crs)

f <- "/03_laz_ground_noise/2023_06_solling_32_533_5727.laz"
fileheader <- lidR::readLASheader(f)
lidR::st_crs(fileheader)
#> Coordinate Reference System:
#>   User input: COMPD_CS["ETRS89 / ETRS89 / UTM 32N + DHHN2016",PROJCS["ETRS89 / ETRS89 / UTM 32N",GEOGCS["ETRS89",DATUM["European_Terrestrial_Reference_System_1989",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6258"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4258"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",9],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","25832"]],VERT_CS["DHHN2016",VERT_DATUM["Deutsches Haupthoehennetz 2016",2005,AUTHORITY["EPSG","7837"]],UNIT["metre",1.0,AUTHORITY["EPSG","9001"]],AXIS["Gravity-related height",UP],AUTHORITY["EPSG","7837"]]] 
#>   wkt:
#> COMPOUNDCRS["ETRS89 / ETRS89 / UTM 32N + DHHN2016",
#>     PROJCRS["ETRS89 / ETRS89 / UTM 32N",
#>         BASEGEOGCRS["ETRS89",
#>             DATUM["European Terrestrial Reference System 1989",
#>                 ELLIPSOID["GRS 1980",6378137,298.257222101,
#>                     LENGTHUNIT["metre",1]]],
#>             PRIMEM["Greenwich",0,
#>                 ANGLEUNIT["degree",0.0174532925199433]],
#>             ID["EPSG",4258]],
#>         CONVERSION["UTM zone 32N",
#>             METHOD["Transverse Mercator",
#>                 ID["EPSG",9807]],
#>             PARAMETER["Latitude of natural origin",0,
#>                 ANGLEUNIT["degree",0.0174532925199433],
#>                 ID["EPSG",8801]],
#>             PARAMETER["Longitude of natural origin",9,
#>                 ANGLEUNIT["degree",0.0174532925199433],
#>                 ID["EPSG",8802]],
#>             PARAMETER["Scale factor at natural origin",0.9996,
#>                 SCALEUNIT["unity",1],
#>                 ID["EPSG",8805]],
#>             PARAMETER["False easting",500000,
#>                 LENGTHUNIT["metre",1],
#>                 ID["EPSG",8806]],
#>             PARAMETER["False northing",0,
#>                 LENGTHUNIT["metre",1],
#>                 ID["EPSG",8807]]],
#>         CS[Cartesian,2],
#>             AXIS["easting",east,
#>                 ORDER[1],
#>                 LENGTHUNIT["metre",1]],
#>             AXIS["northing",north,
#>                 ORDER[2],
#>                 LENGTHUNIT["metre",1]],
#>         ID["EPSG",25832]],
#>     VERTCRS["DHHN2016",
#>         VDATUM["Deutsches Haupthoehennetz 2016"],
#>         CS[vertical,1],
#>             AXIS["gravity-related height",up,
#>                 LENGTHUNIT["metre",1]],
#>         ID["EPSG",7837]]]
ans <- benchmark(lasR::summarise(), data = f)
#> Time difference of 1.434613 mins
ans <- benchmark(lasR::set_crs(25832) + lasR::summarise(), data = f)
#> Time difference of 1.14472 mins
Jean-Romain commented 2 weeks ago

Fixed.

Some explainations: your pipeline is reader_las() + write_vpc() (even if you omitted reader_las()). reader_las() always reads the header and reads the point cloud if necessary. If necessary is determined by the other stages and write_vpc() is labelled internally such as the point cloud is not necessary so the file is not read in this pipleine. However set_crs() was incorrectly flagged and consequently reader_las() read the point cloud in the pipeline reader_las() + set_crs() + write_vpc() because set_crs() stated that the points were required