connormanning / entwine

Entwine - point cloud organization for massive datasets
https://entwine.io

entwine and untwine failing for a small dataset #321

Closed kjwaters closed 9 months ago

kjwaters commented 9 months ago

I'm having issues with entwine and untwine on a particular small dataset. The files (in geographic coordinates) are at https://noaa-nos-coastal-lidar-pds.s3.amazonaws.com/laz/geoid12b/10072/index.html, or as a STAC catalog at https://noaa-nos-coastal-lidar-pds.s3.amazonaws.com/laz/geoid12b/10072/stac/catalog.json. I'm looking for any suggestions on what might be the issue.

When I try to build with entwine after reprojecting to UTM zone 3, everything seems to work as it should, except that the 0-0-0-0.laz tile has zero size. That causes an error when trying to pull the data with PDAL or Potree.
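As a sketch of how the empty root node can be spotted on disk (the ept-data path assumes the standard EPT directory layout and is not taken from the report above):

# The root node file of the EPT output; here it comes back with zero size
ls -l <output-dir>/ept-data/0-0-0-0.laz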

When I try to build with untwine, providing an output srs of EPSG:6332, I get a segmentation fault. The command and output look like:

untwine --files=input_dir --output_dir=/san1/dem1_z/untwine/10072 --a_srs=EPSG:6332 --single_file=true

Exception: Failure writing to '/san1/dem1_z/untwine/10072'.
Exception: Failure writing to '/san1/dem1_z/untwine/10072'.
Exception: Failure writing to '/san1/dem1_z/untwine/10072'.
Segmentation fault (core dumped)

The untwine version is 1.1.0. The disk for the output has over 8 TB free, and this is a small dataset. The output directory is owned and writable by the user running untwine. As far as I can tell, the input files have the expected bounds (no stray points at 0,0 and such).
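For reference, a minimal way to do that kind of header check with PDAL, assuming the input directory from the command above and a shallow, header-only read of each file:

# Report header bounds and SRS for each input tile
for f in input_dir/*.laz; do
    echo "== $f =="
    pdal info --summary "$f"
done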

connormanning commented 9 months ago

This happens because a geographic coordinate system is being reprojected to a projected one, but the EPT metadata comes from a shallow scan of the geographic data, with the bounds determined by reprojecting only the corners - which doesn't work properly between geographic and projected systems. Conceptually the issue is similar to the one fixed in pdal/pdal#4207; however, Entwine doesn't have that sort of special logic. What ends up happening in Entwine is that all the points are discarded as out-of-bounds because the computed metadata is no good to start with.
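A minimal sketch of how to see this, assuming the standard ept.json keys and a scratch per-point reprojection with PDAL (the paths are placeholders, not from this thread):

# Bounds Entwine recorded in the EPT metadata (derived from the shallow scan)
jq '.bounds, .boundsConforming' <output-dir>/ept.json

# Bounds of the same data after an actual per-point reprojection to EPSG:6332
pdal translate <input.laz> /tmp/reprojected.laz reprojection --filters.reprojection.out_srs="EPSG:6332"
pdal info --summary /tmp/reprojected.laz

If the metadata bounds don't cover the reprojected points, every point gets rejected as out-of-bounds during the build.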

To fix it, you can use the --deep flag to do a full per-point read of the data rather than a shallow header-only scan. This will take longer since all the data needs to be read instead of just the LAS header. But I was able to successfully view your data with:

entwine build -i <input.laz> -o <output-dir> -r EPSG:6332 --deep
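As a quick follow-up check of the resulting index (assuming PDAL infers readers.ept from the local ept.json path; file names are placeholders):

# Pull the points back out of the EPT index and confirm a non-zero count
pdal translate <output-dir>/ept.json check.laz
pdal info --summary check.laz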

I have nothing to say regarding Untwine's behavior; you could submit a ticket there if needed. I would definitely suspect that the issue is an assumption that the coordinate system is non-geographic, but I think Untwine's position would be that you should reproject your data to something non-geographic ahead of time.
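A rough sketch of that workaround, using PDAL to reproject each tile before running Untwine; the directory names mirror the command earlier in this thread and the loop is purely illustrative:

# Reproject every input tile to EPSG:6332 ahead of time
mkdir -p input_utm
for f in input_dir/*.laz; do
    pdal translate "$f" "input_utm/$(basename "$f")" reprojection --filters.reprojection.out_srs="EPSG:6332"
done

# Then build from the already-projected files; --a_srs is omitted since the data now carries its SRS
untwine --files=input_utm --output_dir=/san1/dem1_z/untwine/10072 --single_file=true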