hobuinc / usgs-lidar

AWS Entwine Point Tiles USGS LiDAR Public Dataset GitHub repo
https://registry.opendata.aws/usgs-lidar/

Duplicate dataset: USGS_LPC_MN_Phase1_FairbaultCO_2010_LAS_2016 #70

Open mattbeckley opened 6 months ago

mattbeckley commented 6 months ago

There are two versions of this dataset: USGS_LPC_MN_Phase1_FairbaultCO_2010_LAS_2016 and USGS_LPC_MN_Phase1_FairbaultCo_2010_LAS_2016

Note that the only difference is "CO" vs. "Co" in the dataset name. I diffed the top-level entwine directories and the ept-data subdirectories, and the only difference appears to be slightly different timestamps on the laz files. I spot-checked a couple of the laz files and they were identical.
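A comparison like the one described above can be scripted so that every file is checked rather than a spot-checked sample. The following is a minimal sketch, assuming both datasets have already been copied to local directories; the function names (`file_digest`, `tree_digests`, `compare_trees`) are hypothetical and not part of any tool mentioned in this thread. It compares file contents by SHA-256 digest, so timestamp differences are ignored.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large laz files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its content digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(a: Path, b: Path) -> dict[str, list[str]]:
    """Report files missing from either tree and files whose contents differ.

    Modification times are never consulted, only paths and contents.
    """
    da, db = tree_digests(a), tree_digests(b)
    return {
        "only_in_a": sorted(da.keys() - db.keys()),
        "only_in_b": sorted(db.keys() - da.keys()),
        "content_differs": sorted(
            k for k in da.keys() & db.keys() if da[k] != db[k]
        ),
    }
```

Calling `compare_trees` on the two local copies would return three empty lists if the datasets are byte-for-byte identical apart from timestamps.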

keythread commented 6 months ago

Yes, this is a duplicate, and it appears to have been present since the initial EPT copy: both WorkUnits were copied on 12/30/2018. I will be recommending removal of USGS_LPC_MN_Phase1_FairbaultCo_2010_LAS_2016 from the public dataset while retaining USGS_LPC_MN_Phase1_FairbaultCO_2010_LAS_2016, because the latter name better matches the current name in the WESM, which uses the upper-case 'CO'. Other duplicates exist as well and will also need to be deleted in the future to save cost and avoid confusion.
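Duplicates of this particular kind, where two names differ only in letter case, can be found mechanically from a list of dataset names. The sketch below is illustrative only: the function name `case_insensitive_duplicates` is hypothetical, and how the list of names is obtained is left as an assumption. It would not detect duplicates that are published under otherwise different names.

```python
from collections import defaultdict


def case_insensitive_duplicates(names):
    """Group names that are identical once letter case is ignored.

    Returns one sorted list per group that contains more than one name.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# The two names reported in this issue.
names = [
    "USGS_LPC_MN_Phase1_FairbaultCO_2010_LAS_2016",
    "USGS_LPC_MN_Phase1_FairbaultCo_2010_LAS_2016",
]
print(case_insensitive_duplicates(names))
```

Each returned group is a candidate for manual review against the WESM before anything is deleted.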

hobu commented 6 months ago

> I will be recommending removal of USGS_LPC_MN_Phase1_FairbaultCo_2010_LAS_2016 from the public dataset

~I will queue this up tonight~.

keythread commented 6 months ago

You may want to wait. NOAA expressed some concerns that they may need time to adjust their indexes before we do actual deletes.