tilezen / joerd

Joerd can be used to download, merge and generate tiles from digital elevation data
MIT License

Bad data in 14,2519,5484 #217

Open vpicaver opened 8 months ago

vpicaver commented 8 months ago

https://s3.amazonaws.com/elevation-tiles-prod/geotiff/14/2519/5484.tif

There are two rows of bad data in this tile, shown in black in QGIS: [image]

Here's the output from gdalinfo:

    Band 1 Block=256x256 Type=Float32, ColorInterp=Gray
      Minimum=-8566.333, Maximum=2539.531, Mean=2173.470, StdDev=497.209
      NoData Value=-3.4028235e+38
      Metadata:
        STATISTICS_MAXIMUM=2539.5305175781
        STATISTICS_MEAN=2173.4702906613
        STATISTICS_MINIMUM=-8566.3330078125
        STATISTICS_STDDEV=497.20867381496
        STATISTICS_VALID_PERCENT=100

The same problem exists in neighboring tiles. The tile shouldn't contain -8566; those pixels should be either nodata or the correct values.

nvkelso commented 8 months ago

There was a GDAL config error related to NODATA that manifests where two or more sources touch in this old build. To work around it you'd need to do light pre-processing on the tile data to scrub nonsensical data like this.
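As a rough illustration of what such a scrub could look like (a minimal sketch, not something taken from the joerd build): values outside a plausible elevation range for the area are replaced with the tile's nodata value. The function name `scrub_out_of_range` and the bounds `lo` and `hi` are hypothetical; the bounds have to be chosen per region, and the tile is assumed to have already been read into rows of floats with whatever raster library you use.

```python
# NoData value as reported by gdalinfo for this tile.
NODATA = -3.4028235e+38

def scrub_out_of_range(rows, lo, hi, nodata=NODATA):
    """Return a copy of `rows` with values outside [lo, hi] set to nodata.

    `lo` and `hi` are assumptions about the local terrain, in meters.
    NaN fails both comparisons, so it is replaced as well.
    """
    return [[v if lo <= v <= hi else nodata for v in row] for row in rows]

# Tiny stand-in for a tile: the middle row carries the bad value.
tile = [
    [2173.5, 2180.0, 2190.2],
    [-8566.333, -8566.333, -8566.333],
    [2175.1, 2181.4, 2188.9],
]
cleaned = scrub_out_of_range(tile, lo=1000.0, hi=3000.0)
```

A fixed range only catches values that are implausible for the area, so a bad row that still falls inside the range would pass through untouched.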

Ideally there would be another global build to fix this, but it needs resourcing of engineering time and compute.

vpicaver commented 8 months ago

Bad data also exists in 13 1258 2742, related to the row of data. In 13, 2358, 2742 the elevation data is -6872m.

In 14, 2519, 5484 there are two invalid rows of data, with values of -8566.33 m and 82.75 m.

What would you suggest for doing reliable "light pre-processing"? Currently, I'm using a z-score to throw out outliers. Do you know of a better method?
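One possible alternative, offered as a sketch rather than a recommendation from the maintainers: because the bad data comes in whole rows, each row's median can be compared against the other rows using a modified z-score built on the median and the median absolute deviation (MAD). Unlike a plain z-score, the center and scale are not pulled around by the bad values themselves. The function name `find_bad_rows`, the `min_scale` floor, and the sample values are hypothetical; 0.6745 and 3.5 are the conventional constants for the modified z-score.

```python
from statistics import median

def find_bad_rows(rows, threshold=3.5, min_scale=1.0):
    """Return indices of rows whose median is a robust outlier.

    Each row is summarized by its median, then scored with the modified
    z-score 0.6745 * |m - center| / MAD. `min_scale` (meters, an assumption)
    floors the MAD so flat tiles neither divide by zero nor flag tiny
    deviations.
    """
    row_medians = [median(r) for r in rows]
    center = median(row_medians)
    mad = median(abs(m - center) for m in row_medians)
    scale = max(mad, min_scale)
    return [i for i, m in enumerate(row_medians)
            if 0.6745 * abs(m - center) / scale > threshold]

# Stand-in tile: rows 2 and 3 hold the two bad values from this issue.
tile = [
    [2170.0, 2171.0, 2169.0],
    [2175.0, 2176.0, 2174.0],
    [-8566.33, -8566.33, -8566.33],
    [82.75, 82.75, 82.75],
    [2180.0, 2181.0, 2179.0],
    [2185.0, 2186.0, 2184.0],
    [2178.0, 2179.0, 2177.0],
    [2172.0, 2173.0, 2171.0],
]
print(find_bad_rows(tile))  # [2, 3]
```

In steep terrain the row medians vary a lot within a tile, so comparing each row to its immediate neighbors instead of the whole tile may work better; the threshold would need tuning against real tiles either way.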