fMoW / dataset


Corrupted Images #19

Open Erotemic opened 2 months ago

I've found two corrupted images in the copy of the dataset hosted on AWS. The following minimal working example (MWE) reproduces the problem:

aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full/train/helipad/helipad_373/helipad_373_3_rgb.tif train-bad-helipad.tif

aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full/val/helipad/helipad_107/helipad_107_3_rgb.tif val-bad-helipad.tif

gdalinfo val-bad-helipad.tif
gdalinfo train-bad-helipad.tif

Results in:

gdalinfo failed - unable to open 'val-bad-helipad.tif'.
gdalinfo failed - unable to open 'train-bad-helipad.tif'.

A more detailed MWE using the GDAL Python bindings:

python -c "from osgeo import gdal; gdal.UseExceptions(); gdal.Open('val-bad-helipad.tif')"
python -c "from osgeo import gdal; gdal.UseExceptions(); gdal.Open('train-bad-helipad.tif')"

Results in:

RuntimeError: val-bad-helipad.tif: MissingRequired:TIFF directory is missing required "StripOffsets" field
RuntimeError: train-bad-helipad.tif: MissingRequired:TIFF directory is missing required "StripOffsets" field
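
For anyone who wants to screen other files for the same symptom without a GDAL install, here is a rough sketch that reads the first TIFF directory (IFD) with the standard library only and reports whether it contains a StripOffsets (tag 273) or TileOffsets (tag 324) entry. The function names `first_ifd_tags` and `has_image_data_offsets` are made up for this example, and the sketch assumes classic (non-BigTIFF) files. It is a structural check only and not a replacement for actually opening the file with GDAL.

```python
import struct

STRIP_OFFSETS = 273  # TIFF tag ID for StripOffsets
TILE_OFFSETS = 324   # TIFF tag ID for TileOffsets


def first_ifd_tags(path):
    """Return the set of tag IDs found in the first IFD of a classic TIFF.

    Hypothetical helper for illustration; BigTIFF (magic 43) is not handled.
    """
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) < 8:
            raise ValueError("file too short for a TIFF header")
        # The first two bytes give the byte order of the whole file.
        if header[:2] == b"II":
            endian = "<"
        elif header[:2] == b"MM":
            endian = ">"
        else:
            raise ValueError("not a TIFF file (bad byte-order mark)")
        magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            raise ValueError("unsupported magic number %d" % magic)
        # The IFD starts with a 2-byte entry count, followed by 12-byte entries.
        f.seek(ifd_offset)
        raw = f.read(2)
        if len(raw) < 2:
            raise ValueError("IFD offset points past end of file")
        (n_entries,) = struct.unpack(endian + "H", raw)
        entries = f.read(12 * n_entries)
        if len(entries) < 12 * n_entries:
            raise ValueError("truncated IFD")
        # The tag ID is the first 2 bytes of each entry.
        return {
            struct.unpack_from(endian + "H", entries, 12 * i)[0]
            for i in range(n_entries)
        }


def has_image_data_offsets(path):
    """True if the first IFD lists StripOffsets or TileOffsets."""
    tags = first_ifd_tags(path)
    return STRIP_OFFSETS in tags or TILE_OFFSETS in tags
```

Calling `has_image_data_offsets("val-bad-helipad.tif")` would be one way to flag files like the two above before handing them to GDAL. This has not been verified against those files, and a file that passes this check can still be corrupted in other ways.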