OpenDataAnalytics / gaia

Gaia is a geospatial analysis library jointly developed by Kitware and Epidemico.

Dead kernels reading certain raster files #85

Open geordgez opened 7 years ago

geordgez commented 7 years ago

Note that this could be an issue with my local machine/environment since the tests using the same datasets appear to be fine.

In gaia.geo.geo_inputs, calling the read method of a RasterFileIO object on larger raster files kills my IPython Notebook kernel, e.g. globalprecip below (~300 KB):

from gaia.geo.geo_inputs import RasterFileIO

globalprecip = RasterFileIO(uri='../../tests/data/globalprecip.tif')
gp_ar = globalprecip.read()  # running this line kills the kernel

I don't run into any problems with '../../tests/data/globalairtemp.tif' (~100 KB) or with smaller files.
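One way to narrow this down without losing the notebook kernel each time is to run the failing read in a separate interpreter and inspect how that process exits. The following is only a sketch under assumptions: `run_isolated` and `READ_SNIPPET` are hypothetical names introduced here, and the import path and data path are copied from the report above, so they may need adjusting locally.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run a Python snippet in a child interpreter and report how it exited.

    Returns (returncode, stdout, stderr). On POSIX, a negative return code
    means the child was terminated by a signal (e.g. a segfault).
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Mirrors the failing call from this issue. The module path and the relative
# data path come from the report and are assumptions about the local setup.
READ_SNIPPET = """
from gaia.geo.geo_inputs import RasterFileIO
raster = RasterFileIO(uri='../../tests/data/globalprecip.tif')
raster.read()
print('read ok')
"""

if __name__ == "__main__":
    returncode, out, err = run_isolated(READ_SNIPPET)
    print("return code:", returncode)
    print("stdout:", out)
    print("stderr:", err)
```

If the child dies the same way the kernel does, the return code and anything written to stderr are at least captured, and the notebook itself stays alive.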

geordgez commented 7 years ago

Upon further investigation, the dead kernel occurs along with a warning message in the terminal:

TIFFReadDirectory: Warning, Unknown field with tag 42113 (0xa481) encountered.

The Aware Systems TIFF Tag Reference describes private tag 42113, the tag named in the warning above, as follows:

42113 A481 GDAL_NODATA Used by the GDAL library, contains an ASCII encoded nodata or background pixel value.

Full description:

Used by the GDAL library, contains an ASCII encoded nodata or background pixel value.
In the geospatial image processing field especially (and in other fields) it is common to use a special pixel value to mark geospatial areas for which no information is available. This is often called a "nodata" or "background" value. Applications often treat these pixels as transparent and they are often not included in spatial statistics for the image. Non-geospatial applications might still use the nodata value to track a special value that should be treated as transparent since currently TIFF palettes don't include an alpha value.
The GDAL_NODATA tag is intended to keep track of what pixel value is being used for this background nodata value. It is ASCII encoded so that if the pixel value 255 was to be treated as nodata, then the tag would have the value "255".
If this tag is absent there is assumed to be no nodata value in effect for the image. If the image has more than one sample it is assumed that all samples have the same nodata value.
This tag is currently only supported by the GDAL library.
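To check whether a given file actually carries this tag, and what value it holds, without going through GDAL at all, the first image file directory (IFD) of the TIFF can be scanned by hand. This is a minimal sketch, assuming a classic (non-BigTIFF) file and looking only at the first IFD; `read_gdal_nodata` and `build_minimal_tiff` are hypothetical helpers written for this example, and the synthetic bytes built in the demo are a bare header plus one IFD entry, not a valid image.

```python
import struct

GDAL_NODATA_TAG = 42113  # 0xA481, the tag named in the warning
TIFF_ASCII = 2           # TIFF field type for NUL-terminated ASCII


def read_gdal_nodata(data):
    """Return the GDAL_NODATA string from the first IFD of classic TIFF bytes.

    Returns None if the tag is absent. BigTIFF and later IFDs are not handled.
    """
    if data[:2] == b"II":
        endian = "<"   # little-endian file
    elif data[:2] == b"MM":
        endian = ">"   # big-endian file
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF")
    (n_entries,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(n_entries):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, ftype, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag != GDAL_NODATA_TAG or ftype != TIFF_ASCII:
            continue
        if count <= 4:
            # Values of up to 4 bytes are stored inline in the entry.
            raw = data[start + 8:start + 8 + count]
        else:
            # Longer values live elsewhere; the entry holds their offset.
            (offset,) = struct.unpack(endian + "I", data[start + 8:start + 12])
            raw = data[offset:offset + count]
        return raw.rstrip(b"\x00").decode("ascii")
    return None


def build_minimal_tiff(nodata):
    """Build synthetic little-endian TIFF bytes holding only a GDAL_NODATA tag."""
    value = nodata.encode("ascii") + b"\x00"
    header = struct.pack("<2sHI", b"II", 42, 8)  # first IFD at byte 8
    if len(value) <= 4:
        field, tail = value.ljust(4, b"\x00"), b""
    else:
        # Header (8) + entry count (2) + one entry (12) + next-IFD offset (4).
        field, tail = struct.pack("<I", 8 + 2 + 12 + 4), value
    entry = struct.pack("<HHI", GDAL_NODATA_TAG, TIFF_ASCII, len(value)) + field
    ifd = struct.pack("<H", 1) + entry + struct.pack("<I", 0)
    return header + ifd + tail


if __name__ == "__main__":
    for text in ("255", "-9999"):
        print(read_gdal_nodata(build_minimal_tiff(text)))
```

On a real file, the bytes could be obtained with `open(path, "rb").read()`; because the tag is ASCII encoded, the returned string would still need converting (for example with `float`) before comparing it against pixel values.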

I will investigate the issue further.