locationtech / rasterframes

Geospatial Raster support for Spark DataFrames
http://rasterframes.io
Apache License 2.0

JVMGeoTiffRasterSource reads TiffTags for every tile #589

Open echeipesh opened 1 year ago

echeipesh commented 1 year ago

Something in the caching mechanism is going wrong, which results in the file's TiffTags being read for every tile. Attached is a CPU sample of a RasterFrames job that highlights the problem. It also shows up in the overall runtime of the job, which is slower than the same job using the GDAL RasterSource.

Screen Shot 2022-06-27 at 3 35 33 PM

pomadchin commented 1 year ago

Could it be caching the object while the lazy val inside it has not been evaluated yet, so that the cached copy is never updated later?
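
To make that hypothesis concrete, here is a minimal sketch of one way it could play out. It is not RasterFrames or GeoTrellis code: `FakeSource`, `tagReads`, `roundTrip` and `readsForTiles` are hypothetical names, and the Java serialization round trip is only an assumed stand-in for whatever hands each tile its own copy of the cached source. If the cached instance is stored before its lazy val is forced, every copy re-runs the expensive read; if it is forced first, the copies carry the value with them.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a raster source; NOT the real JVMGeoTiffRasterSource.
object FakeSource {
  // Counts how many times the simulated tag read runs.
  val tagReads = new AtomicInteger(0)
}

final case class FakeSource(path: String) {
  // Simulates the expensive TiffTags read; evaluated at most once per instance.
  lazy val tags: String = {
    FakeSource.tagReads.incrementAndGet()
    s"tags($path)"
  }
}

object LazyValCacheDemo {
  // Assumption: each tile works on a deserialized copy of the cached object.
  def roundTrip[A <: java.io.Serializable](a: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(a)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }

  // Returns how many simulated tag reads happen while processing `tiles` tiles.
  def readsForTiles(tiles: Int, forceFirst: Boolean): Int = {
    FakeSource.tagReads.set(0)
    val cached = FakeSource("file.tif")
    if (forceFirst) cached.tags      // evaluate before the object is "cached"
    (1 to tiles).foreach { _ =>
      val copy = roundTrip(cached)   // fresh copy per tile
      copy.tags                      // forces the lazy val on the copy, never on `cached`
    }
    FakeSource.tagReads.get()
  }
}
```

In this sketch, `LazyValCacheDemo.readsForTiles(5, forceFirst = false)` performs one read per tile, while `forceFirst = true` performs a single read in total. Whether the actual cache behaves this way would need to be confirmed against the profile attached above.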