locationtech / rasterframes

Geospatial Raster support for Spark DataFrames
http://rasterframes.io
Apache License 2.0

Invalid parameters error in `rf_render_png` #320

Open vpipkt opened 5 years ago

vpipkt commented 5 years ago

Context: Jupyter notebook in a conda environment with pyspark 2.3.3, pyrasterframes 0.8.1 (pip installed), and GDAL 2.4.2.

Read the attached CSV catalog, render_png_catalog.csv.zip.

Read the raster datasource with catalog_col_names = ['B7', 'B5', 'B1'].

import pyrasterframes.rf_ipython
df.select(rf_render_png('B7', 'B5', 'B1'))

This generates the error below:

Py4JJavaError: An error occurred while calling o56._dfToMarkdown.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 3.0 failed 1 times, most recent failure: Lost task 2.0 in stage 3.0 (TID 8, localhost, executor driver): java.lang.RuntimeException: Invalid parameters: 0, 0, 0, 255
    at scala.sys.package$.error(package.scala:27)
    at geotrellis.raster.Tile$class.normalize(Tile.scala:246)
    at org.locationtech.rasterframes.ref.RasterRef$RasterRefTile.normalize(RasterRef.scala:63)
    at geotrellis.raster.Tile$class.rescale(Tile.scala:272)
    at org.locationtech.rasterframes.ref.RasterRef$RasterRefTile.rescale(RasterRef.scala:63)
    at org.locationtech.rasterframes.expressions.transformers.RGBComposite.nullSafeEval(RGBComposite.scala:88)
    at org.apache.spark.sql.catalyst.expressions.TernaryExpression.eval(Expression.scala:594)
    at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:359)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)

danlewis85 commented 3 years ago

I encountered this bug in 0.9.1. In my case it was (possibly) associated with masked tiles: `rf_render_png` ran fine on my data, but once I'd applied a cloud mask using `rf_mask_by_values`, `rf_render_png` stopped working and threw the error shown above.

If I had to guess, it is failing at the tile normalisation step, because there is no range of values in the tile (e.g. min and max are both 0) and/or because NoData values are represented in a way that step can't handle. So in my case it could be that a tile is completely masked by cloud, and a single constant value can't be normalised. When I don't mask, the cloudy tile still has some variation in cell values, so it works.
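To illustrate the guess above, here is a minimal pure-Python sketch of a min/max rescale that rejects a tile with no value range. It is not the GeoTrellis implementation: `rescale_to_range` and `has_value_range` are hypothetical names, `None` stands in for a NoData cell, and treating a fully masked tile as having min = max = 0 is an assumption inferred from the `0, 0, 0, 255` in the error message.

```python
def has_value_range(values):
    """Return True if the valid (non-None) cells span more than one value."""
    valid = [v for v in values if v is not None]
    return bool(valid) and min(valid) < max(valid)


def rescale_to_range(values, new_min=0, new_max=255):
    """Linearly map valid cells onto [new_min, new_max]; None cells stay None."""
    valid = [v for v in values if v is not None]
    # Assumption: a tile with no valid cells reports min = max = 0.
    old_min = min(valid) if valid else 0
    old_max = max(valid) if valid else 0
    d_old = old_max - old_min
    d_new = new_max - new_min
    # With a zero-width input range there is no slope to scale by.
    if d_old <= 0 or d_new <= 0:
        raise ValueError(
            f"Invalid parameters: {old_min}, {old_max}, {new_min}, {new_max}"
        )
    return [
        None if v is None else (v - old_min) * d_new // d_old + new_min
        for v in values
    ]


# A tile with some variation rescales without complaint.
unmasked = rescale_to_range([10, 20, 30])

# A fully masked tile has no range, so the rescale is rejected.
try:
    rescale_to_range([None, None, None])
    masked_error = None
except ValueError as exc:
    masked_error = str(exc)
```

If this is indeed the cause, checking each tile with something like `has_value_range` before rendering would skip the constant or fully masked tiles rather than aborting the whole job.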