locationtech / rasterframes

Geospatial Raster support for Spark DataFrames
http://rasterframes.io
Apache License 2.0
240 stars 46 forks

Performance Issue #616

Open arind123 opened 11 months ago

arind123 commented 11 months ago

I am trying to build a data processing pipeline that starts with a catalog of multiple Sentinel-2 tiles and multiple 10 m bands (almost one year of data, so around 250 Sentinel-2 products and 250 × 5 = 1250 .tif files). On these I apply resampling, local algebra, masking, etc., and I do the same on some of the 20 m resolution bands. Finally, I join the results and save the final DataFrame as Parquet.

The whole process takes around 15 hours to complete in Spark. Is that the expected run time for RasterFrames?