locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io

Improve RasterSummary.fromRDD signatures #3469

Open pomadchin opened 2 years ago

pomadchin commented 2 years ago

At this point RasterSummary.fromRDD (and similar methods) has the following signature:

object RasterSummary {
  // ...
  def fromRDD(rdd: RDD[RasterSource]): RasterSummary[Unit] = ???
  // ...
}

Because RDD is invariant in its element type, an RDD[GeoTiffRasterSource] is not an RDD[RasterSource], which makes the method inconvenient to call:

// explicit upcast
val sourceRDD: RDD[RasterSource] = sc.parallelize(files).map(uri => GeoTiffRasterSource(uri).reproject(targetCRS): RasterSource)
val summary = RasterSummary.fromRDD(sourceRDD)
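
To see the invariance problem in isolation, here is a minimal sketch that does not depend on Spark or GeoTrellis. `Source`, `TiffSource`, `Box`, and `summarize` are hypothetical stand-ins (assumptions for illustration only); `Box[T]` plays the role of the invariant `RDD[T]`.

```scala
// Hypothetical stand-ins; none of these are GeoTrellis or Spark types.
trait Source { def name: String }
final case class TiffSource(name: String) extends Source

// Invariant in T, like RDD[T]: Box[TiffSource] is not a Box[Source].
final class Box[T](val items: List[T]) {
  def map[U](f: T => U): Box[U] = new Box(items.map(f))
}

object InvarianceDemo {
  // Accepts only Box[Source], mirroring fromRDD(rdd: RDD[RasterSource]).
  def summarize(box: Box[Source]): Int = box.items.size

  val tiffs: Box[TiffSource] =
    new Box(List(TiffSource("a.tif"), TiffSource("b.tif")))

  // summarize(tiffs) // does not compile: Box[TiffSource] is not a Box[Source]

  // Upcasting each element changes the static element type, so this compiles.
  val count: Int = summarize(tiffs.map(t => t: Source))
}
```

The per-element ascription is the same explicit upcast as in the snippet above; it is noise that every caller has to repeat.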

It would be nice if the RasterSummary methods also accepted RDDs whose element type is a subtype of RasterSource:

// with no explicit upcast
val sourceRDD = sc.parallelize(files).map(uri => GeoTiffRasterSource(uri).reproject(targetCRS))
val summary = RasterSummary.fromRDD(sourceRDD)

We can work around RDD invariance by defining a function signature like the following (just an example):

def fromRDD[T](rdd: RDD[T])(implicit rs: T => RasterSource): RasterSummary[Unit] = {
  val all = collect(rdd.map(rs(_)), _ => ())
  require(all.size == 1, "multiple CRSs detected")
  all.head
}
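
For comparison, here is a rough sketch of two signature shapes that accept a container of subtypes, again using hypothetical stand-ins (`Source`, `TiffSource`, `Box`) instead of the real `RasterSource` and `RDD`. `viaView` mirrors the implicit `T => RasterSource` idea above; `viaBound` uses an upper type bound, which is an alternative not proposed in this issue. The sketch assumes Scala 2.12 or 2.13, where the compiler supplies the `T => Source` evidence for any subtype through `Predef.$conforms`.

```scala
// Hypothetical stand-ins; none of these are GeoTrellis or Spark types.
trait Source { def name: String }
final case class TiffSource(name: String) extends Source

// Invariant container standing in for RDD[T].
final class Box[T](val items: List[T]) {
  def map[U](f: T => U): Box[U] = new Box(items.map(f))
}

object SignatureSketch {
  // Variant A: implicit function evidence, as in the example above.
  // Also admits types that are merely convertible to Source.
  def viaView[T](box: Box[T])(implicit toSource: T => Source): List[String] =
    box.map(toSource).items.map(_.name)

  // Variant B: upper type bound. Admits only true subtypes of Source.
  def viaBound[T <: Source](box: Box[T]): List[String] =
    box.items.map(_.name)

  val tiffs: Box[TiffSource] = new Box(List(TiffSource("a.tif")))

  // No explicit upcast is needed at either call site.
  val namesA: List[String] = viaView(tiffs)
  val namesB: List[String] = viaBound(tiffs)
}
```

The trade-off, as far as this sketch goes: the implicit-function form is more permissive because it also picks up user-defined implicit conversions in scope, while the bounded form is stricter and needs no implicit parameter list.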