One limitation of the GeoPySpark API is the inability to express multiple operations on a single tile and combine the results inline. For instance, both focalSum and Slope are implemented operations, but computing both over a TiledRasterLayer would require producing two RDDs and joining them by key.
One way around this limitation is to introduce an API like this:
The Bands object only captures the structure of the expression, so that it can be interpreted and evaluated in an RDD.mapValues step. MAML is a natural choice for this. We could construct either the JSON or the JVM expression through the gateway.
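class Bands(object):
    def __getitem__(self, i):
        # bands[i] yields a Band expression rooted at source band i
        return Band(src=BandSource(band=i))

class Band(object):
    def __init__(self, src, op=None):
        self.src = src  # a BandSource, or the Band this operation applies to
        self.op = op    # operation name such as "slope"; None for a plain source

    def _json(self):
        # placeholder: the serialized form of the expression is not filled in yet
        return """{'source': }"""

    def slope(self):
        return Band(src=self, op="slope")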
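To make the deferred-evaluation idea concrete, here is a minimal self-contained sketch. It is not GeoPySpark API: the names `Expr`, `evaluate`, and `eval_bands`, and the toy operations `negate` and `local_sum`, are hypothetical. Tiles are plain nested lists, and a dict of key to tile stands in for the RDD. In the actual proposal the tree would presumably be serialized to MAML and interpreted on the JVM inside `RDD.mapValues`.

```python
# Hypothetical sketch: an expression tree that only records operations on
# bands and is interpreted later, once per tile. None of this is GeoPySpark API.

class Expr(object):
    def __init__(self, op, args=(), band=None):
        self.op = op        # "source", "negate" or "local_sum"
        self.args = args    # child expressions
        self.band = band    # band index, only used when op == "source"

    def negate(self):
        return Expr("negate", (self,))

    def local_sum(self, other):
        return Expr("local_sum", (self, other))

    def to_dict(self):
        # JSON-friendly description of the tree (a stand-in for MAML)
        if self.op == "source":
            return {"op": "source", "band": self.band}
        return {"op": self.op, "args": [a.to_dict() for a in self.args]}


class Bands(object):
    def __getitem__(self, i):
        return Expr("source", band=i)


def evaluate(expr, tile):
    """Interpret expr against one multiband tile (a list of 2-D lists)."""
    if expr.op == "source":
        return tile[expr.band]
    values = [evaluate(a, tile) for a in expr.args]
    if expr.op == "negate":
        return [[-v for v in row] for row in values[0]]
    if expr.op == "local_sum":
        return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(*values)]
    raise ValueError("unknown op: %s" % expr.op)


def eval_bands(tiles, fn):
    """Stand-in for RDD.mapValues: build the expressions once, apply per tile."""
    exprs = fn(Bands())
    return {key: [evaluate(e, tile) for e in exprs] for key, tile in tiles.items()}


# One tile with two bands, each a 1x2 raster
tiles = {(0, 0): [[[1, 2]], [[10, 20]]]}
result = eval_bands(tiles, lambda b: [b[0].local_sum(b[1]), b[0].negate()])
```

The point of the design is that the lambda runs only once, on the driver, and produces a description of the work. Every operation on a tile then happens in a single pass over the data, with no intermediate layers to join.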
This becomes relevant when working over multiple raster layers, once we have functions to select and combine bands:
# work across two (or more) layers
joined = gps.joinBands(tiled_layer1.bands(1, 2), tiled_layer2.bands(3, 0), tiled_layer3.bands(0))
joined.evalBands(
    lambda bands: [
        bands[0]
            .slope()
            .localSum(bands[2])
            .focalSum(n=2)
            .crop(2),
        bands[1]
            .slope(),
    ]  # MAML => f => Array[Tile]
)
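As a rough illustration of what selecting and combining bands could mean, the sketch below models each layer as a plain dict from key to multiband tile. The functions `select_bands` and `join_bands` are hypothetical stand-ins for the proposed `bands(...)` and `joinBands`, not existing GeoPySpark calls. It assumes an inner join on keys, and that joined bands are numbered in argument order.

```python
# Hypothetical stand-ins for the proposed bands()/joinBands. Layers are dicts
# of key -> multiband tile (a list of bands). None of this is GeoPySpark API.

def select_bands(layer, *indices):
    """Keep only the requested bands of every tile, in the given order."""
    return {key: [tile[i] for i in indices] for key, tile in layer.items()}


def join_bands(*selections):
    """Inner-join selections by key and concatenate their bands per tile."""
    shared = set(selections[0])
    for sel in selections[1:]:
        shared &= set(sel)
    return {
        key: [band for sel in selections for band in sel[key]]
        for key in shared
    }


# Two toy layers; each band is a 1x1 raster
layer1 = {"a": [[[1]], [[2]], [[3]]], "b": [[[4]], [[5]], [[6]]]}
layer2 = {"a": [[[7]], [[8]]]}

joined = join_bands(select_bands(layer1, 1, 2), select_bands(layer2, 0))
```

Under these assumptions, bands 0 and 1 of a joined tile come from the first layer and band 2 from the second. That numbering matches how the example above refers to `bands[2]` after joining three selections.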