Closed rajadain closed 8 years ago
Checked this out to test an outstanding PR and wanted to know if it was worthwhile bumping the version to 1.2 for these changes? Not sure how closely we're trying to hew to semver.
The version will be bumped to 1.2 once this is merged.
👍
Instead of having a separate RasterAverage
operation, I think it should just be an instance of RasterGroupedAverage
where rasters
is an empty list.
Also added RasterGroupedAverage
. The code to handle grouping vs non-grouping is different enough to warrant specifying different operation names, I think.
This can be tested with
{
"input": {
"polygon": [
"{\"type\":\"MultiPolygon\",\"coordinates\":[[[[-75.1626205444336,39.95580659996906],[-75.25531768798828,39.94514735903112],[-75.22785186767578,39.89446035777916],[-75.1461410522461,39.88761144548104],[-75.09309768676758,39.91078961774283],[-75.09464263916016,39.93817189499188],[-75.12039184570312,39.94435771955196],[-75.1626205444336,39.95580659996906]]]]}"
],
"polygonCRS": "LatLng",
"targetRaster": "us-ssugro-kfactor-30m-epsg5070",
"rasters": [
"nlcd-2011-30m-epsg5070-0.10.0"
],
"rasterCRS": "ConusAlbers",
"operationType": "RasterGroupedAverage",
"zoom": 0
}
}
for which I get a result
{
"result": {
"List(11)": 0.0008092413746841691,
"List(21)": 0.028911406301584154,
"List(22)": 0.01676576752450678,
"List(23)": 0.016573442365171852,
"List(24)": 0.011148141032012355,
"List(31)": 0.0,
"List(41)": 0.060527942751493584,
"List(43)": 0.0,
"List(52)": 0.012717949322018867,
"List(71)": 0.0007272727618163282,
"List(81)": 0.0,
"List(82)": 0.0,
"List(90)": 0.03929729525007667,
"List(95)": 0.006490566346021193
}
}
Okay, I'll check it out.
Does abstracting the distinct pixel counting logic in getDistinctPixels(multiPolygons, extent, rasterExtent) affect performance? I hope one function call per tile would add a very small amount to execution (and maybe even get inlined by the compiler?).
All of this stuff ends up getting (re)compiled in a profile-guided way -- that is part of what is responsible for the "warm up" effect that we see.
Instead of having a separate RasterAverage operation, I think it should just be an instance of RasterGroupedAverage where rasters is an empty list.
I agree.
Just as a general comment, MapshedJob.scala
is starting to get pretty big, with a fair amount of similarity between parts. Aside form Lewis' suggestion, I am not sure what can easily be done to address that right now, but it is something to keep in mind.
If Lewis' suggestion can be easily implemented, I think it should be. Other than that, +1.
Just did a bunch of refactoring. I'm sure more could be done, but I'm not well-versed enough in Scala to actually do it.
The main changes from the API perspective are as follows:
RasterAverage
has been removed, instead now you must send RasterGroupedAverage
with an empty list in rasters
RasterJoin
has been renamed to RasterGroupedCount
Please update your respective PRs if they use these methods.
There still seems to be quite a bit of repetition (or at least rhyming) but no realistic ideas for addressing it area springing immediately to mind for me, either.
I agree. Unfortunately, we need these endpoints for other cards to be unblocked. I'm going to clean these commits and merge shortly after.
Cool.
Overview
Introduces two new endpoints:
RasterGroupedSum
which given a polygon, a list of rasters, and a target raster, will return the sum of value of the target raster grouped by the values of the list of rasters, clipped to the polygon.RasterAverage
will simply average the value of the target raster clipped to the polygon.Testing Instructions
Test
RasterGroupedSum
with:The output of Spark JobServer should be:
Test
RasterAverage
with:The output of Spark JobServer should be:
Notes
Does abstracting the distinct pixel counting logic in
getDistinctPixels(multiPolygons, extent, rasterExtent)
affect performance? I hope one function call per tile would add a very small amount to execution (and maybe even get inlined by the compiler?).Connects #34 Connects #35