WikiWatershed / mmw-geoprocessing

A Spark Job Server job for Model My Watershed geoprocessing.
Apache License 2.0

Implement Multi-Operation Endpoint for Subbasin Modeling #85

Closed rajadain closed 6 years ago

rajadain commented 6 years ago

Overview

Implements the MultiOperation outlined in #80 and #82. This accepts a JSON request specifying multiple shapes and operations, performs all of them, and returns a single JSON result.

Connects #80

Demo

Some initial timing:

$ for i in 1 2 3 4 5; time -p http --timeout=90 :8090/multi < examples/MultiOperationRequest.json 2>&1 > /dev/null | grep real | awk '{print $2}'; end

33.89
21.79
19.57
21.12
18.59

Notes

I expect this PR to go through multiple revisions as we tweak performance.

The sample output was generated by:

$ http --timeout 90 :8090/multi < examples/MultiOperationRequest.json | jq . > examples/MultiOperationResponse.json

Testing Instructions

rajadain commented 6 years ago

Here's the comparison chart for the latest cleanup:

[image: comparison chart]

with the final results being the average. While "Civic Apps Cleanup" does take slightly longer, I'm not sure what in the code would cause it, since the type aliases should be handled at compile time. The only runtime code change I made was replacing this map with mapValues:

val result1: Map[String, Future[Map[String,Map[String,Double]]]] =
  result0.map { case (k, v) => k -> sequenceMap(v) }

to

val result1: Map[HucID, Future[Map[OperationID, Map[String, Double]]]] =
  result0.mapValues(sequenceMap)

Although there is some discussion of map vs mapValues, it doesn't seem to allege performance differences. I'm chalking it up to noise.
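For context on the map vs mapValues change: in Scala 2.x, `mapValues` returns a lazy view that re-applies its function on every lookup, whereas `map` builds a strict map once. The sketch below counts how often a function is invoked under each approach. The `sequenceMap` here is a hypothetical stand-in (the PR's actual implementation isn't shown in this thread), and the HUC and operation IDs are made-up sample data.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object MapVsMapValues {
  // Type aliases are resolved at compile time and add no runtime cost.
  type HucID = String
  type OperationID = String

  // Counts invocations of sequenceMap (for illustration only).
  val calls = new AtomicInteger(0)

  // Hypothetical stand-in for the PR's sequenceMap: turns a map of
  // futures into a single future of a map.
  def sequenceMap[K, V](m: Map[K, Future[V]]): Future[Map[K, V]] = {
    calls.incrementAndGet()
    Future.traverse(m.toList) { case (k, fv) => fv.map(v => (k, v)) }.map(_.toMap)
  }

  // Made-up sample data in the shape implied by the result1 type.
  val result0: Map[HucID, Map[OperationID, Future[Map[String, Double]]]] = Map(
    "huc-a" -> Map("op-1" -> Future.successful(Map("x" -> 1.0))),
    "huc-b" -> Map("op-1" -> Future.successful(Map("x" -> 2.0)))
  )

  // Strict: sequenceMap runs once per entry when the map is built.
  def strictCalls(): Int = {
    calls.set(0)
    val result1 = result0.map { case (k, v) => k -> sequenceMap(v) }
    (1 to 3).foreach(_ => result1("huc-a"))
    calls.get
  }

  // Lazy view: sequenceMap runs again on every lookup.
  def lazyCalls(): Int = {
    calls.set(0)
    val result1 = result0.mapValues(v => sequenceMap(v))
    (1 to 3).foreach(_ => result1("huc-a"))
    calls.get
  }
}
```

If each value of `result1` is read only once, the two forms do roughly the same work, which would be consistent with the difference being noise. Whether values are read more than once depends on how `result1` is consumed downstream, which this thread doesn't show.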

This is ready for one final review.

rajadain commented 6 years ago

Thanks for all the input! This turned out great. Added a final commit to remove the extraneous utility functions we ended up not using (they're safe in the git history should we ever need to look back at them). Merging now.