Serverless geoprocessing platform for SeaSketch
sls create --name MyReports --template-url https://github.com/mcclintock-lab/seasketch-sls-geoprocessing/tree/master/template
cd MyReports
npm install
sls add_function -n area
# modify functions/area/index.js as needed to implement the geoprocessing function
sls invoke local -f area -p functions/area/examples/sketch.json
When creating a new function using sls add_function, some scaffolding will be generated that enables the new function and encourages best practices. The geoprocessing function itself can be found in functions/${name}/index.js, tests should be put into index.test.js, and an area to store source and distributed data is created. Much of this is a work in progress.
Geoprocessing functions must return a Promise. Implementing services this way sets up more reliable error handling and a uniform interface for handling async calls to services such as S3. The handler middleware, in combination with the reporting.seasketch.org service, handles most of the plumbing needed to fulfill the seasketch-next report request protocol, so the developer only needs to concern themselves with the analysis itself.
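As a minimal sketch of the Promise requirement, a function in functions/${name}/index.js might look like the following. The function names, the shape of the incoming sketch (a GeoJSON Polygon feature), the result object, and the export style are all assumptions made for illustration; the generated scaffolding is the authority on what the handler actually expects.

```javascript
// Hypothetical helper (not part of the framework): bounding box of a
// GeoJSON Polygon feature as [minX, minY, maxX, maxY].
function bboxOfPolygon(feature) {
  const ring = feature.geometry.coordinates[0]; // outer ring only
  const xs = ring.map(([x]) => x);
  const ys = ring.map(([, y]) => y);
  return [Math.min(...xs), Math.min(...ys), Math.max(...xs), Math.max(...ys)];
}

// Declaring the function `async` guarantees that it returns a Promise and
// that any thrown error surfaces as a rejection instead of an uncaught throw.
async function extent(sketch) {
  if (!sketch || !sketch.geometry || sketch.geometry.type !== 'Polygon') {
    throw new Error('Expected a GeoJSON Polygon feature');
  }
  return { bbox: bboxOfPolygon(sketch) };
}

// Assumed export style; adjust to match the generated index.js.
module.exports = extent;
```

Because errors become rejections, the wrapping handler can deal with failures from synchronous validation and from asynchronous service calls in the same way.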
The handler that wraps each geoprocessing function will take care of ensuring cached results are served up. For additional outputs such as map data you will be able to import putS3 from the framework [todo]. Data will be saved to a publicly accessible but unlisted bucket, and the url will be returned so it can be included in the results.
A data/ folder is created by the function generator and contains a data/src folder for raw data like shapefiles. Ideally, source data is included in the repo along with a script to generate files like geojson or spatial indexes that are ready to use and stored in data/dist. Source files will be committed to the repo using LFS and will not be distributed to lambda [todo], so there's no need to worry about file size in data/src.
A lot of the work in developing geoprocessing functions will be in figuring out the best way to represent data. Raw data in geojson format may be used by turf.js but in many cases some preprocessing will need to be done to add the data to something like rbush.
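One possible shape for such a preprocessing step is sketched below. It converts GeoJSON features into the { minX, minY, maxX, maxY } items that rbush indexes by default, so the resulting array could be passed to an rbush tree's load method or written to data/dist. The function name and the extra index property are hypothetical, and GeometryCollection features are not handled.

```javascript
// Hypothetical preprocessing helper: one bounding-box item per feature.
function toIndexItems(featureCollection) {
  return featureCollection.features.map((feature, index) => {
    let minX = Infinity;
    let minY = Infinity;
    let maxX = -Infinity;
    let maxY = -Infinity;

    // Walk every coordinate pair, whatever the nesting depth of the
    // geometry (Point, LineString, Polygon, Multi* types).
    const visit = (coords) => {
      if (typeof coords[0] === 'number') {
        const [x, y] = coords;
        minX = Math.min(minX, x);
        minY = Math.min(minY, y);
        maxX = Math.max(maxX, x);
        maxY = Math.max(maxY, y);
      } else {
        coords.forEach(visit);
      }
    };
    visit(feature.geometry.coordinates);

    // `index` points back at the source feature so a search hit can be
    // resolved to its full geometry and properties.
    return { minX, minY, maxX, maxY, index };
  });
}
```

Keeping only bounding boxes and a back-reference in the index keeps the file small; the full geometries can be loaded separately when an exact test with turf.js is needed.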
For now, sls invoke local -f area -p functions/area/examples/sketch.json will have to be used until studio is available. data/examples should be used to store geojson representations of test sketches.
npm test will run jest but currently no affordances are made for bootstrapping the environment that the handler needs [todo].
Much like in the original seasketch-reporting-api, the author of a report will need to create client code that visualizes outputs. Eventually a couple of packages will be available to facilitate this work.
@seasketch-sls-geoprocessing/client will contain a set of React components that form a core UI library to speed the development of new reports.
@seasketch-sls-geoprocessing/studio will be a development server that hosts a UI for running geoprocessing functions and visualizing these reports. The goal will be to create a tight code->eval->visualize loop that will be more efficient than using the production seasketch app.
There are actually a lot of reports that this may be appropriate for. We'll need a special webpack target to enable this.
seasketch-sls-geoprocessing contains a serverless template and a plugin. The template just has the basics of a module and a package.json that includes the necessary dependencies, including itself as a plugin. The plugin enables the add_function serverless command, which adds some directory scaffolding and entries to serverless.yml. The "magic" is kept to a minimum. It's just serverless configuration in the end, so there's a lot of flexibility. Dependencies like @seasketch-sls-geoprocessing/client and @seasketch-sls-geoprocessing/studio will be kept separate so they don't have to be deployed to lambda unnecessarily. Eventually, as we develop analytical tools that may be reusable, they should be added to individual packages under the @seasketch-sls-geoprocessing organization.
The plugin modifies a geoprocessing project's serverless.yml file during packaging. It shouldn't clobber values set explicitly in a project's configuration, but it's necessary to be aware of these settings at times. These values are merged via the plugin rather than inserted into the serverless.yml generated by the add_function task so that, as infrastructure changes, this configuration can be updated by simply upgrading to the latest version of the plugin.
provider:
  environment:
    RESULTS_SQS_ENDPOINT:
      Fn::ImportValue: ReportResultsQueueEndpoint
    S3_BUCKET:
      Fn::ImportValue: ReportOutputs
  iamRoleStatements:
    - Effect: Allow
      Action:
        - "sqs:*"
      Resource:
        Fn::ImportValue: ReportResultsQueueArn
custom:
  webpack:
    webpackConfig: './webpack.config.js'
    includeModules: true
  logForwarding:
    destinationARN:
      Fn::ImportValue: ReportLogsForwarder
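The non-clobbering merge described above can be pictured with the sketch below. It is not the plugin's actual implementation; mergeDefaults and isPlainObject are hypothetical names, and the rule that an explicitly set array (such as iamRoleStatements) replaces the default rather than being concatenated with it is an assumption.

```javascript
// True for plain nested config objects, false for arrays, strings, null, etc.
function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Fill in `defaults` wherever `project` has no value of its own.
// Anything the project sets explicitly is kept as-is.
function mergeDefaults(project, defaults) {
  const merged = { ...project };
  for (const [key, value] of Object.entries(defaults)) {
    if (!(key in merged)) {
      merged[key] = value; // project is silent: take the default
    } else if (isPlainObject(merged[key]) && isPlainObject(value)) {
      merged[key] = mergeDefaults(merged[key], value); // recurse into sections
    }
    // Otherwise the project's explicit value wins and the default is ignored.
  }
  return merged;
}
```

With this approach a project that sets its own S3_BUCKET under provider.environment keeps that value, while RESULTS_SQS_ENDPOINT and the custom section are still filled in from the defaults.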
Each geoprocessing project has its own dependencies defined in package.json, and at times it may be necessary to upgrade them in order to be compatible with @seasketch-sls-geoprocessing/client or seasketch-sls-geoprocessing itself. When that time comes, manually update devDependencies to match those in the latest template at seasketch-sls-geoprocessing/template/package.json. You may also have to do the same for webpack.config.js.