krux / starport

Apache License 2.0

submit pipelines using lambda #14

Closed talbright closed 5 years ago

talbright commented 5 years ago

Submit pipelines using lambda. The lambda can be invoked locally, or through CodePipeline or other CI/CD systems, to submit pipelines.

This has a few nice advantages:

talbright commented 5 years ago

I was able to get this executing properly ➡️ https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/lambda/starportLambda;stream=2019/01/21/[$LATEST]a6d381fec9d8486da03717290e35ab10;start=2019-01-20T02:48:16Z

talbright commented 5 years ago

I wrote a wrapper to call the lambda and pass the args into the payload:

aws lambda invoke --function-name starportSubmitPipeline --log-type Tail --payload "$lambda_payload" outf

When the lambda executes, though, it throws an exception because lambdas can only write to /tmp:

No such file or directory: java.io.IOException
java.io.IOException: No such file or directory
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2024)
at com.krux.starport.util.S3FileHandler$.getFileFromS3(S3FileHandler.scala:30)
at com.krux.starport.db.tool.SubmitPipeline$.$anonfun$main$1(SubmitPipeline.scala:164)
at com.krux.starport.db.tool.SubmitPipeline$.$anonfun$main$1$adapted(SubmitPipeline.scala:150)
at scala.Option.foreach(Option.scala:257)
at com.krux.starport.db.tool.SubmitPipeline$.main(SubmitPipeline.scala:150)
at com.krux.starport.lambda.SubmitHandler.handleRequest(SubmitHandler.scala:19)
at com.krux.starport.lambda.SubmitHandler.handleRequest(SubmitHandler.scala:13)

I'm looking into patching https://github.com/krux/starport/blob/master/starport-core/src/main/scala/com/krux/starport/util/S3FileHandler.scala#L15-L21 -- @realstraw, is there any particular reason we have to stick with that tempdir? In a lambda, /tmp is the only location we can write files to.
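For reference, a minimal sketch of the kind of change this could be. It assumes the temp directory becomes configurable and falls back to `java.io.tmpdir`, which defaults to /tmp on a Linux JVM. `TempFiles`, `newTempFile`, and the `STARPORT_TMP_DIR` environment variable are hypothetical names for illustration, not the current `S3FileHandler` code:

```scala
import java.io.File
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper, not the actual S3FileHandler implementation.
object TempFiles {

  // Resolve the directory for temp files: an explicit override if given,
  // otherwise the JVM default (java.io.tmpdir).
  def tempDir(configured: Option[String] = sys.env.get("STARPORT_TMP_DIR")): Path = {
    val dir = Paths.get(configured.getOrElse(System.getProperty("java.io.tmpdir")))
    // Creating the directory up front avoids the
    // "No such file or directory" IOException from createTempFile.
    if (!Files.isDirectory(dir)) Files.createDirectories(dir)
    dir
  }

  // Create a uniquely named temp file inside the resolved directory.
  def newTempFile(
    prefix: String,
    suffix: String,
    configured: Option[String] = sys.env.get("STARPORT_TMP_DIR")
  ): File =
    Files.createTempFile(tempDir(configured), prefix, suffix).toFile
}
```

With no override set, `TempFiles.newTempFile("starport-", ".jar")` would write under the JVM default temp directory, so nothing would have to be configured for the lambda case.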

talbright commented 5 years ago

@realstraw this is ready for a look

talbright commented 5 years ago

Right now the processing logic is very entangled with the CLI. Ideally, what we want here is a service layer that can be called from multiple downstream components: the CLI, Play, lambda, etc. This isn't the PR to do that in, though; it's more of a next step if we choose to add a service layer (vs. just this one-off lambda for submission).
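Roughly, the shape could look like the sketch below. Every name in it (`SubmitRequest`, `PipelineSubmitService`, `CliAdapter`, `LambdaAdapter`) is hypothetical and nothing like it exists in starport today; the in-memory service only stands in for the real submission logic so the example is self-contained:

```scala
// Hypothetical service-layer sketch; none of these types exist in starport.
final case class SubmitRequest(jarLocation: String, pipelineObject: String)
final case class SubmitResult(pipelineId: Int)

// The one place where submission logic would live.
trait PipelineSubmitService {
  def submit(req: SubmitRequest): Either[String, SubmitResult]
}

// In-memory stand-in for the real implementation (no S3, no database).
final class InMemorySubmitService extends PipelineSubmitService {
  private var nextId = 0

  def submit(req: SubmitRequest): Either[String, SubmitResult] =
    if (req.jarLocation.isEmpty) Left("jar location is required")
    else {
      nextId += 1
      Right(SubmitResult(nextId))
    }
}

// Thin CLI front end: parses args, delegates, maps the result to an exit code.
object CliAdapter {
  def run(args: Array[String], service: PipelineSubmitService): Int = args match {
    case Array(jar, pipeline) =>
      service.submit(SubmitRequest(jar, pipeline)) match {
        case Right(_)  => 0
        case Left(err) =>
          Console.err.println(err)
          1
      }
    case _ =>
      Console.err.println("usage: submit <jar> <pipeline>")
      2
  }
}

// Thin lambda front end: reads the payload, delegates, returns a message.
object LambdaAdapter {
  def handle(payload: Map[String, String], service: PipelineSubmitService): String = {
    val result = for {
      jar       <- payload.get("jar").toRight("missing 'jar'")
      pipeline  <- payload.get("pipeline").toRight("missing 'pipeline'")
      submitted <- service.submit(SubmitRequest(jar, pipeline))
    } yield submitted
    result.fold(err => s"error: $err", r => s"submitted pipeline ${r.pipelineId}")
  }
}
```

The point is that each front end only translates its own input and output, so a Play controller could be added the same way without touching the submission logic.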

realstraw commented 5 years ago

Yes, the CLI is not well designed either. A service layer is definitely the next main thing.

talbright commented 5 years ago

@realstraw can you take another look?

talbright commented 5 years ago

@realstraw how's this looking now?

talbright commented 5 years ago

@realstraw ping 😈

realstraw commented 5 years ago

LGTM, can you update the version to 5.4.0?