JordonPhillips opened 6 years ago
How about something like this:
import io
import zipfile

from chalice import Chalice

app = Chalice(app_name='merge')


@app.pipeline_function()
def second_stage(input_artifacts, user_parameters):
    merged_data = input_artifacts.get('Merged')
    with zipfile.ZipFile(merged_data, 'r') as zfile:
        for name in zfile.namelist():
            print(name)


def _add_zipfile_to_zipfile(src_zipfile, dst_zipfile, prefix=None):
    # Copy every member of src_zipfile into dst_zipfile, optionally
    # nesting the members under a prefix directory.
    for name in src_zipfile.namelist():
        data = src_zipfile.read(name)
        arc_name = name
        if prefix is not None:
            arc_name = '%s/%s' % (prefix, name)
        dst_zipfile.writestr(arc_name, data)


def _add_raw_to_zipfile(name, src_data, dst_zipfile, prefix=None):
    # Store a non-zip artifact as a single member of dst_zipfile.
    arc_name = name
    if prefix is not None:
        arc_name = '%s/%s' % (prefix, name)
    dst_zipfile.writestr(arc_name, src_data.read())


@app.pipeline_function()
def merge(input_artifacts, user_parameters):
    merged_content = io.BytesIO()
    with zipfile.ZipFile(merged_content, 'w',
                         compression=zipfile.ZIP_DEFLATED) as dst_zfile:
        for artifact_name, artifact_data in input_artifacts.items():
            if artifact_name in user_parameters.get('wrap', []):
                prefix = artifact_name
            else:
                prefix = None
            try:
                with zipfile.ZipFile(artifact_data, 'r') as src_zfile:
                    _add_zipfile_to_zipfile(src_zfile, dst_zfile,
                                            prefix=prefix)
            except zipfile.BadZipFile:
                # Rewind: the failed ZipFile open may have consumed the stream.
                artifact_data.seek(0)
                _add_raw_to_zipfile(artifact_name, artifact_data, dst_zfile,
                                    prefix=prefix)
    output_name = user_parameters.get('output_name')
    if output_name is None:
        output_name = 'Merged'
    return {
        output_name: merged_content
    }
Using the branch https://github.com/stealthycoin/chalice/tree/code-pipeline-integration
This would need a lot of parameterization of the input and output artifacts to capture all the cases, like custom SSE keys, etc.
Problem
One of the possible action types in CodePipeline allows you to invoke external code to do some work. Pointing this action type at a Lambda function requires that your function implement some boilerplate code to handle the interaction. It would be great if Chalice could abstract as much of that away as possible.
I propose a decorator for a function which does the following:
Handles Input Artifacts
CodePipeline will pass in input artifacts as part of its event. The lambda function must then create an s3 client with given credentials, pull them down, and then unzip them.
Suggestion
For each input artifact, provide an easy method to get a corresponding file-like object or ZipFile.
Handles Input Parameters
CodePipeline allows for passing in a configuration string to a given invoke action. While this can be anything, it often ends up being a JSON string. The lambda function must grab and parse this.
Suggestion
Automatically decode the parameters and pass them to a function. Bonus points for passing them as arguments to the function.
Handles Output Artifacts
Like input artifacts, output artifacts must be handled manually in the same way each time.
Suggestion
For each output artifact, provide a file-like object that users can write to which handles uploading to S3.
Handles Errors
CodePipeline won't recognize the function execution failing as a failure state for the job, so users have to handle them manually.
Suggestion
Chalice should catch errors, call put_job_failure_result, and then re-raise them. Alternatively, provide error response objects similar to what it does for HTTP errors.
Handles Timeouts
A timeout will not be registered by CodePipeline as a failure, so a lambda function must be incredibly sure that it can run in the allotted time. If it times out, CodePipeline will wait for a while before giving up on the job.
Suggestion
Provide a configurable manual timeout that calls put_job_failure_result before the lambda function actually times out.
Handles Returns
A lambda function must explicitly say it's done with a job for CodePipeline to recognize it as done. Otherwise CodePipeline will continue to wait and eventually register a failure.
Suggestion
Chalice should call put_job_success_result using the function's return value as the message, and also provide return objects similar to what it does for HTTP return types.
Passes Continuation Tokens
When you call put_job_success_result you can pass a continuation token. If you do, CodePipeline will continue to wait until you call that function without a continuation token. It also allows you to provide things like messages and progress for each round.
Suggestion
Chalice should pass in the continuation token and ensure that it is part of any success return objects it provides.
Examples
Artifact Merge Example
To show an example of some of the savings this could provide, I'll show a CodePipeline function that merges two input artifacts into one output artifact, showing both how one would implement it now and how it could look in the future.
lambda_function example
This is how one might write it currently.
pipeline_function example
This is how one might write it with Chalice support.
The function went from 110 lines to 33 lines, and added timeout safety.