elastic / elastic-serverless-forwarder

Create pipeline to push zip file with dependencies to an S3 bucket #683

Closed · constanca-m closed this 2 months ago

constanca-m commented 3 months ago

Description

This issue comes from a comment thread on a PR to use Terraform to install ESF.

The current approach in the Terraform files builds the dependencies as part of the Terraform run, via the terraform-aws-modules/lambda/aws module (see the comments below).

The desired approach: have all dependencies in a zip file and push this file to an S3 bucket.

Steps

Step 1

Create a new Buildkite pipeline in this directory.

Each version release (or each commit?) triggers the creation of a new zip file with all the dependencies. This zip file needs to be pushed to an S3 bucket that will be used by customers. The S3 bucket needs to be read-only.
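Read-only access could be enforced with a bucket policy along these lines (a minimal sketch; the esf_bucket resource name comes from the snippet in Step 2, everything else is illustrative):

resource "aws_s3_bucket_policy" "esf_read_only" {
  bucket = aws_s3_bucket.esf_bucket.id

  # allow anyone to download the bundle, nothing else
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "PublicReadOnly"
        Effect    = "Allow"
        Principal = "*"
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.esf_bucket.arn}/*"
      }
    ]
  })
}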

The zip file will follow the structure described in the AWS documentation for Python deployment packages with dependencies.

Reference: https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-dependencies
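In that layout, the installed dependencies sit at the root of the archive next to the handler code. A sketch of what the ESF bundle could look like (file and package names are illustrative):

esf-dependencies.zip
├── main_aws.py        # Lambda handler code
├── handlers/          # rest of the ESF source
├── elasticsearch/     # installed dependency
├── boto3/             # installed dependency (see the discussion below on bundled runtimes)
└── ...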

Step 2

Refactor the Terraform files so the Lambda function reads the pre-built zip from the S3 bucket:

resource "aws_lambda_function" "esf" {
  # ... other configuration ...

  s3_bucket = aws_s3_bucket.esf_bucket.id
  s3_key    = aws_s3_object.esf_zip_bundle.key

  # ... other configuration ...
}
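Note that in the final design, where Elastic hosts the bundle in a read-only bucket, the customer-facing configuration would likely reference the bucket and key directly rather than resources created in the same configuration (a hedged sketch; bucket and key names are placeholders):

resource "aws_lambda_function" "esf" {
  # ... other configuration ...

  s3_bucket = "esf-artifacts"              # placeholder: Elastic-owned bucket
  s3_key    = "esf-dependencies-1.0.0.zip" # placeholder: versioned bundle key
}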

_Originally posted by @girodav in https://github.com/elastic/terraform-elastic-esf/pull/1#discussion_r1516188875_

Tasks

  1. Add new workflow to automate release: https://github.com/elastic/elastic-serverless-forwarder/pull/685
  2. Trigger new workflow to push dependencies to S3 bucket in case of new release: https://github.com/elastic/elastic-serverless-forwarder/pull/689
  3. Add new dependencies: https://github.com/elastic/elastic-serverless-forwarder/pull/692
constanca-m commented 3 months ago

Hey @girodav and @axw, can I have your thoughts on this to make sure everything is correct?

girodav commented 3 months ago

Hey Constança, thanks for opening this issue. Some comments below.

Create a new Buildkite pipeline in this directory.

I don't think there is any need to create a Buildkite pipeline, since ESF does not need to be released as part of the Elastic stack. Feel free to keep using GitHub Actions as we already do, unless you find some benefit in moving to Buildkite.

Each version release (or commit?) triggers the creation of a new zip file with all the dependencies. This zip file needs to be pushed to an S3 bucket that will be used by customers.

We currently track releases with git tags, so the workflow could be triggered by the creation of a new git tag. We also track the version in version.py, which is currently updated manually. There is already a related issue about how to automate updates to this file and how to handle version bumps in general: https://github.com/elastic/elastic-serverless-forwarder/issues/540. I'd consider it a preliminary task for this issue.

I would also make sure that the solution is extensible enough to be able to add automated deployment to SAR as well, in a future release.

Remove the module for lambda. Insert a new resource aws_lambda_function that reads from the S3 bucket with the zip file:

This is more of an option: the current AWS Lambda Terraform module terraform-aws-modules/lambda/aws can still be used if it simplifies the implementation. It just needs to be modified to use pre-built packages stored on S3:

https://registry.terraform.io/modules/terraform-aws-modules/lambda/aws/latest#lambda-function-with-existing-package-prebuilt-stored-in-s3-bucket
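For reference, that module usage looks roughly like this (a hedged sketch based on the registry example above; handler, runtime, bucket, and key values are placeholders):

module "esf_lambda" {
  source = "terraform-aws-modules/lambda/aws"

  function_name = "esf"
  handler       = "main_aws.lambda_handler" # placeholder
  runtime       = "python3.9"               # placeholder

  # skip the local build and point at the pre-built bundle in S3
  create_package = false
  s3_existing_package = {
    bucket = "esf-artifacts"        # placeholder
    key    = "esf-dependencies.zip" # placeholder
  }
}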

Where should the S3 bucket be placed? Under which account? In any specific region?

You can use the same account where we store SAR artifacts.

Do all packages used in import statements in the handlers files need to be in the dependency zip?

Technically no, the AWS Lambda Python runtime already includes some of them (e.g. boto3). However, we should stick to what is in requirements.txt to be sure we use the same versions everywhere. You should include only the dependencies used at runtime (i.e. only requirements.txt). This part also depends on whether https://github.com/elastic/elastic-serverless-forwarder/issues/204 is going to be prioritized or not.
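Following the AWS documentation linked in the issue description, building the bundle from requirements.txt could look like this (a sketch; archive and handler file names are illustrative):

# install runtime dependencies into a local directory
pip install --target ./package -r requirements.txt

# zip the dependencies, then add the handler code on top
cd package
zip -r ../esf-dependencies.zip .
cd ..
zip esf-dependencies.zip main_aws.py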

constanca-m commented 3 months ago

Thank you @girodav for such a detailed answer. I am working on setting up a workflow on GitHub Actions like you mentioned. It seems a bit tricky to test, so I will do it in a private repository first, and then I will open a PR and link it to this issue as well as to https://github.com/elastic/elastic-serverless-forwarder/issues/540.

It won't take care of SAR for now, but it seems easy to adapt the workflow by setting the right trigger:

# run the workflow when version.py changes on main, i.e. on a version bump
on:
  push:
    branches:
      - 'main'
    paths:
      - 'version.py'
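An alternative, following the git tag suggestion above, would be to trigger on release tags instead (the tag pattern below is hypothetical and would need to match ESF's actual tag naming):

on:
  push:
    tags:
      - 'v*'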