Lambda@edge needs environment free code

pocesar commented 4 years ago

The function cannot have environment variables. Function: arn:aws:lambda:us-east-1:xxxx:function:xxxx:1 (Service: AmazonCloudFront; Status Code: 400; Error Code: InvalidLambdaFunctionAssociation; Request ID: xxxx)

Lambda@Edge has a code heuristic that disallows using process.env inside its handler code, which raises two points:

1. I'm almost sure you can't blindly bundle external libraries, since most of them likely mention process.env somewhere.
2. The way that code is passed to Cloudfront.lambdaFunctionAssociations is a special case (either by using JSON/YAML S3 that we discussed earlier for the configuration, or an adjacent lambda that can return those per request). Using a global datastore for "punchcard things" might also cover other issues that arise.

sam-goodwin commented 4 years ago

Can we instead use tags and query them on container-boot in Lambda?

Birowsky commented 4 years ago

Sam, you gotta agree that any extra work/waiting in @edge counts against an already tight running window. I don't know the inner workings of Punchcard, but Webpack allows us to embed env vars at build/CI time. I highly suggest that or a similar approach. It is how I build my frontends.

sam-goodwin commented 4 years ago

@Birowsky, Punchcard relies heavily on environment variables set with CloudFormation. If you use a DynamoDB Table, the table's ARN is automatically set in the environment variables and then looked up at runtime to configure a "Table Client" instance.

This is how we auto-wire dependencies:

new Function(stack, 'f', {
  depends: table.readAccess(), // grants IAM policies and sets tableARN in the environment variables
}, async (request, table) => {
  // table's ARN is automatically fetched from the environment variables
  await table.get('key'); // no need to specify table name
});

You're right though, using tags unfortunately adds to the cold start cost :( It would only be on the first invocation of a container; I wonder if that's tolerable? Perhaps we can lazily fetch the information so you only incur the cost on the first request to another service, i.e. look up a Table's ARN on the first DDB get item. For us to be a zero-cost abstraction (a good tenet for Punchcard), we'd have to somehow fetch and include this information in the Lambda's archive as part of the CloudFormation Update. A Custom Resource that modifies the lambda's zip file might be the best solution?
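
To make the "lazily fetch" idea concrete, here is a minimal sketch of a per-container memoized lookup. It is not Punchcard code: `lazy`, `fetchTableArnFromTags` and the placeholder ARN are hypothetical names, and the tag query is a stand-in that does not call AWS.

```typescript
// Hypothetical helper: wraps an async lookup so it runs at most once per
// container. The promise is cached, so concurrent callers share one lookup.
function lazy<T>(lookup: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = lookup().catch((err) => {
        cached = undefined; // forget a failed lookup so a later call can retry
        throw err;
      });
    }
    return cached;
  };
}

// Stand-in for a real tag query (assumption: a real version would ask the
// Lambda API for the function's tags). The counter only makes the caching visible.
let lookups = 0;
async function fetchTableArnFromTags(): Promise<string> {
  lookups += 1;
  return "arn:aws:dynamodb:us-east-1:xxxx:table/example";
}

// Created once at module load; nothing is fetched until the first call.
const tableArn = lazy(fetchTableArnFromTags);

async function handler(_request: unknown): Promise<string> {
  // Only the first call in a container pays for the lookup.
  return await tableArn();
}
```

This keeps the cost out of container boot, but every new container still pays it once on its first request that needs the ARN.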

sam-goodwin commented 4 years ago

Physical names are also an option.

Birowsky commented 4 years ago

"It would only be on the first invocation of a container"

I'd be totally fine with this if it actually happened only once. But I'm quite certain there are multiple first-time container runs. Especially with @edge, and all the edge CDN locations.

So I honestly root for you figuring out a mechanism that would hardcode all resource identifiers inside the generated code.

Maybe a multi-step deployment? First figure out the identifiers, then do another run for the resources that depend on them? Modifying the built zip seems simpler, but error-prone?

sam-goodwin commented 4 years ago

I agree. I think we could use a custom resource for edge functions alone and still rely on environment variables for ordinary ones - we shouldn't bloat the stack with unnecessary custom resources when we can avoid it. CFN's limits are quite low.

sam-goodwin commented 4 years ago

Instead of modifying the Zip, perhaps we can create a Lambda Layer?

Birowsky commented 4 years ago

The docs here say that layers are not supported in edge. Or were you thinking about something else?

But anyways, how exactly were you thinking of utilizing a custom resource to include the env vars? Is it your method of modifying the zip? If you think you've found a safe way to embed the env vars in the zip, I really don't mind it! Maybe you can set up some unique string placeholders?
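
A rough sketch of what the placeholder idea could look like, assuming the bundle is available as a string before it is zipped. The `__PUNCHCARD_<NAME>__` format, `embedValues` and the example ARN are all made up for illustration; nothing here is an existing Punchcard or CloudFormation API.

```typescript
// Hypothetical placeholder format: __PUNCHCARD_<NAME>__
const PLACEHOLDER = /__PUNCHCARD_([A-Z0-9_]+)__/g;

// Replaces every placeholder in the bundled source with a string literal.
// Throws if a placeholder has no resolved value, so a half-patched bundle
// is never deployed silently.
function embedValues(source: string, values: Record<string, string>): string {
  return source.replace(PLACEHOLDER, (match: string, name: string) => {
    const value = values[name];
    if (value === undefined) {
      throw new Error(`no value resolved for placeholder ${match}`);
    }
    // JSON.stringify yields a quoted, escaped JavaScript string literal.
    return JSON.stringify(value);
  });
}

// The generated code would reference the placeholder instead of process.env.
const bundled = "const tableArn = __PUNCHCARD_TABLE_ARN__;";
const patched = embedValues(bundled, {
  TABLE_ARN: "arn:aws:dynamodb:us-east-1:xxxx:table/example",
});
```

The step that runs this (a custom resource or a second deployment pass) would only need to know the resolved identifiers, and failing on an unknown placeholder makes the error-prone part loud instead of silent.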

sam-goodwin / punchcard

Lambda@edge needs environment free code #98