aws-controllers-k8s / community

AWS Controllers for Kubernetes (ACK) is a project enabling you to manage AWS services from Kubernetes
https://aws-controllers-k8s.github.io/community/
Apache License 2.0

`lambda-controller` fails to recognize changes in function code when deploying a new package #1689

Open Vandita2020 opened 1 year ago

Vandita2020 commented 1 year ago

Is your feature request related to a problem?


This feature request is part of the solution to the problem raised in issue #1550. To summarize that issue: when a user creates a Lambda function with ACK, using a file from S3 as the deployment package, and then tries to deploy a new package, the changes are not reflected.

The problem arises because ACK is unable to detect a change in state for Spec.Code (for example, a change in Spec.Code.S3Key). This, in turn, is because the Lambda API does not keep a record of the Spec.Code fields: ACK has no record of the previous Spec.Code values used to create the function, so when any of them changes it fails to recognize the change.

The output of the Lambda GetFunction API:

$ aws lambda get-function --function-name test-function --region us-west-2
{
    "Configuration": {
        "FunctionName": "test-function",
        "FunctionArn": "arn:aws:lambda:us-west-2:322962841005:function:test-function",
        "RevisionId": "3ee79e30-bbf0-4a58-9337-b17a52dc4a6e",
         ...
    },
    "Code": {
        "RepositoryType": "S3",
        "Location": "https://awslambda-us-west-2-tasks.s3.us-west-2.amazonaws.com/snapshots/322962841005/test-function-9ac7357e-a87b-489e-ba10-351735be85ee?versionId=r12LlaOpoSHuzgrBk8Gtzx5hSTB4mUob&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEA8aCXVzLXdlc3QtMiJIMEYCIQDkj8WvX%2FglQRAPoNL7hEFV9XVVzUFfbWxcGXm94hUNnAIhAPy73r%2Fjb12oBJn5SpxB8Gnyd4M7UotoNRPCYAS7j2gWKtUECJj%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEQBRoMNTAyMjk3MDc2MTYzIgxrbESv%2FpxZYbqy8vwqqQTf4HwOan5Eli7odnwQI%2FdGE7hd6K0CjX2Bw9%2BEq72f9PWoBBeduezJrXAI%2FBw8OYPqiZv6CHNzU1cGmb%2FrCjmngvEIyJa2ROLpZSSuAClC6jX65qXvZB1VLLx2gaXX9ll8RF81OwCAM%2B2RO4M67MLSdv6USRQfFakFFTrOdk%2FrjUmofH%2F%2FTFciojHg8xJU2mqzm%2F4II7VAiEAjg%2BtqwJBEE2VY02usJWB3SAANRQOGBcx%2B0Y11aIlwxWW7ZVpjCdEtB3JvKttHfL8uqebUWa8pKw0iUaDtAf4d0weUG2M%2FUHMTvW4U4snIK6KHFVzDFLLeq3PRntaj%2BZO09dqvK586gnppME37NBUbSQ59yFA7t13tpJfFhoU9IJYGVOM0npKAbmphygJJrS3nhlxyM0Wq2j98STOLL6TKQSOGH1zHaEctqD98LolLTcLA1xgvcH9tiRY6rBjZ3OhWsMBN08rg8JT%2F9tG0VzabyT8nfdA%2FA7mp1kdPImUbnvtiwHte6sGo0nIGXQbTz2ypQs7Hsek%2FPKDJV1rQImwJo%2FNojo5agyC7iQyQKywW0bcpS26j7Haa9uXTurs7z55QqqPJkVONp68x109NGxnJ54gIkXZXmw5H7Kb3jAcw12QWwun3aKAySqdPvn36K%2BPhBGjZmYpWJ7DJIbtxHKIT7c0hli1JhSCTAIAk0PQkrFCtgt1jTxHYmfofoz7JRkSZ%2BhNcuOtzXb8l%2B9B1kIaSMIXxlZ8GOqgBRS%2F3ssabko4pgAC4cBjVPf9ioFzcQ83JnSOTWO%2Fq0jB4S6rhYZ%2FU3kqRn53uwo1nAYwcQSnoEwleT3DRGVV9uGSmPp%2F8jNx1%2BgMJ3G3Di8FqMfmJW4mcQaoFYY7uu0IB3nsoiEsO%2BGrR%2FdpPj2HgA3RvjUS%2FMXWE6%2B4CRSo37LsLnre19yuxqtczc%2BPr4%2BtjtE4OFwykCqS5Igugdspskdfsc%2Bibu31i&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230209T230900Z&X-Amz-SignedHeaders=host&X-Amz-Expires=600&X-Amz-Credential=ASIAXJ4Z5EHBQBUTO7MC%2F20230209%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=e90564a5ad7c878a96981d218313424c431c64756ac095562c39645526a2ba96"
    },
    "Tags": {
        "services.k8s.aws/controller-version": "lambda-unknown",
        "services.k8s.aws/namespace": "default"
    }
}

The API records only Code.RepositoryType and Code.Location. We want it to also record Code.S3Key, Code.S3Bucket, and Code.S3ObjectVersion.
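Note that the Configuration block (elided above) does include a CodeSha256 field, the base64-encoded SHA-256 digest of the deployed package, which can be queried directly, e.g.:

    $ aws lambda get-function --function-name test-function --region us-west-2 \
        --query 'Configuration.CodeSha256' --output text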

Describe the solution you'd like


The main goal is to store the state of the S3 fields (s3Key, s3Bucket) and use it to detect a change in state. We have thought of three solutions for saving a record of the S3 values.

  1. Include the fields in the status, i.e. add them to the API response. This would be a bigger task, since the fields would have to be added to the API response call.
    status:
      s3Key: X
      s3Bucket: Y
      s3ObjectVersion: Z
  2. Store them under metadata in function.yaml; metadata stores function-related information for Kubernetes. This way the record would be kept on the Kubernetes side, not the AWS side.
    metadata:
      annotations:
        s3Bucket: X
        s3Key: Z
  3. Store them using tags, as an array of {key, value} pairs. While describing a function in function.yaml we can add extra tags, which the API would return as normal tags but which would carry the state of Code.s3. This way the information would be stored on the AWS side.
    spec:
      name:
      code:
      tags:
        s3Key: X
        s3Bucket: Y

Please share your thoughts and ideas on the solutions mentioned above, and on other ways this issue could be handled.

Describe alternatives you've considered

a-hilaly commented 1 year ago

/cc @jaypipes @RedbackThomson @jljaco @embano1

valerena commented 1 year ago

An alternative to options 2 and 3 (although I would lean toward option 2 in this case) is to instead include some sort of CodeHash or CodeSha256: a hash (or really any value) that, when it changes, means the code needs to be updated. In theory it could be computed as the actual hash of the code, so that regardless of the actual location we know the code changed and can simply call UpdateFunctionCode. In practice, it can be any value that customers change manually when they change their code.
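For reference, Lambda's CodeSha256 is the base64-encoded (not hex) SHA-256 digest of the deployment package, so a matching value can be computed locally before upload. A minimal sketch, assuming the package is function.zip:

    # base64-encoded SHA-256 of the zip, the same format Lambda reports as CodeSha256
    $ openssl dgst -sha256 -binary function.zip | base64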

There's something like this in SAM (to indicate that the code changed, for auto-publishing versions): AutoPublishCodeSha256

embano1 commented 1 year ago

There was a discussion last year about loosening the meaning of status beyond "populated purely by observation". IMHO a good read to weigh some options: https://github.com/kubernetes/community/pull/5842/files

In this case I'd simplify by using hashing and perhaps tracking the value in an annotation (runtime experts can jump in on whether we already have simple ways to add custom status fields).
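As a rough sketch of the annotation idea (the annotation key below is a hypothetical example, not an existing ACK convention), the hash of the currently deployed package could be recorded on the resource and compared on the next reconcile:

    # record the deployed package's hash on the Function resource;
    # the annotation key is a hypothetical example
    $ kubectl annotate function test-function --overwrite \
        lambda.services.k8s.aws/last-code-sha256="$(openssl dgst -sha256 -binary function.zip | base64)"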

a-hilaly commented 1 year ago

Happy to consider the solution of tracking this information in the Status. I'm even happier with including a new field in spec.code that allows users to set their own SHA256.

Thinking something like:

..
spec:
  name: ...
  code:
    s3Key: KEY
    s3Bucket: BUCKET
    sha256: HASH

This leaves us with an open question:

jessebye commented 1 year ago

Has there been any progress here? We've just started testing lambda-controller and this is a dealbreaker. We can't update our functions after the initial creation.

jessebye commented 1 year ago

~Does anyone know what the behavior is if using an OCI image? If we use mutable tags and push new versions to the same tag, will invocations of the lambda pull the latest version? Or does it cache and use the old version?~

I've discovered that when using an OCI image, pushing new versions to the same tag will not immediately update the function. Lambda caches the old image and only refetches it at an undisclosed interval.

Another discovery is that updating the imageURI field to use a new tag will correctly update the function to use the new tag.
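So an immutable-tag workflow works around the caching: push the image under a new tag and point the field at it. A sketch using kubectl patch, assuming a Function resource named test-function and a placeholder ECR URI:

    # switching imageURI to a new tag is detected as a delta and updates the function
    $ kubectl patch function test-function --type merge \
        -p '{"spec": {"code": {"imageURI": "123456789012.dkr.ecr.us-west-2.amazonaws.com/my-fn:v2"}}}'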

a-hilaly commented 1 year ago

Hi @jessebye, we are working on a fix for spec.code.s3* changes in https://github.com/aws-controllers-k8s/lambda-controller/pull/88 - however, this requires a feature in code-generator that is not supported yet. The problem here is that the Lambda API caches the code in S3, and there is no way to compare the latest/desired s3Bucket, file name, etc. We will have to support this by adding a new field, spec.code.sha256, that users will update to trigger a code update.

If you use ECR images, imageURI updates work fine.

timam commented 9 months ago

Any update on this?

mhr3 commented 8 months ago

Also interested in this, and I've got a question - will we need to set the new sha field to the actual SHA of the S3 object? Or will it just be a placeholder to force an update?

Vandita2020 commented 4 months ago

The functionality to support updating the deployment package of a Lambda function in ACK is available now.

Vandita2020 commented 4 months ago

@mhr3 The expected behavior is for you to set the actual SHA256 of the S3 object; however, it does work as a placeholder to force an update, and it will work even if the SHA256 doesn't match.
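Putting it together, a sketch of a code update with the new field (assuming a Function named test-function and a package at s3://my-bucket/function.zip):

    # upload the new package, then set spec.code.sha256 so the controller
    # sees a delta and calls UpdateFunctionCode
    $ aws s3 cp function.zip s3://my-bucket/function.zip
    $ kubectl patch function test-function --type merge \
        -p "{\"spec\": {\"code\": {\"sha256\": \"$(openssl dgst -sha256 -binary function.zip | base64)\"}}}"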