hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

Terraform does not accept source_code_hash #17989

Closed ghost closed 3 months ago

ghost commented 3 years ago

This issue was originally opened by @andormarkus as hashicorp/terraform#28018. It was migrated here as a result of the provider split. The original body of the issue is below.


Hi All,

We are on Terraform 0.14.6 and are experiencing the following issue. We provide source_code_hash for the aws_lambda_layer_version resource. Terraform accepts it in the plan but writes a completely different value to the state file.

In the plan, the source_code_hash is FyN0P9BvuTm023dkHFaWvAGmyD0rlhujGsPCTqaBGyw=; however, in the state file it becomes c3forIEso3mJh74PY6HrhFK94GfJvQ4zG9rEIgBCBhw=.

When I check the layer with the AWS CLI, it reports "CodeSha256": "c3forIEso3mJh74PY6HrhFK94GfJvQ4zG9rEIgBCBhw=".

Based on this, it does not matter what source_code_hash I provide: I cannot override the hash computed from filename.

TF config.

resource "aws_lambda_layer_version" "loader" {
  layer_name          = "loader"
  compatible_runtimes = ["python3.8"]

  filename         = "lambda_layer.zip"
  source_code_hash = filebase64sha256("lambda_layer.zip")
}

TF plan looks like this

 # aws_lambda_layer_version.loader will be created
  + resource "aws_lambda_layer_version" "loader" {
      + arn                         = (known after apply)
      + compatible_runtimes         = [
          + "python3.8",
        ]
      + created_date                = (known after apply)
      + filename                    = "lambda_layer.zip"
      + id                          = (known after apply)
      + layer_arn                   = (known after apply)
      + layer_name                  = "loader"
      + signing_job_arn             = (known after apply)
      + signing_profile_version_arn = (known after apply)
      + source_code_hash            = "FyN0P9BvuTm023dkHFaWvAGmyD0rlhujGsPCTqaBGyw="
      + source_code_size            = (known after apply)
      + version                     = (known after apply)
    }

However, in the state file I see the following:


  {
      "mode": "managed",
      "type": "aws_lambda_layer_version",
      "name": "loader",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "arn": ":",
            "compatible_runtimes": [
              "python3.8"
            ],
            "created_date": "2021-03-08T23:33:40.408+0000",
            "filename": "lambda_layer.zip",
            "id": "",
            "layer_arn": "",
            "layer_name": "",
            "license_info": "",
            "s3_bucket": null,
            "s3_key": null,
            "s3_object_version": null,
            "signing_job_arn": "",
            "signing_profile_version_arn": "",
            "source_code_hash": "c3forIEso3mJh74PY6HrhFK94GfJvQ4zG9rEIgBCBhw=",
            "source_code_size": 5391195,
            "version": "5"
          },
          "sensitive_attributes": [],
          "private": "bnVsbA==",
          "dependencies": [
            ""
          ]
        }
      ]
    },

anGie44 commented 3 years ago

Hi @andormarkus, thank you for raising this issue, and apologies that you ran into this behavior. Looking at the resource code, the source_code_hash value is not directly used at create or read time, so the value stored in state after a terraform apply is the SHA-256 hash of the layer archive as determined by the AWS API. For the purpose of creating a new resource, I'd therefore recommend not configuring that attribute and letting Terraform create the resource from the given filename. Interestingly enough, the resource documentation (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_layer_version#source_code_hash) notes that this argument is designed to enable resource updates (by forcing a new resource, I believe).

andormarkus commented 3 years ago

Hi @anGie44

Our deployment workflow for Python lambdas looks like this: the developer provides the source code of the function and lists all third-party (pip) dependencies in the requirements.txt file. Based on requirements.txt, the CI/CD pipeline generates a zip file for the lambda layer.

When a pip package is not a pure Python package, like wrapt (which aws-xray-sdk needs), pip generates a shared object (.so) file for the C code. The hash of the .so file changes on every build, even though I manually set the create_ts and the package version (see the package generation code below).

Due to this behaviour, the hash of the package is always different even though the hash of the source code does not change. I think source_code_hash should be able to override the default behaviour (using the AWS-provided CodeSha256).

Please let me know if you need more information.

    # Install pip dependencies into the package directory
    if [ -f "requirements.txt" ]; then
        pip3 install --quiet --upgrade --target "$PKG_DIR" --requirement requirements.txt
    fi

    # Remove dist-info directories: they caused hash changes and are not needed at runtime
    cd "$PKG_DIR"
    dirs_to_remove=(*/)
    for dir_i in "${dirs_to_remove[@]}"; do
        if [[ $dir_i == *dist-info/ ]]; then
            rm -r "$dir_i"
        fi
    done

    # Remove __pycache__ directories and compiled bytecode as well
    # (-prune stops find from descending into directories it is about to delete)
    find . \( -name '__pycache__' -o -name '*.pyc' -o -name '*.pyo' \) -prune -exec rm -rf {} +

    # Set every file timestamp to 202001010000 so the zip hash is not affected by timestamps
    cd "$CODE_DIR/$dir"
    find . -exec touch -t 202001010000 {} \;
    zip -qr9X lambda_layer.zip "$PKG_DIR"

Thanks, Andor

Zeal0us commented 3 years ago

Would really like to see this changed... At present either my lambdas always update (because I use the source_code_hash), or never update when I don't, even when the code has changed. If I had control of this value, I wouldn't need to worry about it. I'm probably going to need to just do this myself and then programmatically update the source_code_hash when my own stored hash changes. Really shouldn't be necessary though :/

This issue doubles my build time (the terraform component goes from what should be nearly 0 to trying to push around 50 lambdas...)

The value returned by AWS isn't going to be useful for everyone. My lambda zip will change every single build because of unique values in the build process (due to codebuild) that are picked up by various NPM packages and inserted into various package.json files. Would be great if I could tell terraform to only look at certain files (basically my files that exist prior to my build process...).

dekimsey commented 2 years ago

With respect to aws_lambda_function as well, is this a change in the AWS or TF API? In older code, we'd set source_code_hash=data.archive_file.foo.output_base64sha256 when we create lambdas (simple python scripts packaged at runtime with an archive data source). I verified and all our existing older code indeed sets it.

Recently, while trying to implement a new lambda (via a downloaded Go zip package), I kept running into a behavior where the source_code_hash repeatedly changes.

Given TF:

resource "aws_lambda_function" "this" {
  function_name    = module.label.id
  filename         = "${path.module}/${shell_script.pkg.output["filename"]}"
  #source_code_hash = base64encode(shell_script.pkg.output["hash"])
  source_code_hash = "NGI3MWUwZWQ5OGEwMmZjODBlM2ZhYzI1YzY3NmE4NjNmOWQ2NjcyMjI0Zjg1YjJjOGE5N2M3NjYwNjE5ZDdjNg=="
  runtime          = "go1.x"
  handler          = "app"
  timeout          = 300 # seconds
  role             = aws_iam_role.lambda.arn
  tags             = module.label.tags
}

The apply churns on the source_code_hash:

  ~ resource "aws_lambda_function" "this" {
        id                             = "demo"
      ~ last_modified                  = "2021-09-23T13:51:14.495+0000" -> (known after apply)
      ~ source_code_hash               = "S3Hg7ZigL8gOP6wlxnaoY/nWZyIk+FssipfHZgYZ18Y=" -> "NGI3MWUwZWQ5OGEwMmZjODBlM2ZhYzI1YzY3NmE4NjNmOWQ2NjcyMjI0Zjg1YjJjOGE5N2M3NjYwNjE5ZDdjNg=="
    }

The computed value:

$ echo S3Hg7ZigL8gOP6wlxnaoY/nWZyIk+FssipfHZgYZ18Y= | base64 -d | xxd
00000000: 4b71 e0ed 98a0 2fc8 0e3f ac25 c676 a863  Kq..../..?.%.v.c
00000010: f9d6 6722 24f8 5b2c 8a97 c766 0619 d7c6  ..g"$.[,...f....

vs

The given value:

$ echo NGI3MWUwZWQ5OGEwMmZjODBlM2ZhYzI1YzY3NmE4NjNmOWQ2NjcyMjI0Zjg1YjJjOGE5N2M3NjYwNjE5ZDdjNg== | base64 -d | xxd
00000000: 3462 3731 6530 6564 3938 6130 3266 6338  4b71e0ed98a02fc8
00000010: 3065 3366 6163 3235 6336 3736 6138 3633  0e3fac25c676a863
00000020: 6639 6436 3637 3232 3234 6638 3562 3263  f9d6672224f85b2c
00000030: 3861 3937 6337 3636 3036 3139 6437 6336  8a97c7660619d7c6
$ sha256sum ./lambda.zip
4b71e0ed98a02fc80e3fac25c676a863f9d6672224f85b2c8a97c7660619d7c6  ./lambda.zip

This confused me mightily until I read the code and found this ticket. If it's computed and has (apparently) no real relationship to the actual archive's sha256 digest, should that be explicitly documented? In fact, the example given even goes out of its way to compute this value. So I'm not sure what to make of this behavior, or if I'm just holding it wrong (which it totally feels like right now).
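Editorial note: the two dumps above appear to show the same digest in two different encodings. The computed value decodes to the raw 32 digest bytes, while the given value decodes to the 64-character hex text of that digest, which suggests the hex string was base64-encoded a second time. The sketch below checks this with standard tools. The function names are made up for illustration, and file_b64_sha256 assumes the openssl CLI is installed; it is one possible way to produce the raw-digest form (the same form Terraform's filebase64sha256 returns) from a packaging script.

```shell
# Decode a base64 string and print the decoded bytes as lowercase hex.
b64_to_hex() {
  printf '%s' "$1" | base64 -d | od -An -v -tx1 | tr -d ' \n'
}

# Decode a base64 string and print the decoded bytes as plain text.
b64_to_text() {
  printf '%s' "$1" | base64 -d
}

# Base64 of the raw SHA-256 digest of a file (assumes the openssl CLI exists).
file_b64_sha256() {
  openssl dgst -sha256 -binary "$1" | base64
}

# Values copied from the plan output in the comment above.
computed='S3Hg7ZigL8gOP6wlxnaoY/nWZyIk+FssipfHZgYZ18Y='
given='NGI3MWUwZWQ5OGEwMmZjODBlM2ZhYzI1YzY3NmE4NjNmOWQ2NjcyMjI0Zjg1YjJjOGE5N2M3NjYwNjE5ZDdjNg=='

b64_to_hex "$computed"; echo   # raw digest bytes, shown as hex
b64_to_text "$given"; echo     # hex text that had been base64-encoded
```

Both commands print the digest that sha256sum reported for lambda.zip, so in this case the archive hash and the value in state agree; only the encoding of the configured value differs.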

richard-mck commented 2 years ago

I'm encountering the same issue using python lambdas. I've refactored our deployment process to remove any code that might be introducing unexpected changes to the hash (dist-info, pyc files etc). I've also downloaded the resulting zip files and diff'd the contents which shows they're exactly the same but are still generating different hashes.

Maybe I'm missing something obvious?

PaulF2022-55 commented 2 years ago

I am also facing the same issue with my Node.js lambdas. Has the team proposed any resolution or solution? Are we expecting a fix in the AWS provider for this?

glg-satish-tripathi commented 1 year ago

Hi there,

I am facing the same issue with source_code_hash for a Lambda function: the source_code_hash value changes every time we run Terraform, even though the file has not actually changed. After reading this thread I understand why this is happening. When can we expect it to be resolved? Alternatively, could you please update the documentation if it forces an update on every run? https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_function#source_code_hash

Thanks, Satish

ldjoudi commented 1 year ago

I am facing the same issue: terraform plan shows the Lambda source_code_hash changing from an original value to an updated value over and over again. The src and dst source_code_hash values have been the same between builds, as mentioned above. I am also using:

    filename         = data.artifactory_file.foo_zip.output_path
    source_code_hash = data.artifactory_file.foo_zip.sha256

Does anyone know where HashiCorp stands on a fix, and whether there are any workarounds? We are trying to push for Terraform adoption, and it would be really unfortunate for this basic functionality to be broken. The goal is simply to deploy only the Lambdas that actually changed each time a build is triggered.

Shahab96 commented 1 year ago

Has there been any update on this issue?

rfornea commented 1 year ago

Any updates on this? We need a way to give terraform a hash that we control so that lambdas don't get updated every time when there hasn't been a change.

Alternatively, anybody know a way to force the build process and zipping process to always produce files with the same hashes, if the file content hasn't changed?

Zeal0us commented 1 year ago

> Any updates on this? We need a way to give terraform a hash that we control so that lambdas don't get updated every time when there hasn't been a change.
>
> Alternatively, anybody know a way to force the build process and zipping process to always produce files with the same hashes, if the file content hasn't changed?

What I ended up doing was pointing Terraform at an s3_bucket location and only updating if there was a change in that S3 object. I skip building and updating that object when possible by doing the hashing myself before any NPM build activity: I store the result of find ./ -type f -print0 | sort -z | xargs -0 sha256sum | sha256sum (run from the lambda directory) in S3, then compare it in a script inside my buildspec.yml (I'm using CodeBuild) that does the actual build process for my lambdas and containers.
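Editorial note: the workaround described above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual script: the function names tree_hash and needs_rebuild are hypothetical, the stored hash is read from a local file rather than from S3, and LC_ALL=C is added as an assumption so that the sort order does not depend on the build machine's locale.

```shell
# Fingerprint a source tree: hash every regular file in a stable order, then
# hash the resulting list of per-file digests and relative paths.
tree_hash() {
  (cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum | sha256sum | cut -d' ' -f1)
}

# Exit status 0 (rebuild needed) when no fingerprint has been stored yet or
# when the current fingerprint differs from the stored one.
needs_rebuild() {
  src_dir=$1
  stored_hash_file=$2
  [ ! -f "$stored_hash_file" ] || [ "$(tree_hash "$src_dir")" != "$(cat "$stored_hash_file")" ]
}

# Hypothetical usage inside a build script:
#   if needs_rebuild ./my-lambda ./my-lambda.sha256; then
#     ... build and upload the zip ...
#     tree_hash ./my-lambda > ./my-lambda.sha256
#   fi
```

Because the fingerprint is taken from the source files before the build runs, values that the build tooling injects into the package do not affect it. The fingerprint does include relative file paths, so renaming a file counts as a change.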

dacevedo12 commented 10 months ago

Any updates? This is a pretty old issue

oleksandrsv commented 10 months ago

My lambda is triggering drift every day.

TxMat commented 5 months ago

still an issue in 2024. any updates ?

rahul6941 commented 5 months ago

Any workaround for this?

clebermasters commented 4 months ago

The issue persists despite using the latest version of Terraform, 1.8.2

github-actions[bot] commented 3 months ago

Warning: This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them.

Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed.

github-actions[bot] commented 3 months ago

This functionality has been released in v5.51.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

github-actions[bot] commented 2 months ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.