data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0

Merge 2.6.0 to 2.0.0 #1508

Closed shivalkarrahul closed 1 month ago

shivalkarrahul commented 1 month ago

Hello Team,

We are working on merging 2.6.0 into 2.0.0, which also contains our custom code changes. We resolved all the conflicts and merged the code.

Now we are facing an issue in the "Assets" stage. (There could be more issues; the Assets stage is simply the first one that failed.)

Error:

verbose: Loaded manifest from assembly-dc5-poc-cicd-stack-dc5-ops5-backend-stage/dc5poccicdstackdc5ops5backendstagebackendstackA47E2DD7.assets.json: 24 assets found
verbose: Applied selection: 1 assets selected.
info : [0%] start: Publishing <filename>:<infra-account-id>-<region>
verbose: [0%] check: Check s3://cdk-hnb659fds-assets-<infra-account-id>-<region>/<filename>.zip
error : [100%] fail: Missing credentials in config, if using AWS_CONFIG_FILE, set AWS_SDK_LOAD_CONFIG=1
Failure: CredentialsError: Missing credentials in config, if using AWS_CONFIG_FILE, set AWS_SDK_LOAD_CONFIG=1

Could you please help here? I am attaching the pipeline.py files for your reference.

main-pipeline.txt -> This contains our pipeline.py from 2.0.0 along with the custom changes. This works fine for us.
2.6-pipeline.txt -> This is the pipeline.py from 2.6.0.
poc-pipeline.txt -> This is the pipeline.py that contains the merged code.

Note: We are using GitHub as our source code repo.

2.6-pipeline.txt main-pipeline.txt poc-pipeline.txt

Also, we would be glad to know a better way to perform the upgrade when we have custom code in 2.0.0 and want to move to 2.6.0.

Thank you in advance.

dlpzx commented 1 month ago

Hi @shivalkarrahul, thanks for opening an issue. At first sight, the issue seems to be related to an override in the CDK config files. The "Assets" stage is a generic stage provided by CDK Pipelines that uploads the synthesized CloudFormation templates into the CDK bucket of every AWS account in use. In the logs above, it is uploading the backend stack template into the S3 bucket for the account where the infra is deployed, 80xxxxxxxxxx (I removed the file because it contained AWS account IDs).

We need to ensure:
  1) CDK trust relationship: the infra account 80XXX needs to trust the tooling account where this pipeline is deployed.
  2) The account in cdk.json is the one we want to use (80XXX).
  3) cdk.context.json is filled with details for the tooling as well as the infra accounts.
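To make the first check concrete, here is a minimal Python sketch (not part of data.all or the CDK) that takes a role's trust policy document, for example the JSON copied from the IAM console, and lists the accounts it trusts. The function name and the account IDs 111111111111 (infra) and 222222222222 (tooling) are placeholders introduced for illustration, and the sketch only looks at `Allow` statements with an `AWS` principal.

```python
def trusted_account_ids(trust_policy: dict) -> set:
    """Collect the AWS account IDs named as principals in Allow statements.

    Simplification: conditions and actions are ignored, so this only
    answers "is the account listed at all?".
    """
    accounts = set()
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):  # e.g. the wildcard principal "*"
            continue
        aws_principals = principal.get("AWS", [])
        if isinstance(aws_principals, str):
            aws_principals = [aws_principals]
        for value in aws_principals:
            # Either a full ARN (arn:aws:iam::<account>:root) or a bare account ID.
            parts = value.split(":")
            accounts.add(parts[4] if len(parts) >= 6 else value)
    return accounts


# Placeholder account IDs, assumed for this example only.
INFRA_ACCOUNT = "111111111111"
TOOLING_ACCOUNT = "222222222222"

example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Principal": {"AWS": f"arn:aws:iam::{INFRA_ACCOUNT}:root"},
        }
    ],
}

print("tooling account trusted:", TOOLING_ACCOUNT in trusted_account_ids(example_policy))
```

A role whose policy does not list the tooling account cannot be assumed from the pipeline, which is the kind of gap this check is meant to surface.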

As it is an error in Assets (nothing that data.all modifies), I would first check these small items. Let us know if it helps, and if it does not, we will take a closer look at the changes.

shivalkarrahul commented 1 month ago

Hello @dlpzx ,

Thank you for the quick response.

  1. We have 2 Roles in the Infra Account:
    a. cdk-hnb659fds-deploy-role-infra-account-number-eu-west-1: We have our Infra in the eu-west-1 region. This role has 2 trust relationships (arn:aws:iam::infra-account-number:root and arn:aws:iam::tooling-account-number:root).
    b. cdk-hnb659fds-deploy-role-infra-account-number-us-east-1: We don't have any infra here.

  2. I can see 2 buckets in the Infra Account:
    a. cdk-hnb659fds-assets-infra-account-number-eu-west-1: The Assets stage is trying to write here.
    b. cdk-hnb659fds-assets-infra-account-number-eu-west-1

  3. cdk.json and cdk.context.json: These files seem correct. Attaching them here for your reference:
    a. cdk.json
    b. cdk.context.json

Please let me know your thoughts. Your help would be appreciated.

Thank you.

dlpzx commented 1 month ago

Hi @shivalkarrahul, thanks for looking it up :)

Digging into your issue, I ran into this Stack Overflow post, which seems very similar to what you are seeing.

For us-east-1 (if you use an internet-facing frontend) and for eu-west-1 there should be a CloudFormation stack called CDKToolkit whose trust and trust-for-lookup parameters include the tooling account. You can check the trust in the Parameters tab of the CloudFormation stack (you can also edit the stack and add more parameters).

Screenshot 2024-09-03 at 13 56 00

As a result, there should be 5 IAM roles for each region. All of them should have the tooling account in their trust policies except for the cfn-exec-role (that role is a service role with very open permissions, assumable only by CloudFormation).
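As a rough aid for that check, the sketch below lists the five role names per account and region. The deploy and cfn-exec names follow the pattern seen in this thread; the other three (file-publishing, image-publishing, lookup) are taken from the default CDK bootstrap template, so treat them as an assumption if your bootstrap template is customized. The helper names and the account ID are placeholders.

```python
# Role kinds created by the default CDK bootstrap template (assumed unmodified).
BOOTSTRAP_ROLE_KINDS = (
    "cfn-exec-role",
    "deploy-role",
    "file-publishing-role",
    "image-publishing-role",
    "lookup-role",
)


def bootstrap_role_names(account_id: str, region: str, qualifier: str = "hnb659fds") -> list:
    """Return the five bootstrap role names for one account and region."""
    return [f"cdk-{qualifier}-{kind}-{account_id}-{region}" for kind in BOOTSTRAP_ROLE_KINDS]


def roles_expected_to_trust_tooling(account_id: str, region: str, qualifier: str = "hnb659fds") -> list:
    """Every bootstrap role except cfn-exec-role, which only CloudFormation assumes."""
    return [
        name
        for name in bootstrap_role_names(account_id, region, qualifier)
        if f"-cfn-exec-role-{account_id}-" not in name
    ]


# Placeholder infra account ID, assumed for this example only.
for region in ("eu-west-1", "us-east-1"):
    for role_name in roles_expected_to_trust_tooling("111111111111", region):
        print(role_name)
```

Each printed role is one whose trust policy would be expected to include the tooling account.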

Screenshot 2024-09-03 at 13 56 59

I do NOT recommend deleting and recreating the CDKToolkit stack, because it can lead to issues with the CDK Pipelines permissions on the buckets. I always recommend re-running the cdk bootstrap command or updating the parameters of the CloudFormation stack.
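For illustration, a re-run of cdk bootstrap that trusts the tooling account could be assembled as in the sketch below. The `--trust`, `--trust-for-lookup` and `--cloudformation-execution-policies` options are standard CDK CLI flags, but the exact command used for this deployment is not shown in the thread; the account IDs and the AdministratorAccess policy ARN are assumptions to be replaced with your own values. The sketch only prints the command and does not run it.

```python
import shlex


def bootstrap_command(
    infra_account: str,
    region: str,
    tooling_account: str,
    execution_policy_arn: str = "arn:aws:iam::aws:policy/AdministratorAccess",  # assumed placeholder
) -> list:
    """Build the argument list for re-bootstrapping one account/region pair."""
    return [
        "cdk",
        "bootstrap",
        f"aws://{infra_account}/{region}",
        "--trust",
        tooling_account,
        "--trust-for-lookup",
        tooling_account,
        "--cloudformation-execution-policies",
        execution_policy_arn,
    ]


# Placeholder account IDs, assumed for this example only.
command = bootstrap_command("111111111111", "eu-west-1", "222222222222")
print(shlex.join(command))
```

The printed command would be run with credentials for the infra account, once per region that hosts a CDKToolkit stack.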

I know we are going back and forth on the same point, but could you confirm the above?

shivalkarrahul commented 1 month ago

Hello @dlpzx ,

Thanks for your response.

I can confirm that:

  1. Infra Account:
    a. This has 5 IAM Roles for each region.
    b. 5 IAM Roles for us-east-1: 4 Roles trust both the Tooling and Infra Accounts; the cfn-exec Role does not, as expected.
    c. 5 IAM Roles for eu-west-1: A few Roles trust only the Infra Account. Their policy lists the Infra Account ARN twice in the Principal section.
Screenshot 2024-09-03 at 5 54 30 PM

We noticed this in one of the initial logs and manually changed the deploy role only, as we were not aware that there are 5 roles in total.

    Initial Error: Resource handler returned message: "arn:aws:iam::tooling-account-number:role/dc5-poc-cicd-stack-role-name is not authorized to perform AssumeRole on role arn:aws:iam::infra-account-number:role/cdk-hnb659fds-deploy-role-infra-account-number-eu-west-1"

  1. Is it OK if we change the Trust in the Roles manually?
  2. Do the Roles in the Tooling Account also need to trust the Infra Account? 4 Roles in the Tooling Account trust only the Tooling Account itself, and list it twice.
  3. We are unsure how this got changed.

Thank you.

shivalkarrahul commented 1 month ago

Hi @dlpzx ,

We resolved the issue in the "Assets" stage by manually adding Tooling Account trust to the 4 Roles in the Infra Account.
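The manual change described above amounts to an edit of each role's trust policy document. The sketch below shows one hypothetical way to express that edit on the JSON itself: it appends the tooling account to the first `Allow` statement that has an `AWS` principal and drops duplicate entries, such as the Infra Account ARN appearing twice. The function name and account IDs are placeholders, and the thread does not show the exact policy that was applied.

```python
import copy
import json


def add_trusted_account(trust_policy: dict, account_id: str) -> dict:
    """Return a copy of the policy that also trusts account_id.

    The original document is left untouched; duplicate principals in the
    edited statement are removed while keeping their order.
    """
    updated = copy.deepcopy(trust_policy)
    new_arn = f"arn:aws:iam::{account_id}:root"
    statements = updated.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for statement in statements:
        principal = statement.get("Principal", {})
        if statement.get("Effect") != "Allow" or not isinstance(principal, dict):
            continue
        if "AWS" not in principal:
            continue
        current = principal["AWS"]
        if isinstance(current, str):
            current = [current]
        # dict.fromkeys keeps the first occurrence of each ARN, in order.
        principal["AWS"] = list(dict.fromkeys(current + [new_arn]))
        return updated
    raise ValueError("no Allow statement with an AWS principal found")


# Placeholder account IDs: 111111111111 = infra, 222222222222 = tooling.
infra_arn = "arn:aws:iam::111111111111:root"
before = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Principal": {"AWS": [infra_arn, infra_arn]},  # duplicate, as observed in the thread
        }
    ],
}
after = add_trusted_account(before, "222222222222")
print(json.dumps(after, indent=2))
```

Manual edits like this can be overwritten the next time the CDKToolkit stack is updated, so keeping the stack's trust parameters in line with them avoids the roles drifting again.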

Will get back to you if any help is required.

Thank you so much.

dlpzx commented 1 month ago

Hi @shivalkarrahul, glad to hear you could resolve the issue. It must have been some overlap between cdk bootstrap commands. I'll close this issue, and if you need any more support, please do not hesitate to open another one. Best!