Open markgov opened 3 days ago
Rich Green I shouldn't have but , I've gone further down the rabbit hole on the API error we were looking at earlier. I found that when I ran the secure-baselines code locally I was able to trigger the same alert in Cloudwatch that we've been seeing. I looked at debug logs in terraform and got the following: 2024-07-02T12:26:49.627+0100 [DEBUG] provider.terraform-provider-aws_v5.45.0_x5: HTTP Response Received: aws.region=eu-west-2 http.duration=201 http.response.body= | {"__type":"AccessDeniedException","Message":"You don't have permissions to access this resource."} rpc.system=aws-api tf_aws.sdk=aws-sdk-go tf_provider_addr=registry.terraform.io/hashicorp/aws @caller=github.com/hashicorp/aws-sdk-go-base/v2/awsv1shim/v2@v2.0.0-beta.53/logger.go:157 http.response.header.x_amzn_requestid=37314104-42fb-4607-ada2-7f8fd4c4d9e2 @module=aws http.response.header.content_type=application/x-amz-json-1.1 http.response.header.date="Tue, 02 Jul 2024 11:26:49 GMT" http.status_code=400 tf_data_source_type=aws_organizations_organization tf_rpc=ReadDataSource http.response_content_length=98 rpc.method=ListAccounts rpc.service=Organizations tf_aws.signing_region=us-east-1 tf_mux_provider="*schema.GRPCProviderServer" tf_req_id=9c6f6f4a-ca2e-63b8-05a1-e3904f10a9bd timestamp="2024-07-02T12:26:49.627+0100" 12:53 So I think this is the troublemaker... https://github.com/ministryofjustice/modernisation-platform/blob/main/terraform/environments/bootstrap/secure-baselines/locals.tf#L3 I'm trying to work out where it's getting used I think in the config module in the baselines module
This looks like a pretty simple fix? One option would be to reference a separate provider such as the sso-readonly
role and pass it in to the module?
provider "aws" {
region = "eu-west-2"
alias = "sso-readonly"
assume_role {
role_arn = "arn:aws:iam::${local.environment_management.aws_organizations_root_account_id}:role/ModernisationPlatformSSOReadOnly"
}
}
Alternatively it might be possible to investigate the role/policy being used here and add the missing IAM permission: organizations:ListAccounts
The strange thing about this one is it looks like a lot of our MP code uses the aws_organizations_organization
datasource as it's a useful way to query details about the organisation from a member account.
The github-actions
role that is running the baselines already has "Administrator" privileges so adding organizations:ListAccounts
doesn't seem necessary.
Weirdly if I run some of the other directories that use this datasource e.g. terraform/environments/bootstrap/member-bootstrap
then there are no API errors in cloudtrail 🤔 (but the verbose logs I get locally still show an error like I posted above)
If I run it from terraform/environments/bootstrap/secure-baselines
I do get the error, but if i replace that data call with feeding in the root account ID explicitly then it doesn't cause the error.
So.. we could change the way we get that data, like you have suggested @dms1981 - but would it be weird to do it just in one place where it's failing? We use it a lot... I still can't get my head around exactly why it's just causing those API failures when secure-baselines runs 🤔
User Story
As a Modernisation platform engineer I expect the secure baseline not to create alerts in the low priority channel So that we can have a clean channel
Value / Purpose
After looking at the alerts in the low priority alarms channel https://moj.enterprise.slack.com/archives/C02PFCG8M1R after looking into the alarms we have found that the secure baseline is generating 193 alerts in cloud trail with an error message of
This needs to be looked into and corrected
Useful Contacts
David Elliot, Richard Green, Mark Roberts
Additional Information
No response
Definition of Done