Open eduardomourar opened 3 years ago
do you have a specific errorcode/response? i have already added a couple... the specific error you get might depend on the order of tasks in your specific project.
even in verbose, here is the output:
Resource is not in the state stackCreateComplete (123456789012 = DevAccount)
ERROR: Stack billing-alarm in account 123456789012 (eu-west-1) update failed. reason: Resource is not in the state stackCreateComplete (123456789012 = DevAccount)
Resource is not in the state stackCreateComplete (use option --print-stack to print stack)
there is no stack in the target account, so there are no logs to check
Hello, we are experiencing a similar issue, which we believe is timing related.
This is from a fresh install; standard roles are being added from 000-organization-build/*. We are using a pipeline with a deployment account and this failed in the build phase.
What we did: added a new account in organization.yml to an OU that already existed, and committed. The pipeline then ran and failed.
Re-triggering the pipeline makes it succeed, which makes us think it is a timing issue around the account initialising or the role not being configured/available yet.
Here is the sanitised build log:
INFO: Executing: include 000-organization-build/organization-tasks.yml.
INFO: Executing: update-organization organization.yml.
OC::ORG::Account | New-Account | Create (1111111111111)
OC::ORG::Account | New-Account | CommitHash
OC::ORG::OrganizationalUnit | FooOU | Attach Account (New-Account)
OC::ORG::OrganizationalUnit | FooOU | CommitHash
INFO: done
INFO: Task OrganizationUpdate execute successful.
INFO: Executing: update-stacks 000-organization-build/org-formation-build.yml organization-formation-build.
INFO: Stack organization-formation-build already up to date.
INFO: Task OrganizationBuildPipeline execute successful.
INFO: Executing: update-stacks 000-organization-build/org-formation-role.yml organization-formation-role-master.
INFO: Stack organization-formation-role-master already up to date.
INFO: Task MasterOrganizationFormationRole execute successful.
INFO: Executing: update-stacks 000-organization-build/org-formation-role.yml organization-formation-role.
ERROR: Stack organization-formation-role in account 1111111111111 (ap-southeast-2) update failed. reason: User: arn:aws:sts::222222222222:assumed-role/OrganizationFormationBuildAccessRole/OrganizationFormationBuild is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::1111111111111:role/OrganizationAccountAccessRole (1111111111111 = New-Account)
User: arn:aws:sts::222222222222:assumed-role/OrganizationFormationBuildAccessRole/OrganizationFormationBuild is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::1111111111111:role/OrganizationAccountAccessRole (use option --print-stack to print stack)
ERROR:
ERROR: ==========================
ERROR: Stopped performing task(s)
ERROR: Following tasks failed:
ERROR: - Stack organization-formation-role in account 1111111111111 (ap-southeast-2) (1111111111111 = New-Account)
ERROR: ==========================
ERROR:
ERROR: Task OrganizationFormationRole execute failed. reason: Number of failed tasks 1 exceeded tolerance for failed tasks 0.
I'm having the same issue as @caviliar, but it only started when I updated to the latest version, 0.9.14; rolling back to the previous version, 0.9.5, fixed the problem.
great, thanks! will fix/look into this soon. also: My understanding is that retrying the build is a workaround. is this correct?
@OlafConijn Yes, re-triggering the build is a workaround.
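The retry workaround confirmed above can be automated in CI. What follows is only a minimal sketch, not something org-formation ships: `run_with_retries`, the attempt count and the delay are hypothetical choices, and the tasks file name in the usage comment is a placeholder.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay_seconds=120.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run cmd until it exits with code 0 or the attempts are used up.

    Returns the 1-based number of the attempt that succeeded and
    raises RuntimeError if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        # Pause before the next try so a freshly created account can settle.
        if attempt < attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"command still failing after {attempts} attempts: {cmd}")


# Hypothetical usage (the tasks file name is a placeholder):
# run_with_retries(["org-formation", "perform-tasks", "organization-tasks.yml"])
```

A blind retry hides the underlying cause, so it is probably best kept to a small number of attempts.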
looks like a bit of a mixed bag..... i have seen all sorts of different reasons that running perform-tasks after adding an account fails. including things that have been org-formation bugs and/or AWS behaviors when creating new accounts.
Not properly using DependsOn between tasks. this, for large organizations, is the most common cause. As tasks are run in parallel, tasks that depend on each other need to explicitly specify DependsOn. This problem can typically be worked around by retrying.
AWS Account Creation being eventually consistent. i somehow have the feeling that @caviliar, your issue fits in this bucket. I have seen this issue more in ap regions, less in us or eu regions. some services also take longer to be fully initialized (e.g. systems manager, i believe, fails more often than others).
around the time this bug was posted i too had seen a somewhat weird issue where stacks failed with a status that was not stackCreateComplete and, after further inspection, there is/was no stack. this seems to have been something temporary, have not seen this recently. have you @eduardomourar ?
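To illustrate the first cause: when tasks run in parallel, only declared dependencies constrain the order. The sketch below is not org-formation's scheduler. It uses Python's standard `graphlib` to group tasks into batches that could safely run together. `OrganizationUpdate` and `OrganizationFormationRole` are task names from the build log above; `BillingAlarm` and `Budgets` are hypothetical.

```python
from graphlib import TopologicalSorter


def parallel_batches(depends_on):
    """Group tasks into batches that can run in parallel.

    depends_on maps each task name to the tasks it depends on.
    A task only appears in a batch once all its dependencies
    were in an earlier batch.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Partly hypothetical task graph for illustration.
tasks = {
    "OrganizationUpdate": [],
    "OrganizationFormationRole": ["OrganizationUpdate"],
    "BillingAlarm": ["OrganizationFormationRole"],
    "Budgets": ["OrganizationFormationRole"],
}

for batch in parallel_batches(tasks):
    print(batch)
```

If `BillingAlarm` did not declare its dependency, it would land in the first batch and could run before the role it needs exists, which matches the kind of failure that succeeds on retry.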
dealing with these issues will continue to be a thing as it depends on AWS account creation behavior, org-formation behavior and customer configuration.
org-formation@0.9.15-beta.11 contains some improvements on this. soon will be released as 0.9.15. what i'll do is create something in the bug template to make sure the right things are added to the right GH issue so it will be easier to diagnose which specific issue is the underlying issue.
thanks!
I have a quite similar issue. After creating a new account and OU within one update, I'm getting the following errors:
$ org-formation update organization.yml --profile admin
WARN: AccessDenied: unable to log into account 123123123123. This might have various causes, to troubleshoot:
https://github.com/OlafConijn/AwsOrganizationFormation/blob/master/docs/access-denied.md
WARN: AccessDenied: unable to log into account 456456456456. This might have various causes, to troubleshoot:
https://github.com/OlafConijn/AwsOrganizationFormation/blob/master/docs/access-denied.md
WARN: AccessDenied: unable to log into account 456456456456. This might have various causes, to troubleshoot:
https://github.com/OlafConijn/AwsOrganizationFormation/blob/master/docs/access-denied.md
WARN: AccessDenied: unable to log into account 321321321321. This might have various causes, to troubleshoot:
https://github.com/OlafConijn/AwsOrganizationFormation/blob/master/docs/access-denied.md
ERROR: failed executing task: Create OC::ORG::Account PropDomainDev1Account AccessDenied: User: arn:aws:iam::344143226674:user/Alex_S is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::344143226674:role/OrganizationAccountAccessRole
ERROR: error: AccessDenied, aws-request-id: 1717c7fe-5047-4000-8bc9-2e43195efebc
ERROR: User: arn:aws:iam::987987987987:user/Alex_S is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::987987987987:role/OrganizationAccountAccessRole
Alex_S has the FullAccess role.
P.S. The WARNs are there permanently.
hi @dobeerman
i think your issue is somewhat different. looking at the logs, your issue seems to be that accounts 123123123123, 456456456456, 321321321321 (etc) do not have the role OrganizationAccountAccessRole provisioned.
information about this can be found here.
this problem will indeed persist until either:
I just saw this in our CI as well. Here's the PR that caused the failure, https://github.com/Sage-Bionetworks-IT/organizations-infra/pull/111/files.
That PR is basically adding a new AWS account, putting it in an OU and applying some budget tags, similar to the ofn reference project.
It failed on the first run with a similar error, "Resource is not in the state stackCreateComplete". The account was created though. We re-ran the build and the 2nd time there was no error.
We have created accounts with similar PRs in the past and saw no error on the 1st run for those. We've only created a few accounts so far, so we can't tell how often this happens. One significant change which might have caused this issue is that we added the --max-concurrent-stacks 100 and --max-concurrent-tasks 100 options to our CI's perform-tasks command. We set those to 100 because it was taking approximately 20-30 mins to create an account.
interesting @zaro0508, thanks for sharing. will look into this
Whenever a new account is added through OrgFormation, the first pipeline run succeeds at creating the account and assigning it to the OU, but then fails in every task for that new account. I believe AWS Organizations has not really finished setting up the account behind the scenes, because the next build run succeeds a few minutes later.
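If the account really is just not ready yet, a pipeline step could poll until the cross-account role can be assumed before running the remaining tasks. This is a rough sketch under assumptions, not an org-formation feature: `wait_until` and `can_assume_role` are hypothetical helpers, the timeout and interval are arbitrary, and `can_assume_role` assumes boto3 is installed with credentials for the build role.

```python
import time


def wait_until(probe, timeout_seconds=600.0, interval_seconds=15.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or the timeout elapses.

    Returns True as soon as probe() succeeds, False on timeout.
    """
    deadline = clock() + timeout_seconds
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_seconds)


def can_assume_role(role_arn):
    """Return True if sts:AssumeRole on role_arn currently succeeds.

    Assumption: boto3 is available and credentials are configured;
    it is imported lazily so wait_until works without it.
    """
    import boto3
    from botocore.exceptions import ClientError

    try:
        boto3.client("sts").assume_role(
            RoleArn=role_arn, RoleSessionName="readiness-probe"
        )
        return True
    except ClientError:
        return False
```

For the sanitised account from the build log above, it could be called as `wait_until(lambda: can_assume_role("arn:aws:iam::1111111111111:role/OrganizationAccountAccessRole"))`. A successful AssumeRole does not guarantee that every service in the new account is initialised, so a retry may still be needed.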