aws-solutions / instance-scheduler-on-aws

A cross-account and cross-region solution that allows customers to automatically start and stop EC2 and RDS Instances
https://aws.amazon.com/solutions/implementations/instance-scheduler-on-aws/
Apache License 2.0

ASGHandler - ClientError (AccessDenied) when AssumeRole #551

Closed: FugroEgger closed this issue 4 months ago

FugroEgger commented 4 months ago

Describe the bug:

Stack region: eu-west-1
Stack-Name: cs-instance-scheduler
TagName: scheduler_period
UsingAWSOrganizations: Yes
regions: ap-southeast-1,ap-southeast-2,eu-central-1,eu-west-1,eu-west-3,eu-north-1,me-south-1,me-central-1,us-east-1,us-east-2,us-west-1

ASG scheduling works for the accounts using it.

Control Tower settings prevent creation of the default VPC but do not prevent IAM actions. The spoke account (111111111111) in this error does not use ASGs or any EC2 resources in any of the supported regions (VPC count: 0).

[ERROR] ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::999999999999:assumed-role/cs-AsgRequestHandler-Role/cs-instance-scheduler-ASGHandler0F6D6751-CZoivUbvaNX1 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::111111111111:role/cs-ASG-Scheduling-Role
Traceback (most recent call last):
  File "/var/task/aws_lambda_powertools/logging/logger.py", line 450, in decorate
    return lambda_handler(event, context, *args, **kwargs)
  File "/var/task/instance_scheduler/handler/asg.py", line 68, in lambda_handler
    [num_tagged_auto_scaling_groups, num_schedules] = schedule_auto_scaling_groups(
  File "/var/task/instance_scheduler/handler/asg.py", line 125, in schedule_auto_scaling_groups
    session: Final = assume_role(
  File "/var/task/instance_scheduler/util/session_manager.py", line 95, in assume_role
    raise ex
  File "/var/task/instance_scheduler/util/session_manager.py", line 75, in assume_role
    token: Final = _sts().assume_role(
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 553, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 1009, in _make_api_call
    raise error_class(parsed_response, operation_name)
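When the same error shows up for several accounts, it can help to pull the caller and target role out of the message so the affected spoke accounts can be listed. The following is a minimal sketch, not part of the solution: the function name `parse_assume_role_denial` and the `AssumeRoleDenial` type are hypothetical, and the regular expression only assumes the message wording shown in the log above.

```python
import re
from typing import NamedTuple, Optional


class AssumeRoleDenial(NamedTuple):
    """Fields extracted from an sts:AssumeRole AccessDenied message (hypothetical helper type)."""
    caller_arn: str
    target_role_arn: str
    target_account: str


# Pattern based only on the wording of the error message quoted in this issue.
_DENIED = re.compile(
    r"User: (?P<caller>arn:aws:sts::\d{12}:assumed-role/\S+) "
    r"is not authorized to perform: sts:AssumeRole "
    r"on resource: (?P<target>arn:aws:iam::(?P<account>\d{12}):role/\S+)"
)


def parse_assume_role_denial(message: str) -> Optional[AssumeRoleDenial]:
    """Return the caller ARN, target role ARN and target account, or None if no match."""
    match = _DENIED.search(message)
    if match is None:
        return None
    return AssumeRoleDenial(match["caller"], match["target"], match["account"])


# The message from the log above, split across lines for readability.
message = (
    "[ERROR] ClientError: An error occurred (AccessDenied) when calling the "
    "AssumeRole operation: User: arn:aws:sts::999999999999:assumed-role/"
    "cs-AsgRequestHandler-Role/cs-instance-scheduler-ASGHandler0F6D6751-CZoivUbvaNX1 "
    "is not authorized to perform: sts:AssumeRole on resource: "
    "arn:aws:iam::111111111111:role/cs-ASG-Scheduling-Role"
)

denial = parse_assume_role_denial(message)
if denial is not None:
    print(denial.target_account, denial.target_role_arn)
```

Grouping the parsed `target_account` values across log entries gives a list of spoke accounts to inspect.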

To Reproduce: I can't reproduce it in our development environment.

Expected behavior: No errors, as ASG scheduling works for the accounts using it.

Screenshots none

Additional context: All accounts are managed by Control Tower. Control Tower settings prevent creation of the default VPC, so supported scheduling regions can have 0 VPCs. No SCP prevents IAM actions.

CrypticCabub commented 4 months ago

Does this spoke account also have the 3.0.0 version of the spoke stack installed? This would be an expected error when 3.0.0 tries to assume the roles created by a 1.5.6 spoke stack.

FugroEgger commented 4 months ago

You are correct, that was the issue; it's working now. Thanks for the quick reply. Our CloudFormation update of the scheduler remote stack to 3.0.0 didn't run in some accounts.

I appreciate your work on this great solution
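For anyone hitting the same error after a partial spoke stack update, one way to find the accounts that were missed is to try the role assumption for each spoke account and collect the failures. This is a rough sketch under stated assumptions: `find_unassumable_roles` and `boto3_assume` are hypothetical helper names, the role name `cs-ASG-Scheduling-Role` is taken from the error message in this issue, and the result is only meaningful when run as a principal the spoke roles are meant to trust (the hub's ASG handler role in this case). The `fake_assume` stub stands in for AWS so the example runs without credentials.

```python
from typing import Callable, Dict, Iterable


def boto3_assume(role_arn: str) -> None:
    """Attempt sts:AssumeRole with boto3 (imported lazily so the sketch loads without it)."""
    import boto3

    boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName="spoke-role-check",  # arbitrary session name for this check
    )


def find_unassumable_roles(
    account_ids: Iterable[str],
    role_name: str,
    assume: Callable[[str], None] = boto3_assume,
) -> Dict[str, str]:
    """Map each account ID whose role could not be assumed to the error text."""
    failures: Dict[str, str] = {}
    for account_id in account_ids:
        role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
        try:
            assume(role_arn)
        except Exception as error:  # broad on purpose: any failure is reported
            failures[account_id] = str(error)
    return failures


def fake_assume(role_arn: str) -> None:
    """Stub for illustration: pretend account 111111111111 still has the old spoke stack."""
    if ":111111111111:" in role_arn:
        raise PermissionError("AccessDenied")


failed = find_unassumable_roles(
    ["111111111111", "222222222222"],
    "cs-ASG-Scheduling-Role",
    assume=fake_assume,
)
print(failed)
```

Dropping the `assume=fake_assume` argument makes the function call STS through boto3 instead; accounts that appear in the result are candidates for a spoke stack that was not updated.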