aws-solutions / instance-scheduler-on-aws

A cross-account and cross-region solution that allows customers to automatically start and stop EC2 and RDS Instances
https://aws.amazon.com/solutions/implementations/instance-scheduler-on-aws/
Apache License 2.0
542 stars, 264 forks

ASGHandler - ClientError (ValidationError) when using a high number of configured schedules #549

Closed FugroEgger closed 2 months ago

FugroEgger commented 2 months ago

Describe the bug

Stack region: eu-west-1
Stack-Name: cs-instance-scheduler
TagName: scheduler_period
UsingAWSOrganizations: Yes
regions: ap-southeast-1,ap-southeast-2,eu-central-1,eu-west-1,eu-west-3,eu-north-1,me-south-1,me-central-1,us-east-1,us-east-2,us-west-1
63 possible schedules

The ASGHandler (action: scheduler:run) in the hub account shows the error below for multiple accounts/regions where ASG is not used (no groups configured). ASG scheduling works if the ASG is tagged.

Error message

[ERROR] ClientError: An error occurred (ValidationError) when calling the DescribeAutoScalingGroups operation: The request isn't valid. The number of values specified for the tag:scheduler_period filter type exceeds the maximum of 5 tag:scheduler_period.
Traceback (most recent call last):
  File "/var/task/aws_lambda_powertools/logging/logger.py", line 450, in decorate
    return lambda_handler(event, context, *args, **kwargs)
  File "/var/task/instance_scheduler/handler/asg.py", line 68, in lambda_handler
    [num_tagged_auto_scaling_groups, num_schedules] = schedule_auto_scaling_groups(
  File "/var/task/instance_scheduler/handler/asg.py", line 135, in schedule_auto_scaling_groups
    for group in asg_service.get_schedulable_groups(schedule_names):
  File "/var/task/instance_scheduler/service/asg.py", line 263, in get_schedulable_groups
    for page in paginator.paginate(Filters=filters):
  File "/var/lang/lib/python3.11/site-packages/botocore/paginate.py", line 269, in __iter__
    response = self._make_request(current_kwargs)
  File "/var/lang/lib/python3.11/site-packages/botocore/paginate.py", line 357, in _make_request
    return self._method(**current_kwargs)
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 553, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 1009, in _make_api_call
    raise error_class(parsed_response, operation_name)

To Reproduce

fields @timestamp, @message, @logStream, @log
| filter (@message like "ClientError:")
| sort @timestamp desc
| limit 1000

Expected behavior

The number of values specified for the tag:scheduler_period filter should not be limited, so the handler should keep working no matter how many schedules are configured.
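
One way a caller could stay under the limit quoted in the error message is to split the schedule names into batches of at most 5 and issue one paginated DescribeAutoScalingGroups request per batch. The following is a minimal sketch of that idea, not the fix shipped by the maintainers. The names `chunked`, `find_tagged_groups`, and `MAX_TAG_FILTER_VALUES` are hypothetical, and the limit of 5 is taken from the error text above rather than from the solution's source code. The client argument is assumed to be a boto3 Auto Scaling client.

```python
from typing import Any, Iterable, Iterator, Sequence

# Limit quoted in the ValidationError above; an assumption, not read from the solution's code.
MAX_TAG_FILTER_VALUES = 5


def chunked(values: Sequence[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `values` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(values), size):
        yield list(values[start:start + size])


def find_tagged_groups(
    autoscaling_client: Any,
    tag_key: str,
    schedule_names: Iterable[str],
) -> Iterator[dict]:
    """Yield Auto Scaling groups whose `tag_key` tag matches any schedule name.

    Hypothetical helper: one paginated request is made per batch of names, so
    no single tag filter carries more than MAX_TAG_FILTER_VALUES values.
    """
    # De-duplicate so the batches are disjoint; a group has one value per tag
    # key, so it can match at most one batch and is never yielded twice.
    names = sorted(set(schedule_names))
    paginator = autoscaling_client.get_paginator("describe_auto_scaling_groups")
    for batch in chunked(names, MAX_TAG_FILTER_VALUES):
        filters = [{"Name": f"tag:{tag_key}", "Values": batch}]
        for page in paginator.paginate(Filters=filters):
            yield from page["AutoScalingGroups"]
```

With the 63 schedules mentioned in this report, such a loop would send 13 batches per account and region instead of one oversized request. The trade-off is more API calls in exchange for staying within the filter limit.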

Screenshots

None

Additional context

All accounts are managed by Control Tower; minor SCPs are active.

ASGHandler_call.txt
Cloudwatch_error.txt
CloudFormation was used to create all schedules: Iac_schedule_definition.template.txt

CrypticCabub commented 2 months ago

Thanks for reporting this!

This issue should only occur when creating more than 5 schedules within a short amount of time. I have added it to our backlog to be fixed in an upcoming patch release.

CrypticCabub commented 2 months ago

Fixed in v3.0.1