aws / aws-cli

Universal Command Line Interface for Amazon Web Services

Health uses invalid endpoints when region is set #8745

Open bryanhiestand opened 2 weeks ago

bryanhiestand commented 2 weeks ago

Describe the bug

When the region is set through either AWS_REGION or AWS_DEFAULT_REGION environment variables or with the --region argument passed to aws health, the CLI attempts to use an endpoint in that region. This fails because AWS Health only offers endpoints in us-east-1 and us-east-2.

Expected Behavior

health should only use valid endpoints

Current Behavior

Having the AWS_REGION environment variable set to anything other than us-east-1 or us-east-2 results in an error. Examples:

invalid region

AWS_REGION=nope aws health describe-events

Could not connect to the endpoint URL: "https://health.nope.amazonaws.com/"

valid region lacking Health endpoints

AWS_REGION=us-west-2 aws health describe-events

Could not connect to the endpoint URL: "https://health.us-west-2.amazonaws.com/"

Passing --region for anything other than us-east-1 or us-east-2 results in an error

aws health describe-events --region us-west-2

Could not connect to the endpoint URL: "https://health.us-west-2.amazonaws.com/"

Having AWS_DEFAULT_REGION set to anything other than us-east-1 or us-east-2 results in an error

AWS_DEFAULT_REGION=us-west-2 aws health describe-events

Could not connect to the endpoint URL: "https://health.us-west-2.amazonaws.com/"

Reproduction Steps

Install a current version of aws-cli, set the region to us-west-2 through the AWS_REGION or AWS_DEFAULT_REGION environment variable or the --region argument, and run aws health describe-events. For example:

AWS_DEFAULT_REGION=us-west-2 aws health describe-events

You should see

Could not connect to the endpoint URL: "https://health.us-west-2.amazonaws.com/"

We reproduced on

aws-cli/2.13.34 Python/3.11.6 Darwin/23.5.0 source/arm64 prompt/off
aws-cli/2.16.7 Python/3.11.9 Darwin/23.4.0 source/arm64
aws-cli/1.22.34 Python/3.10.12 Linux/6.2.0-1018-aws botocore/1.23.34

This behavior appears to have been introduced between 2.9.21 and 2.16.7, as the following version works as expected

aws-cli/2.9.21 Python/3.9.11 Darwin/22.6.0 exe/x86_64 prompt/off

Possible Solution

It looks like AWS Health only has multi-region support for active-passive failover. If so, would it make sense to only use global.health.amazonaws.com?
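
For reference, my understanding of the AWS Health documentation is that global.health.amazonaws.com is a CNAME for whichever regional endpoint is currently active, and that requests still need to be signed for that region. Below is a minimal sketch of deriving the region from a CNAME target. The function name health_region_from_cname is made up for illustration, and it assumes the target has the form health.<region>.amazonaws.com, optionally with a trailing dot.

```shell
# Hypothetical helper (not part of aws-cli): extract the region from a
# Health endpoint hostname.
# Assumption: the hostname looks like "health.<region>.amazonaws.com",
# optionally followed by a trailing dot as DNS tools print it.
health_region_from_cname() {
  _host="${1%.}"                 # drop one trailing dot, if present
  case "$_host" in
    health.?*.amazonaws.com)
      _host="${_host#health.}"   # strip the service prefix
      printf '%s\n' "${_host%.amazonaws.com}"
      ;;
    *)
      return 1                   # not a hostname shape this sketch understands
      ;;
  esac
}
```

If that assumption holds, something like `health_region_from_cname "$(dig +short global.health.amazonaws.com CNAME)"` would yield a region that could then be passed to `aws health ... --region`. This is a sketch of a workaround, not a statement of how the CLI resolves endpoints.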

Otherwise, this could be handled as part of improved region validation, or aws-cli could restrict health to the two valid regions.
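
To illustrate what such validation might look like from the user's side, here is a rough shell sketch. The names HEALTH_REGIONS, is_health_region, and check_health_region are hypothetical, and the region list is taken from this report, so it is an assumption that it is complete.

```shell
# Regions with Health endpoints according to this report.
# Assumption: the list is complete for the standard AWS partition.
HEALTH_REGIONS="us-east-1 us-east-2"

# Hypothetical check: succeed only if $1 is one of the listed regions.
is_health_region() {
  for _r in $HEALTH_REGIONS; do
    [ "$_r" = "$1" ] && return 0
  done
  return 1
}

# Hypothetical pre-flight check: fail early with a clearer message than
# "Could not connect to the endpoint URL".
check_health_region() {
  if is_health_region "$1"; then
    return 0
  fi
  echo "AWS Health has no endpoint in '$1'; use one of: $HEALTH_REGIONS" >&2
  return 1
}
```

A script could call `check_health_region "$AWS_REGION"` before invoking `aws health`, so that a bad region produces an explicit message rather than a connection error.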

Additional Information/Context

This may be a regression of https://github.com/aws/aws-cli/issues/4183 and may be related to https://github.com/aws/aws-cli/issues/2266

I am wondering if this bug might have been introduced when boto added regional support for health

CLI version used

2.16.7

Environment details (OS name and version, etc.)

Darwin/23.4.0 source/arm64 and Linux/6.2.0-1018-aws

Edit: fixed some typos/bad grammar, meaning unchanged

RyanFitzSimmonsAK commented 2 weeks ago

Hi @bryanhiestand, thanks for reaching out. This behavior makes sense; AWS Health only has two endpoints as you mentioned, and you should make requests to those endpoints. What behavior were you anticipating in the event of a request being sent to an endpoint that isn't supported?

bryanhiestand commented 2 weeks ago

Hi @RyanFitzSimmonsAK, thank you for taking a look!

I agree that it's a bit silly to expect aws health describe-events --region nope to work, and I would expect passing an invalid --endpoint-url to break things.

Since Health appears to be a global service, I would expect not to have to specify a region, similar to the way IAM works. In other words, I would expect aws health to keep working when AWS_REGION or AWS_DEFAULT_REGION is set, or after defining a non-us-east region in aws configure.

This isn't quite what I am trying to do, but perhaps this untested script will illustrate what I am getting at. If one wanted to see if any of their instances tagged environment=production had health events, they might try a script like this, running it with the AWS_REGION environment variable for each of their regions:

# List EC2 instances tagged with "environment=production"
INSTANCE_IDS=$(aws ec2 describe-instances --filters "Name=tag:environment,Values=production" --query 'Reservations[*].Instances[*].InstanceId' --output json | jq -r '.[][]')

# Get all upcoming AWS Health events
EVENTS=$(aws health describe-events --filter eventStatusCodes=upcoming,eventTypeCodes=AWS_EC2_INSTANCE_MAINTENANCE_SCHEDULED --query 'events[*].arn' --output json | jq -r '.[]')

for event in $EVENTS
do
  # Get details for each event (describe-event-details takes --event-arns, not --filter)
  EVENT_DETAILS=$(aws health describe-event-details --event-arns "$event" --query 'successfulSet[*].eventDescription.latestDescription' --output text)

  for id in $INSTANCE_IDS
  do
    # Check if the event details contain the instance ID
    if echo "$EVENT_DETAILS" | grep -q "$id"; then
      echo "Scheduled events for instance $id: $EVENT_DETAILS"
    fi
  done
done

However, the above will only work in us-east-1 or us-east-2. For any other region, we must override the region only when calling health. This seems like either a burdensome design or a bug:

export AWS_REGION=us-west-2
# List EC2 instances tagged with "environment=production"
INSTANCE_IDS=$(aws ec2 describe-instances --filters "Name=tag:environment,Values=production" --query 'Reservations[*].Instances[*].InstanceId' --output json | jq -r '.[][]')

# some more regional commands

# Health only has endpoints in us-east-1 and us-east-2, so the region must be overridden here
aws health describe-event-details --region us-east-1 ...
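
One way to avoid repeating the override is a small wrapper function. This is only a sketch of a workaround: aws_health and HEALTH_REGION are names I made up (the CLI does not read HEALTH_REGION), and it relies on --region on the command line taking precedence over the region environment variables.

```shell
# Hypothetical wrapper: run "aws health ..." against a fixed region while
# leaving AWS_REGION / AWS_DEFAULT_REGION untouched for every other command.
# HEALTH_REGION is an illustrative variable name, not something the CLI reads;
# it defaults to us-east-1, one of the two regions named in this report.
aws_health() {
  aws health "$@" --region "${HEALTH_REGION:-us-east-1}"
}
```

With this defined, `aws_health describe-events` can sit next to regional commands such as `aws ec2 describe-instances` in the same script without changing the exported region.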

With IAM, this would not be required:

export AWS_REGION='us-west-2'

# Get the role names attached to all IAM instance profiles
ALL_ROLES=$(aws iam list-instance-profiles --query 'InstanceProfiles[*].Roles[*].RoleName' --output json | jq -r '.[][]')

# Get roles used by EC2 instances in the specified region
USED_ROLES=$(aws ec2 describe-instances --query 'Reservations[*].Instances[*].IamInstanceProfile.Arn' --output json | jq -r '.[][]' | awk -F '/' '{print $2}')

I hope this helps clarify. Thanks again for looking.