fog / fog-aws

Module for the 'fog' gem to support Amazon Web Services http://aws.amazon.com/
MIT License
296 stars 353 forks source link

add missing aws regions #710

Open alex-inqwise opened 3 weeks ago

alex-inqwise commented 3 weeks ago

Problem

The AWS regions are hardcoded in the aws.rb file, causing exceptions for unlisted regions (e.g., il-central-1) with the error message “Unknown region:”.

Solution

Load the list of AWS regions dynamically via the AWS API/CLI and cache it locally to ensure the list is always up-to-date and includes any newly introduced regions.

geemus commented 3 weeks ago

Hey, thanks for the report.

Although dynamically fetching things would avoid it falling out of date, it also would potentially add a lot of network overhead. At least historically we have found that this list changes relatively infrequently, so we have accepted that there is sometimes some lag in updating it in order to avoid the overhead of needing to constantly make this call.

Does that make sense?

I'd certainly welcome additions of any missing regions to get our cache up to date.

alex-inqwise commented 2 weeks ago

Hi,

Thank you for the feedback and for providing context regarding the historical approach to handling AWS regions.

To address the concern about network overhead while ensuring our list of AWS regions is up-to-date, I propose the following approach that ties cache updates to package lifecycle events such as package updates or reinstalls. This minimizes network calls and ensures the list stays current when significant changes are made to the package.

Proposed Solution

1.Cache Initialization on Package Install/Update: 
When the package is installed or updated, initialize or refresh the cache with the latest list of AWS regions using the AWS API/CLI.

2.Deleting Cache for Immediate Updates: 
In scenarios where new regions need to be added urgently, simply delete the cached file. This forces a refresh the next time the cache is accessed.

Benefits

•   Minimized Network Overhead: By tying cache updates to package lifecycle events, we avoid frequent network calls.
•   Up-to-Date Information: The cache is refreshed during package updates or reinstalls, ensuring it remains current.
•   Simple Immediate Updates: For urgent needs, deleting the cache file forces a refresh on the next access, ensuring new regions are included.
•   Stability: Since AWS regions are not added frequently and no regions have been deleted historically, this invalidation approach is suitable for our actual purpose.

Anyway I can manually add any currently missing regions to the cache to ensure it is up-to-date.

Does this approach make sense?

geemus commented 2 weeks ago

Yeah, it's also a bit of a death-by-papercuts problem. When we did it this seemed easiest and good enough, and every individual time it's been easy enough, but in aggregate it is kind of a pain.

That is quite clever and overall I like the idea of it. I do have some pause though, as I feel like network calls at install time is unexpected and could potentially cause issues (ie this might delay deployments if gem install is part of the process). It also feels somewhat risky as if we had bugs at install time it would be very impactful.

Which isn't to say that it might not still be something to do, but I wonder if it might be better/safer to at least start a little more gradually. I think it would be safer (and less surprising) if we provided some manual way to update this cache. That way, if you want you certainly can run this at install time, but if you don't want to (or don't care) it would behave as it currently does. That would also give us the chance to try it out for real before potentially rolling it out more broadly.

Also, this might imply that we would still want to maintain a cache in the gem kind of like we do now, but it would provide a pressure release valve. So when new regions appear people could immediately get access to them and we could still add them, but it wouldn't have the same urgency.

What do you think?

alex-inqwise commented 2 weeks ago

Hi,

Thank you for the thoughtful feedback. I understand the concerns about network calls during install time and the potential impact on deployments. A gradual approach with manual update options seems like a prudent way to address this.

Proposed Revised Solution

  1. Maintain Current Cached Regions: • Continue using the current hardcoded list of AWS regions within the gem as the primary source.
  2. Manual Cache Update Option: • Provide a script or a command that users can run manually to update the cached list of AWS regions. This script can fetch the latest regions from the AWS API/CLI and update the cache file. • Document this manual update process clearly so users know how and when to use it.

Following TYPO:

Manual Cache Update Script:

require 'aws-sdk-ec2'
require 'yaml'

CACHE_FILE = 'aws_regions_cache.yml'

def update_cache
  ec2 = Aws::EC2::Client.new
  regions = ec2.describe_regions.regions.map(&:region_name)
  File.write(CACHE_FILE, regions.to_yaml)
  puts "Cache updated successfully with the latest regions."
end

update_cache if __FILE__ == $PROGRAM_NAME

Using Cached Regions in aws.rb:

require 'yaml'

CACHE_FILE = 'aws_regions_cache.yml'

def get_cached_regions
  if File.exist?(CACHE_FILE)
    YAML.load_file(CACHE_FILE)
  else
    # Fallback to hardcoded regions if cache file doesn't exist
    %w[us-east-1 us-west-1 us-west-2 eu-west-1]
  end
end

regions = get_cached_regions

if regions.include?(desired_region)
  # Proceed with the desired region
else
  raise "Unknown region: #{desired_region}"
end
geemus commented 2 weeks ago

That makes sense.

A couple of suggestions:

Does that make sense? Are you willing to work on a PR related to this?

alex-inqwise commented 2 weeks ago

Sure, here's the revised response:


Hi,

Revised Solution

  1. Maintain Current Cached Regions:

    • Continue using the hardcoded list of AWS regions within the gem.
  2. Manual Cache Update Option:

    • Provide a script or command to manually update the AWS regions cache using Fog.
    • Store the regions in a JSON file for consistency.
    • Document the manual update process clearly.
  3. Explicit Cache Usage Option:

    • Introduce a boolean flag to explicitly designate whether to use the cache, rather than checking for the file's presence.
  4. Command Within Fog for Cache Generation:

    • Add a command within Fog to generate the cache, with an option for scriptable execution.

This revised approach aligns with your suggestions and enhances the solution's robustness and usability. I should note that I am not familiar with Ruby syntax, haven't worked with the Fog library before to get AWS regions, and do not have a debug environment. However, I am willing to work on a PR related to this.

What do you think about these changes?

geemus commented 2 weeks ago

I think that sounds reasonable and in alignment with what we discussed.

It may be challenging to do this without more experience and setup though. The addition of the new region should be simpler and less risky, perhaps we could start there?

alex-inqwise commented 1 week ago

Sure, following the missing regions list:

il-central-1 (Israel Central), eu-north-1 (Stockholm), me-south-1 (Middle East - Bahrain), us-gov-east-1 (AWS GovCloud US East), ca-west-1 (Calgary).

geemus commented 1 week ago

I made a PR (#712) to add these and found only il-central-1 seemed to be missing from the ones you listed. Were you seeing the others being missing as well?

alex-inqwise commented 1 week ago

You are correct; currently, the missing region is il-central-1 (Israel Central). I do not have the latest version of the library.

geemus commented 1 week ago

Ok, thanks for confirming, just wanted to make sure I wasn't missing something.

geemus commented 1 week ago

released in v3.23.0