mrchrisadams closed 2 years ago
Hey @arendjantetteroo can I sanity check my approach for this with you?
We have the notion of HostingProviders in our system, who have either IP ranges or ASNs allocated to them.
In our case, because we have public IP ranges given here, we'd check the list of Hosting Providers we have against the region key in this JSON blob from AWS.
So let's say we wanted to update the AWS Oregon region (and let's assume for this example that us-west-2 really is all Oregon).
If we had this info here:
{
"ip_prefix": "52.95.255.112/28",
"region": "us-west-2",
"service": "AMAZON"
},
We'd find the corresponding hosting provider for Amazon US West (which we do have), and add the IP range 52.95.255.112/28 to that hosting provider entity.
Every few days, we'd check against this list and add any new IP ranges if they have changed, and we'd run through this process for all the regions that are marked as green/sustainable according to Amazon's own info.
Sound about right?
It obviously would be nicer if they updated this stuff themselves, but given their size, this seems a way to at least keep our data as close as possible to what their own data is saying.
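The region lookup described above could be sketched roughly like this. It is only an illustration: the function name `prefixes_for_region` is made up here, and the sample is a hand-abridged blob in the shape of AWS's ip-ranges.json (a top-level `prefixes` list), not the real file.

```python
import json

# Hand-abridged sample in the shape of AWS's ip-ranges.json.
# The real file is far larger and carries more keys per entry.
SAMPLE = """
{
  "prefixes": [
    {"ip_prefix": "52.95.255.112/28", "region": "us-west-2", "service": "AMAZON"},
    {"ip_prefix": "18.208.0.0/13", "region": "us-east-1", "service": "AMAZON"},
    {"ip_prefix": "52.194.0.0/15", "region": "ap-northeast-1", "service": "AMAZON"}
  ]
}
"""


def prefixes_for_region(ip_ranges, region):
    """Return the sorted, de-duplicated IPv4 CIDR blocks listed for one region.

    Hypothetical helper: it only looks at the "prefixes" list and the
    "region" / "ip_prefix" keys shown in the snippet above.
    """
    return sorted(
        {
            entry["ip_prefix"]
            for entry in ip_ranges.get("prefixes", [])
            if entry["region"] == region
        }
    )


ip_ranges = json.loads(SAMPLE)
oregon_ranges = prefixes_for_region(ip_ranges, "us-west-2")
print(oregon_ranges)
```

The result would then be attached to whichever hosting provider entity represents that region.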
Yep, if we make one hosting provider entry per Amazon region we can look up all IP addresses, verify they match up with the file, and add or remove any entries that are missing or no longer active.
It does mean that people will see Amazon US West as the provider and not only Amazon, but I think that's a fine trade-off right now.
This is an acceptable trade-off, as Amazon publicly say different regions run on different kinds of power. Doing it this way would allow us to distinguish between them transparently.
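The add/remove check mentioned above boils down to two set differences. A minimal sketch, assuming the stored and published ranges are both available as plain CIDR strings; `diff_ranges` is a hypothetical name, and the 192.0.2.0/24 and 198.51.100.0/24 blocks are reserved documentation ranges used purely as placeholders:

```python
def diff_ranges(stored, published):
    """Compare what we hold for a provider with what the file publishes.

    Returns (to_add, to_remove): ranges present only in the published
    file, and ranges we still hold that the file no longer lists.
    """
    stored, published = set(stored), set(published)
    return published - stored, stored - published


# Placeholder data: one range from this thread plus documentation ranges.
stored = {"52.95.255.112/28", "192.0.2.0/24"}
published = {"52.95.255.112/28", "198.51.100.0/24"}

to_add, to_remove = diff_ranges(stored, published)
print("add:", sorted(to_add))
print("remove:", sorted(to_remove))
```

Comparing strings only works if both sides use the same CIDR notation; normalising through Python's `ipaddress.ip_network` first would be safer if the stored data was entered by hand.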
Hi guys I saw this project in the Climate Action Tech newsletter and I'd like to help here but I'm missing the piece around where Amazon shares information about which areas are green, and how that maps to the IP ranges. Can someone help out with this?
Hi @boxabirds ! In the opening issue I mention this bit here - where Amazon basically say which regions are sustainable (which is about as good as we can get from them):
Amazon have different green and non green regions (snip)
They list 5 regions - US-West (Oregon), Govcloud (US-West, again), Frankfurt, Ireland, and Canada.
I outline above where Amazon list IP ranges for each region, and I've shared the abridged snippet below:
{
"prefixes": [
{
"ip_prefix": "18.208.0.0/13",
"region": "us-east-1",
"service": "AMAZON"
},
{
"ip_prefix": "52.95.245.0/24",
"region": "us-east-1",
"service": "AMAZON"
},
{
"ip_prefix": "52.194.0.0/15",
"region": "ap-northeast-1",
"service": "AMAZON"
}]
}
Under the hood at the Green Web Foundation, we represent each hosting region as an entity with a known IP range. So, with the IP ranges above we can have these regions updating automatically.
If you can work out the mapping between the region names and the region codes, then you know enough to audit your own infrastructure against a list of regions where Amazon at least make public claims that they are sustainable.
You can see a thread here in more detail about why, but the TLDR version is at the bottom. https://twitter.com/mrchrisadams/status/1184854192428605441
You might wonder why Amazon push this migration cost onto you, where others like GCP and MS take on the cost themselves, by just running on clean power across the board.
That's one to bring up with your AWS rep.
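For anyone wanting to try that audit, here is a rough sketch. The name-to-code mapping is my own reading of the five regions listed earlier, so double-check it against AWS's region table; the function names are hypothetical, and the prefixes are the abridged ones from this thread rather than the full file.

```python
import ipaddress

# Assumed mapping from the region names above to AWS region codes.
GREEN_REGIONS = {
    "US-West (Oregon)": "us-west-2",
    "GovCloud (US-West)": "us-gov-west-1",
    "Frankfurt": "eu-central-1",
    "Ireland": "eu-west-1",
    "Canada": "ca-central-1",
}

# Abridged prefixes taken from the snippets in this thread.
PREFIXES = [
    {"ip_prefix": "52.95.255.112/28", "region": "us-west-2"},
    {"ip_prefix": "18.208.0.0/13", "region": "us-east-1"},
    {"ip_prefix": "52.194.0.0/15", "region": "ap-northeast-1"},
]


def region_for_ip(address, prefixes):
    """Return the region code of the first prefix containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for entry in prefixes:
        if ip in ipaddress.ip_network(entry["ip_prefix"]):
            return entry["region"]
    return None


def is_in_green_region(address, prefixes):
    """True only if the address falls in a prefix whose region is on the green list."""
    return region_for_ip(address, prefixes) in set(GREEN_REGIONS.values())


for host_ip in ["52.95.255.120", "18.210.1.1"]:
    print(host_ip, region_for_ip(host_ip, PREFIXES), is_in_green_region(host_ip, PREFIXES))
```

An address that matches no prefix is reported as not green, which is the cautious default for an audit.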
Hi @mrchrisadams & @boxabirds I'm keen on helping out here too, if there's still something to do. From what I understand, we need to:
ip_ranges.json that find all "green" IP CIDR blocks (we assume that a block is green if it belongs to one of the 5 regions that Amazon lists as green, and that we can just hardcode this list of 5 for now, as it seems unlikely to change frequently)
My questions would be:
HostingProviders?
api.thegreenwebfoundation.org queries?
Basically the database is MySQL, there is a fair amount of legacy code in PHP (the API is written in PHP), and we recently rewrote the admin part in Python. Both the API and the admin access the same underlying database.
You can find the admin code here https://github.com/thegreenwebfoundation/greenwebfoundation-admin
Both the api and admin are ok places to get this setup, so whatever your choice of programming language fits :)
Eventually the goal would be to have an API for the admin system that other tools could also use to update their information programmatically instead of by hand.
Depending on how much time and challenge you want, you could build an API on top of the admin system in Python and write a script that uses this API to update Amazon's IP ranges (we would need some kind of token to make sure the script can update the specific hosting provider). Or just write a simple script that updates the MySQL database directly with insert/update queries.
Feel free to post any questions you have here and we'll try to answer them.
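To make the "simple script" option a bit more concrete, here is a sketch of the insert/delete step. Everything about the schema is invented for illustration: the `ip_range` table, its `provider_id` and `cidr` columns, and the provider id are not the real Green Web Foundation schema, and the in-memory `sqlite3` database only stands in for MySQL so the example runs on its own.

```python
import sqlite3


def sync_provider_ranges(conn, provider_id, published):
    """Make the stored ranges for one provider match the published list.

    Hypothetical schema: ip_range(provider_id, cidr).
    Returns the sorted lists of ranges that were added and removed.
    """
    rows = conn.execute(
        "SELECT cidr FROM ip_range WHERE provider_id = ?", (provider_id,)
    ).fetchall()
    stored = {row[0] for row in rows}
    to_add = sorted(set(published) - stored)
    to_remove = sorted(stored - set(published))

    # One transaction, so a failure halfway leaves the table untouched.
    with conn:
        conn.executemany(
            "INSERT INTO ip_range (provider_id, cidr) VALUES (?, ?)",
            [(provider_id, cidr) for cidr in to_add],
        )
        conn.executemany(
            "DELETE FROM ip_range WHERE provider_id = ? AND cidr = ?",
            [(provider_id, cidr) for cidr in to_remove],
        )
    return to_add, to_remove


# Stand-in database with one stale placeholder range for provider 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ip_range (provider_id INTEGER, cidr TEXT)")
conn.execute("INSERT INTO ip_range VALUES (1, '192.0.2.0/24')")

added, removed = sync_provider_ranges(conn, 1, ["52.95.255.112/28"])
print("added:", added, "removed:", removed)
```

A MySQL driver would use a different placeholder style (typically `%s` rather than `?`), so the queries would need adjusting before pointing this at the real database.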
Thanks @jonathan-s @arendjantetteroo
I do have thoughts / questions, but I'd prefer to move the conversation into a chat (slack, gitter, whatever works for you). Not for the sake of realtime communication, but just because the questions will span wider than this specific github issue :)
Do you use any chat platform for this project?
@mrchrisadams I guess we can close this now with #64 done?
hey @arendjantetteroo, Yes - we totally should have closed this a while back.
I'm closing this as we have this working on a daily GitHub action now, and there's an example of the daily import running and being updated at the link below:
https://github.com/thegreenwebfoundation/admin-portal/runs/5531520377?check_suite_focus=true
At the moment, we rely on Amazon being nice enough to update their green regions themselves.
This rarely happens, but they do expose their IP ranges for each region at the URL below
https://ip-ranges.amazonaws.com/ip-ranges.json
More info here:
https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html
Amazon have different green and non green regions, so we might represent the green regions as separate green hosters, or as one huge host, with an absolutely massive set of IP ranges available.
https://aws.amazon.com/about-aws/sustainability/
https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html
AWS's own docs say these change a few times a week, so we'd likely need this running on a cronjob to stay accurate.
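A scheduled job along those lines might look roughly like the sketch below. It assumes the job remembers the `syncToken` value from the last file it imported (the published file carries a `syncToken` field alongside `prefixes`); the function names are hypothetical and where the token gets stored is left open.

```python
import json
import urllib.request

IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def fetch_ip_ranges(url=IP_RANGES_URL, timeout=30):
    """Download and parse the published ranges file."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def needs_import(ip_ranges, last_seen_token):
    """True if the file's syncToken differs from the one we last imported.

    A last_seen_token of None (nothing imported yet) always triggers an import.
    """
    return ip_ranges.get("syncToken") != last_seen_token


# Intended use inside the scheduled job (not executed here, since it
# needs network access):
#   data = fetch_ip_ranges()
#   if needs_import(data, last_seen_token):
#       ...update the green regions, then store data["syncToken"]...
```

Skipping the import when the token is unchanged keeps a frequent schedule cheap, since most runs would then do nothing beyond one download.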