vantage-sh / ec2instances.info

Amazon EC2 instance comparison site
https://ec2instances.info
MIT License

T3, R5, R5D, Z1D, and other instances missing pricing data due to outdated pricing json files #367

Closed earthboundkid closed 5 years ago

earthboundkid commented 6 years ago

https://aws.amazon.com/blogs/aws/new-t3-instances-burstable-cost-effective-performance/

kcroaker commented 6 years ago

I believe that this just needs to be redeployed to scrape the latest data from the pages. I've checked the sources it scrapes and they've been updated to show the new T3 instance types.

cristim commented 6 years ago

@powdahound please rebuild to include these new instance types. I'd like to make use of this in my AutoSpotting project, which uses this data.

cristim commented 6 years ago

thanks @powdahound!

cristim commented 6 years ago

I had a look and we're missing pricing information for these.

jnerin commented 6 years ago

Hi, the vCPU info is incorrect for T3, according to https://aws.amazon.com/blogs/aws/new-t3-instances-burstable-cost-effective-performance/ and https://aws.amazon.com/ec2/instance-types/t3/:

Name        vCPUs
t3.nano     2
t3.micro    2
t3.small    2
t3.medium   2
t3.large    2
t3.xlarge   4
t3.2xlarge  8

Kind regards.

powdahound commented 6 years ago

The pricing data isn't in the (deprecated) JSON files we're looking in, and I'm not sure whether it is likely to be added. I don't have time to work on using an alternative source right now, so I'm happy to wait and see, but I'm also curious to hear others' thoughts.

loadedmind commented 6 years ago

Pricing page: https://aws.amazon.com/ec2/pricing/on-demand/

cristim commented 6 years ago

@loadedmind for pricing we don't use the data from those tables, but instead the JSON file mentioned by @powdahound (and a few other similar ones) which AWS often forgets to update when launching new instance types.

These stale files cause lots of data issues for new instance types, and sometimes break for legacy types too. We often report these to AWS and they eventually fix them, but ideally we should switch the code to use the current pricing API, which should improve the situation at least for the pricing part. Have a look at #139 and #321 for more context on this work.

Unfortunately this is a significant change and we don't have bandwidth to do it anytime soon, but hopefully someone else can step up and contribute it (hint, hint)

tootedom commented 6 years ago

Hola,

I asked Amazon (via our support) if they could update the legacy JSON file. Unfortunately, it's not good news. Maybe you've gotten this response before and somehow managed to get the file updated. If so, let me know and I can respond to the support case with the magic words:

cheers /dom

Here's the response from AMZ:


Thanks for your patience.

I understood your requirement and Use-case for the update of the deprecated link shared by you. I had thereby gone and shared your Use-Case with the Internal Pricing Team to check whether they could update the specific deprecated link with pricing for t3 as the very useful site: https://www.ec2instances.info/ uses this link.

However, they did discuss between the multiple teams concerned with the update of this pricing json and have unfortunately confirmed with me that it would not be possible to update it. This is also the reason why they changed the AWS marketing site to use the PriceList API. They have also mentioned that these jsons that were previously used to display prices on the marketing site are no longer maintained and were never intended for public use.

Unfortunately, the site owners are going to need to update to pull from the Public PriceList API.

I really regret the inconvenience caused to you. Please feel free to get back to me for any follow up queries.

Have a nice day!


cristim commented 6 years ago

@tootedom I wish Amazon would assign someone to help us maintain this website. They benefit a lot from it indirectly, while we're just doing it on the side and have nowhere near their resources.

But as I mentioned in #360, this is already in the works:

I'm working on replacing the price scraping logic with a small wrapper around https://github.com/lyft/awspricing, looking good so far.

I played with it a bit and I'll try to work on this more tomorrow, I hope I can integrate it within the next few weeks.

nvivo commented 5 years ago

@powdahound, @cristim Since 2015, the new pricing API has used different URLs than the ones used by this website: https://aws.amazon.com/blogs/aws/new-aws-price-list-api/

This (400+ MB) JSON file seems to contain prices for all instances in all regions: https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json

The API also has JSON files per region, and an index with offer versions that can be used to check for updates.

I know this implies some work to be done, but I believe the way to go is to use the new files instead of asking amazon to update the deprecated ones.
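
To make the index files mentioned above a bit more concrete, here is a minimal sketch of how a script might walk from the offer index to a single region's price file instead of downloading the full 400+ MB document. The host comes from the URL quoted above; the offer index path and the field names (`offers`, `currentRegionIndexUrl`, `regions`, `currentVersionUrl`, `publicationDate`) are my assumptions about the file layout and should be checked against a real download. All helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urljoin

# Host taken from the bulk-file URL quoted in this thread.
PRICING_HOST = "https://pricing.us-east-1.amazonaws.com"
# Assumed location of the top-level offer index.
OFFER_INDEX_PATH = "/offers/v1.0/aws/index.json"


def fetch_json(path):
    """Download and decode one JSON document from the pricing host."""
    with urllib.request.urlopen(urljoin(PRICING_HOST, path)) as response:
        return json.load(response)


def region_index_url(offer_index, offer_code="AmazonEC2"):
    """Return the absolute URL of the per-region index for one offer."""
    offer = offer_index["offers"][offer_code]
    return urljoin(PRICING_HOST, offer["currentRegionIndexUrl"])


def region_price_url(region_index, region):
    """Return the absolute URL of the price file for a single region."""
    entry = region_index["regions"][region]
    return urljoin(PRICING_HOST, entry["currentVersionUrl"])


def needs_refresh(index_doc, last_seen_publication_date):
    """True when the publication date differs from the one last processed."""
    return index_doc["publicationDate"] != last_seen_publication_date
```

A scraper could call `fetch_json(OFFER_INDEX_PATH)` once, compare `publicationDate` with the value stored from the previous run, and only download the per-region files when `needs_refresh` returns true.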

cristim commented 5 years ago

@nvivo, I am familiar with it. I actually have a small prototype that uses those APIs, and I'm planning to write a replacement for the current scraping script based on it.

I am just struggling to find enough time to work on this stuff.

If someone else is able to spend a few days on it, I can give some pointers.

blckct commented 5 years ago

@cristim Please share what you have thus far. If I understand the problem correctly, it would be relatively easy to bolt on something to provide the pricing data for the missing instances, but it would be quite hard to switch entirely to the new API (and not really possible for previous-generation instances).

cristim commented 5 years ago

@blckct I had played with the CLI and then with the https://github.com/lyft/awspricing library.

Then I asked a colleague to look over using the API directly, and he came up with a small Python program that can query prices which you can see below.

I am now in the process of implementing similar logic in Golang for my AutoSpotting project, which is what drove my interest in this in the first place, and I hope to eventually get some time to contribute the pricing change to ec2instances.info.

Here's what I have so far, let me know if you have any questions:

AWS CLI example: in a nutshell, this is the call we need to make, and then process the results. The Python code further down does this much more nicely and has some more cool stuff.

aws --region us-east-1 pricing get-products  --service-code AmazonEC2 --filters Type=TERM_MATCH,Field=instanceType,Value=c3.large Type=TERM_MATCH,Field=termType,Value=OnDemand 

lyft/awspricing: have a look at the example at https://github.com/lyft/awspricing#usage. On top of that, I just enabled caching by setting the environment variable it expects, since the library has caching support and does a lot of the heavy lifting for us, which I found nice.

Pricing API example I got from my colleague:

#!/usr/bin/env python3
"Queries EC2 product prices from the AWS pricing API using boto3."

import argparse
import json
import boto3

DEFAULTS = {
    "tenancy": "shared",
    "operatingSystem": "Linux",
    "preInstalledSw": "NA"
}

OUTPUT_FORMAT = ", ".join([
    "{it}", "{mem}", "{cpu}", "{sto}",
    "{loc}: {price} ({desc})"
])

def parse():
    "Parsing function."
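    # The pricing API is not served from every region; pin the client to
    # us-east-1, the same region as the CLI example.
    pricing_client = boto3.client('pricing', region_name='us-east-1')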
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    service_pager = pricing_client.get_paginator("describe_services")
    service_iterator = service_pager.paginate(ServiceCode='AmazonEC2')
    attributes = list(service_iterator)[0].get(
        'Services')[0].get('AttributeNames')
    for attribute in attributes:
        parser.add_argument(
            "--{}".format(attribute),
            help="Adds {} attribute to filters.".format(attribute),
            nargs="?", const="ask", default=DEFAULTS.get(attribute)
        )
    parser.add_argument(
        "-v", "--verbose", action="count", default=0,
        help="Dumps debug information.")
    return parser.parse_args()

def ask_value(attribute):
    "Asks for attribute values."
    # Pin the region: the pricing API is not served from every region.
    pricing_client = boto3.client('pricing', region_name='us-east-1')
    attribute_pager = pricing_client.get_paginator('get_attribute_values')
    attribute_iterator = attribute_pager.paginate(
        ServiceCode='AmazonEC2', AttributeName=attribute)
    print("Possible values for attribute {} are: ".format(attribute))
    for attribute_item in attribute_iterator:
        attribute_values = attribute_item.get('AttributeValues')
        for attribute_value in attribute_values:
            value = attribute_value.get('Value')
            print(value)
    value = input("Enter value for attribute {}: ".format(attribute))
    return value

def filters_from_args(args):
    "Builds TERM_MATCH filters from the parsed command-line arguments."
    product_filter = []
    for key, value in vars(args).items():
        if key == "verbose":
            continue
        if value:
            if value == "ask":
                value = ask_value(key)
            if not value:
                continue
            product_filter.append(
                {
                    'Type': 'TERM_MATCH',
                    'Field': key,
                    'Value': value
                }
            )
    return product_filter

def main():
    "The main function."
    args = parse()
    # Pin the region: the pricing API is not served from every region.
    pricing_client = boto3.client('pricing', region_name='us-east-1')
    product_pager = pricing_client.get_paginator('get_products')
    product_iterator = product_pager.paginate(
        ServiceCode='AmazonEC2', Filters=filters_from_args(args))

    for product_item in product_iterator:
        for offer_string in product_item.get('PriceList'):
            offer = json.loads(offer_string)
            product = offer.get('product')
            if args.verbose:
                print(json.dumps(product, indent=2))
            product_attributes = product.get('attributes')
            output = OUTPUT_FORMAT.format(
                it=product_attributes.get('instanceType'),
                mem=product_attributes.get('memory'),
                cpu=product_attributes.get('vcpu'),
                sto=product_attributes.get('storage'),
                loc=product_attributes.get('location'),
                price='{price}',
                desc='{desc}'
            )
            terms = offer.get('terms')
            if args.verbose:
                print(json.dumps(terms, indent=2))
            ondemand_terms = terms.get('OnDemand', {})
            for ondemand_term in ondemand_terms.keys():
                price_dimensions = ondemand_terms.get(
                    ondemand_term).get('priceDimensions')
                for price_dimension in price_dimensions.keys():
                    price = price_dimensions.get(
                        price_dimension).get('pricePerUnit').get('USD')
                    description = price_dimensions.get(
                        price_dimension).get('description')
                    print(output.format(price=price, desc=description))

if __name__ == "__main__":
    main()
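
The nested lookups at the end of main() can also be pulled out into a small pure function, which makes them easier to test without AWS credentials. This is only a sketch: `ondemand_prices` is a hypothetical helper name, and the sample entry uses made-up SKU codes and a made-up price. The field names are the ones the script above already reads.

```python
import json


def ondemand_prices(offer_string):
    """Yield (instance_type, location, usd_price, description) tuples
    for every OnDemand price dimension in one PriceList entry."""
    offer = json.loads(offer_string)
    attributes = offer.get("product", {}).get("attributes", {})
    ondemand_terms = offer.get("terms", {}).get("OnDemand", {})
    for term in ondemand_terms.values():
        for dimension in term.get("priceDimensions", {}).values():
            yield (
                attributes.get("instanceType"),
                attributes.get("location"),
                dimension.get("pricePerUnit", {}).get("USD"),
                dimension.get("description"),
            )


# Hypothetical entry: the term codes and the price are made up for illustration.
SAMPLE_OFFER = json.dumps({
    "product": {"attributes": {"instanceType": "c3.large",
                               "location": "US East (N. Virginia)"}},
    "terms": {"OnDemand": {"SKU.TERM": {"priceDimensions": {
        "SKU.TERM.DIM": {"description": "example description",
                         "pricePerUnit": {"USD": "0.1000000000"}}}}}},
})

if __name__ == "__main__":
    for row in ondemand_prices(SAMPLE_OFFER):
        print(row)
```

Each string in the `PriceList` returned by `get_products` could be passed to this function in place of the inline loops. Using `.get(..., {})` at every level means an entry without OnDemand terms simply yields nothing instead of raising.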

blckct commented 5 years ago

@cristim I tried https://github.com/lyft/awspricing, but it seems that, due to its collision removal, it's impossible to find t2.nano with Linux in us-east-1.