klahnakoski / SpotManager

Find cheapest spot instance prices, bid, use, and teardown when done
Mozilla Public License 2.0

SpotManager

The SpotManager is a stateless program meant to be run periodically. It finds the cheapest spot instance prices, bids, sets up the machines, and tears them down when done.

Assumptions

The module assumes your workload is long-running and has many save-points.

In my case, each machine is set up to pull small tasks off a queue and execute them. These machines can be shut down at any time; the most recent task is simply placed back on the queue for some other machine to run.

Overview

This library works on a concept of utility, which is an abstract value you assign to each EC2 instance type; the required utility is the primary input used to scale the number and type of instances.

For each instance type (and zone), the SpotManager uses the historical pricing record to figure out a competitive bid (defined by uptime, below). It combines that bid with the utility score for that instance type to get an estimated_value (measured in utility per dollar). The instance types with the best estimated_value are bid on first.
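The ranking described above can be sketched in a few lines of Python. This is an illustration only, not the SpotManager source: the `Candidate` class, the `rank_candidates` helper, the zone names, the bids, and every utility value except the c3.4xlarge figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One (instance type, zone) pair with its utility and competitive bid."""
    instance_type: str
    zone: str
    utility: float  # abstract utility you assign to the instance type
    bid: float      # competitive bid in dollars per hour (assumed unit)

    @property
    def estimated_value(self) -> float:
        # Utility per dollar: higher means more work for the same money.
        return self.utility / self.bid


def rank_candidates(candidates):
    """Return candidates ordered with the best estimated_value first."""
    return sorted(candidates, key=lambda c: c.estimated_value, reverse=True)


# Hypothetical numbers, chosen only to show the ordering.
candidates = [
    Candidate("c3.4xlarge", "us-west-2a", utility=15, bid=0.30),
    Candidate("m3.xlarge", "us-west-2b", utility=4, bid=0.05),
    Candidate("r3.large", "us-west-2c", utility=2, bid=0.08),
]
ranked = rank_candidates(candidates)
```

With these numbers the m3.xlarge is bid on first: its utility is lower, but it delivers more utility per dollar than the larger machine.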

Requirements

Installation

For now, you must clone the repo:

git clone https://github.com/klahnakoski/SpotManager.git

Branches

There are three main branches

Configuration

Each SpotManager instance requires a settings.json file that controls the SpotManager behaviour. We will use the ActiveData ETL settings file as an example to explain the parameters

More about utility

The utility list is a declaration of how much utility each instance type can provide, and additional configuration that the InstanceManager can use for setup().

More about uptime

In order to make a good bid, the historical pricing record for each instance type and region is used. All these settings have defaults designed for quick-setup tasks. If your setup takes longer, or the value of your machine increases the longer it stays up, you may want to change these values. Here are the settings we use for ElasticSearch nodes:

"uptime":{
    "history": "week",
    "duration": "day",
    "bid_percentile": 0.95
}

No matter your uptime settings, your bids will never go beyond your budget, and never go beyond max_utility_price.
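As a rough sketch of how a percentile bid with a ceiling might work, the Python below picks a price from a list of historical prices and caps it. It rests on two assumptions that this document does not confirm: that bid_percentile selects the price at that percentile of the history, and that max_utility_price is a limit in dollars per unit of utility. The function names are hypothetical, and the budget check is left out.

```python
import math


def percentile_price(prices, percentile):
    """Nearest-rank percentile: the smallest historical price that is
    greater than or equal to `percentile` of all samples."""
    if not prices:
        raise ValueError("no price history")
    ordered = sorted(prices)
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1]


def capped_bid(prices, bid_percentile, utility, max_utility_price):
    """Percentile bid, never above utility * max_utility_price (assumed meaning)."""
    raw_bid = percentile_price(prices, bid_percentile)
    ceiling = utility * max_utility_price
    return min(raw_bid, ceiling)


# Hypothetical hourly prices with one spike at the end.
history = [0.10, 0.11, 0.12, 0.10, 0.09, 0.13, 0.10, 0.11, 0.12, 0.50]

median_bid = percentile_price(history, 0.5)
spike_bid = percentile_price(history, 0.95)
final_bid = capped_bid(history, 0.95, utility=15, max_utility_price=0.02)
```

Here the 0.95 percentile lands on the price spike, so the cap (15 utility at 0.02 dollars per utility) is what limits the final bid.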

Configuring Volumes

Some workloads require large amounts of storage, but not all instances come with enough. The SpotManager will map the ephemeral and EBS volumes for you.

As an example, the c3.4xlarge comes with two ephemeral drives, which can be found at /dev/sdb and /dev/sdc. The configuration below mounts both and adds two new EBS volumes, which will be assigned device properties at runtime.

    {
        "instance_type": "c3.4xlarge",
        "utility": 15,
        "drives": [
            {"path":"/data1", "device":"/dev/sdb"},
            {"path":"/data2", "device":"/dev/sdc"},
            {"path":"/data3", "size":1000, "volume_type":"standard"},
            {"path":"/data4", "size":1000, "volume_type":"standard"}
        ]
    },
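To make the two kinds of entries in the drives list concrete, here is a small Python sketch that separates them. It assumes, based only on the example above, that an entry with a "device" key refers to an existing ephemeral drive and an entry with a "size" key requests a new EBS volume. The `split_drives` helper is hypothetical and not part of SpotManager.

```python
def split_drives(drives):
    """Split a drives list into (ephemeral, ebs) entries.

    Assumption: entries naming a "device" are ephemeral drives that already
    exist on the instance; entries with a "size" but no "device" are EBS
    volumes to be created.
    """
    ephemeral = [d for d in drives if "device" in d]
    ebs = [d for d in drives if "device" not in d and "size" in d]
    return ephemeral, ebs


# The drives list from the c3.4xlarge example above.
drives = [
    {"path": "/data1", "device": "/dev/sdb"},
    {"path": "/data2", "device": "/dev/sdc"},
    {"path": "/data3", "size": 1000, "volume_type": "standard"},
    {"path": "/data4", "size": 1000, "volume_type": "standard"},
]

ephemeral, ebs = split_drives(drives)
ephemeral_paths = [d["path"] for d in ephemeral]
total_ebs_size = sum(d["size"] for d in ebs)
```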

Some caveats:

Writing an InstanceManager

Conceptually, an instance manager is very simple, with only three methods you need to implement. This repo has an example ./examples/etl.py that you can review.
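A skeleton of what such a class might look like follows. Only setup() is named in this document; the method names required_utility() and teardown(), all the signatures, and the `QueueInstanceManager` class are assumptions made for illustration, so treat ./examples/etl.py as the authority on the real interface.

```python
class QueueInstanceManager:
    """Hypothetical instance manager for a queue-driven workload."""

    def __init__(self, pending_tasks, tasks_per_utility=100):
        self.pending_tasks = pending_tasks          # size of the work queue
        self.tasks_per_utility = tasks_per_utility  # assumed conversion rate
        self.active = set()                         # instances we have set up

    def required_utility(self):
        """How much utility the workload currently needs (assumed method name)."""
        # Ceiling division, so a partially filled batch still gets capacity.
        return -(-self.pending_tasks // self.tasks_per_utility)

    def setup(self, instance_id, utility):
        """Prepare a freshly started instance; a real manager would install
        and start the worker software here."""
        self.active.add(instance_id)

    def teardown(self, instance_id):
        """Release an instance that is going away (assumed method name)."""
        self.active.discard(instance_id)


manager = QueueInstanceManager(pending_tasks=250)
needed = manager.required_utility()
manager.setup("i-0123", utility=15)
was_active = "i-0123" in manager.active
manager.teardown("i-0123")
```

Because the SpotManager itself is stateless and run periodically, a sketch like this keeps no state that matters between runs; anything durable would live in the queue or on the instances.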

Benefits

The benefit of a bid_percentile price point is reasonable uptime at a low price. We do not want a price set too high: we desire Amazon-initiated termination so we get the last partial hour free. Also, some instance types have unpredictable and extreme price swings; SpotManager allows you to exploit those valleys with minimal price exposure.

The more instance types your workload can run on, the more advantage you have in finding minimal pricing. Anecdotally, there is always an opportunity to be found: some instance type is always going for significantly less than its utility would indicate.