spotinst / terraform-spotinst-ocean-aws-k8s

A Terraform module to create an Ocean Cluster.
Apache License 2.0

Spot Ocean k8s Terraform Module

Spotinst Terraform Module to integrate existing k8s with Ocean

Prerequisites

This module requires the Ocean controller to be installed in the cluster. You can install it with the spotinst/terraform-ocean-kubernetes-controller module. The kubernetes provider must be initialized before calling the kubernetes-controller module, as follows:

terraform {
  required_providers {
    spotinst = {
      source = "spotinst/spotinst"
    }
  }
}

provider "spotinst" {
  token   = "redacted"
  account = "redacted"
}

module "ocean-aws-k8s" {
  source  = "spotinst/ocean-aws-k8s/spotinst"
  ...
}

# Data sources for the kubernetes provider
data "aws_eks_cluster" "cluster" {
  name = "cluster name"
}

data "aws_eks_cluster_auth" "cluster" {
  name = "cluster name"
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}
##################

module "kubernetes-controller" {
  source = "spotinst/kubernetes-controller/ocean"

  # Credentials.
  spotinst_token   = "redacted"
  spotinst_account = "redacted"

  # Configuration.
  tolerations        = []
  cluster_identifier = "cluster name"
}

~> You must configure the same cluster_identifier for both the Ocean controller and the spotinst_ocean_aws resource. The ocean-aws-k8s module uses the cluster name as the identifier, so make sure the controller configuration uses the same value.
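
One way to keep the two identifiers in sync is to define the cluster name once and pass it to both modules. The following is a minimal sketch, not the module authors' prescribed layout: the local value `cluster_name` is a hypothetical name, the required inputs carry placeholder values, and the provider configuration shown earlier is assumed to be present.

```hcl
locals {
  # Hypothetical local value: single source of truth for the cluster name.
  cluster_name = "Sample-EKS"
}

module "ocean-aws-k8s" {
  source = "spotinst/ocean-aws-k8s/spotinst"

  # The module uses the cluster name as the controller identifier
  # unless controller_id is set.
  cluster_name = local.cluster_name

  # Placeholder values for the remaining required inputs.
  region                      = "us-west-2"
  subnet_ids                  = ["subnet-12345678"]
  security_groups             = ["sg-123456789"]
  worker_instance_profile_arn = "arn:aws:iam::123456789012:instance-profile/example"
}

module "kubernetes-controller" {
  source = "spotinst/kubernetes-controller/ocean"

  spotinst_token   = "redacted"
  spotinst_account = "redacted"

  # Same value as cluster_name above, so both sides agree.
  cluster_identifier = local.cluster_name
}
```

Referencing one local from both blocks means a rename cannot leave the controller and the Ocean cluster pointing at different identifiers.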

Usage

module "ocean-aws-k8s" {
  source = "spotinst/ocean-aws-k8s/spotinst"

  # Configuration
  cluster_name = "Sample-EKS"
  region       = "us-west-2"
  subnet_ids   = ["subnet-12345678", "subnet-12345678"]
  # Fetch the instance profile from the existing EKS managed node group IAM role
  worker_instance_profile_arn = tolist(data.aws_iam_instance_profiles.profile.arns)[0]
  security_groups             = ["sg-123456789", "sg-123456789"]
  should_tag_volumes          = true

  health_check_unhealthy_duration_before_replacement = 60

  # Shutdown hours block
  shutdown_hours = {
    is_enabled   = false
    time_windows = ["Sat:08:00-Sun:08:00"]
  }

  # Scheduling tasks parameters block (amiAutoUpdate and clusterRoll)
  tasks = [
    {
      is_enabled      = false
      cron_expression = "0 9 * * *"
      task_type       = "amiAutoUpdate"
      ami_auto_update = [{
        apply_roll    = false
        minor_version = false
        patch         = true
        ami_auto_update_cluster_roll = [{
          batch_min_healthy_percentage = 50
          batch_size_percentage        = 20
          comment                      = "Comments for AmiAutoUpdate Cluster Roll"
          respect_pdb                  = true
        }]
      }]
    },
    {
      is_enabled      = false
      cron_expression = "0 5 * * *"
      task_type       = "clusterRoll"
      parameters_cluster_roll = [{
        batch_min_healthy_percentage = 50
        batch_size_percentage        = 20
        comment                      = "Comments for Parameters Cluster Roll"
        respect_pdb                  = false
      }]
    }
  ]

  # Overwrite the Name tag and add additional tags
  tags = { Name = "Ocean-Nodes", CreatedBy = "Terraform" }

  # Block device mappings
  block_device_mappings = [
    {
      device_name           = "/dev/xvda"
      delete_on_termination = false
      encrypted             = true
      kms_key_id            = "alias/aws/ebs"
      snapshot_id           = null
      volume_type           = "gp3"
      volume_size           = null
      throughput            = 125
      dynamic_volume_size = [{
        base_size              = 30
        resource               = "CPU"
        size_per_resource_unit = 25
      }]
      dynamic_iops = [{
        base_size              = 20
        resource               = "CPU"
        size_per_resource_unit = 12
      }]
    },
    {
      # Second volume: device names must be unique across mappings
      device_name = "/dev/xvdb"
      encrypted   = true
      iops        = 100
      volume_type = "gp3"
      dynamic_volume_size = [{
        base_size              = 50
        resource               = "CPU"
        size_per_resource_unit = 20
      }]
    }
  ]
}

data "aws_iam_instance_profiles" "profile" {
  depends_on = [module.eks]
  role_name  = module.eks.eks_managed_node_groups["one"].iam_role_name
}
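
The example above sets health_check_unhealthy_duration_before_replacement to 60. The Inputs section states that this value must be at least 60 and a multiple of 60. If the value comes from a variable in your own configuration, a validation block can catch bad values at plan time. This is a sketch only: the variable name `unhealthy_duration_seconds` is hypothetical and not part of the module.

```hcl
# Hypothetical wrapper variable; the name is illustrative, not part of the module.
variable "unhealthy_duration_seconds" {
  type        = number
  default     = 120
  description = "Seconds an unhealthy instance stays active before replacement."

  validation {
    # Enforce the documented constraints: minimum 60, multiple of 60.
    condition = (
      var.unhealthy_duration_seconds >= 60 &&
      var.unhealthy_duration_seconds % 60 == 0
    )
    error_message = "The value must be at least 60 and a multiple of 60."
  }
}
```

Inside the module block, the input would then be set as `health_check_unhealthy_duration_before_replacement = var.unhealthy_duration_seconds`.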

Modules

Documentation

If you're new to Spot and want to get started, please check out our Getting Started guide, available on the Spot Documentation website.

Getting Help

We use GitHub issues for tracking bugs and feature requests. Please use these community resources for getting help:

Community

Contributing

Please see the contribution guidelines.

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.6.3 |
| aws | >= 3.70 |
| spotinst | >= 1.139 |

Providers

| Name | Version |
|------|---------|
| aws | >= 3.70 |
| spotinst | >= 1.139 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| spotinst_ocean_aws.ocean | resource |
| aws_ami.eks_worker | data source |
| aws_default_tags.default_tags | data source |
| aws_eks_cluster.cluster | data source |
| aws_eks_cluster_auth.cluster | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| ami_id | The image ID for the EKS worker nodes. If none is provided, Terraform will search for the latest version of the EKS-optimized worker AMI based on platform. | string | null | no |
| associate_ipv6_address | (Optional, Default: false) Configure IPv6 address allocation. | bool | false | no |
| associate_public_ip_address | (Optional, Default: false) Configure public IP address allocation. | bool | false | no |
| auto_apply_tags | Default: false. Will update instance tags on the fly without rolling the cluster. | bool | null | no |
| auto_headroom_percentage | Set the auto headroom percentage (a number in the range [0, 200]) which controls the percentage of headroom from the cluster. | number | 5 | no |
| autoscale_cooldown | Cooldown period between scaling actions. | number | null | no |
| autoscale_is_auto_config | Automatically configure and optimize headroom resources. | bool | true | no |
| autoscale_is_enabled | Enable the Ocean Kubernetes Auto Scaler. | bool | true | no |
| availability_vs_cost | (Optional, Default: balanced) Controls the approach that Ocean takes while launching nodes. Possible values: costOriented, balanced, cheapest. | string | "balanced" | no |
| batch_min_healthy_percentage | Default: 50. The minimum percentage of healthy instances in a single batch. If the share of healthy instances in a batch falls below this threshold, the cluster roll fails. Valid range: 1-100. If null, the backend default of 50% is used. | number | null | no |
| batch_size_percentage | Sets the percentage of the instances to deploy in each batch. | number | 20 | no |
| blacklist | List of instance types not allowed in the Ocean cluster (whitelist and blacklist are mutually exclusive). | list(string) | null | no |
| cluster_name | Cluster name. | string | n/a | yes |
| conditioned_roll | Default: false. Spot will perform a cluster roll in accordance with a relevant modification of the cluster's settings. When set to true, only specific changes in the cluster's configuration will trigger a cluster roll (such as AMI, key pair, user data, instance types, load balancers, etc.). | bool | null | no |
| conditioned_roll_params | (Optional) A custom list of attributes that will trigger the cluster roll operation (overrides the predefined list of parameters). Valid only when conditioned_roll is set to true. | list(string) | null | no |
| controller_id | Unique identifier for the Ocean controller. If not specified, the cluster name will be used. | string | null | no |
| cpu_per_unit | Optionally configure the number of CPUs to allocate to the headroom. CPUs are denoted in millicores, where 1000 millicores = 1 vCPU. | number | null | no |
| data_integration_id | The identifier of the S3 data integration to export the logs to. | string | null | no |
| desired_capacity | The number of worker nodes to launch and maintain in the Ocean cluster. | number | 1 | no |
| draining_timeout | Draining timeout before terminating a node. | number | 120 | no |
| ebs_optimized | Enable EBS optimization for the cluster instances. | bool | false | no |
| enable_automatic_and_manual_headroom | Default: false. Enables automatic and manual headroom to work in parallel. When set to false, automatic headroom overrides all other manually configured headroom definitions, whether they are at cluster or VNG level. | bool | null | no |
| extended_resource_definitions | List of Ocean extended resource definitions to use in this cluster. | list(string) | null | no |
| fallback_to_ondemand | Launch On-Demand instances when no EC2 Spot instances are available. | bool | true | no |
| filters | List of filters. The instance types that match all filters compose Ocean's whitelist parameter. Cannot be configured together with whitelist/blacklist. | object({ architectures = list(string), categories = list(string), disk_types = list(string), exclude_families = list(string), exclude_metal = bool, hypervisor = list(string), include_families = list(string), is_ena_supported = bool, max_gpu = number, min_gpu = number, max_memory_gib = number, max_network_performance = number, max_vcpu = number, min_enis = number, min_memory_gib = number, min_network_performance = number, min_vcpu = number, root_device_types = list(string), virtualization_types = list(string) }) | null | no |
| gpu_per_unit | Optionally configure the number of GPUs to allocate to the headroom. | number | null | no |
| grace_period | The amount of time, in seconds, after the instance has launched to start checking its health. | number | 600 | no |
| http_put_response_hop_limit | An integer from 1 through 64. The desired HTTP PUT response hop limit for instance metadata requests. The larger the number, the further the instance metadata requests can travel. | number | 1 | no |
| http_tokens | Determines whether a signed token is required. Valid values: optional or required. | string | "optional" | no |
| key_name | The key pair to attach to the instances. | string | null | no |
| launch_spec_ids | List of virtual node group identifiers to be rolled. | list(string) | null | no |
| load_balancer | load_balancer object | list(object({ arn = string, name = string, type = string })) | null | no |
| max_memory_gib | The maximum memory in GiB units that can be allocated to the cluster. | number | 100000 | no |
| max_scale_down_percentage | The maximum percentage to scale down. Number between 1 and 100. | number | 10 | no |
| max_size | The upper limit of worker nodes the Ocean cluster can scale up to. | number | 1000 | no |
| max_vcpu | The maximum CPU in vCPU units that can be allocated to the cluster. | number | 20000 | no |
| memory_per_unit | Optionally configure the amount of memory (MB) to allocate to the headroom. | number | null | no |
| min_size | The lower limit of worker nodes the Ocean cluster can scale down to. | number | 1 | no |
| monitoring | Enable detailed monitoring for the cluster. The flag enables CloudWatch detailed monitoring (one-minute increments). Note: there are additional hourly costs for this service based on the region used. | bool | false | no |
| num_of_unit | The number of units to retain as headroom, where each unit has the defined headroom CPU and memory. | number | null | no |
| region | The region the cluster is located in. | string | n/a | yes |
| respect_pdb | Default: false. If set to true, PDBs are honored during instance replacement in a roll. | bool | null | no |
| root_volume_size | The size (in GB) to allocate for the root volume. Minimum 20. | number | null | no |
| security_groups | One or more security group IDs. | list(string) | n/a | yes |
| should_roll | Whether the cluster should be rolled for configuration updates. | string | false | no |
| shutdown_hours | shutdown_hours object | object({ is_enabled = bool, time_windows = list(string) }) | null | no |
| spot_percentage | The percentage of the cluster that should run on Spot vs. On-Demand. 100 means 100% of the cluster will run on Spot instances. | number | null | no |
| spread_nodes_by | (Optional, Default: count) Ocean will spread the nodes across markets by this value. Possible values: vcpu or count. | string | "count" | no |
| subnet_ids | List of subnet IDs. | list(string) | n/a | yes |
| tags | Additional tags to be added to resources. | map(string) | null | no |
| tasks | task object | list(object({ is_enabled = bool, cron_expression = string, task_type = string, ami_auto_update = optional(set(object({ apply_roll = bool, minor_version = bool, patch = bool, ami_auto_update_cluster_roll = optional(set(object({ batch_min_healthy_percentage = number, batch_size_percentage = number, comment = string, respect_pdb = bool })), []) })), []), parameters_cluster_roll = optional(set(object({ batch_min_healthy_percentage = number, batch_size_percentage = number, comment = string, respect_pdb = bool })), []) })) | null | no |
| use_as_template_only | The launch specification defined on the Ocean object will function only as a template for virtual node groups. | bool | false | no |
| user_data | n/a | string | null | no |
| utilize_commitments | If savings plans commitments have available capacity, Ocean will utilize them alongside RIs (if any exist) to maximize cost efficiency. | bool | false | no |
| utilize_reserved_instances | If there are any vacant Reserved Instances, launch On-Demand to consume them. | bool | true | no |
| whitelist | List of instance types allowed in the Ocean cluster (whitelist and blacklist are mutually exclusive). | list(string) | null | no |
| worker_instance_profile_arn | The instance profile IAM role. | string | n/a | yes |
| block_device_mappings | block_device_mapping object | list(object({ device_name = string, delete_on_termination = bool, encrypted = bool, kms_key_id = string, snapshot_id = string, volume_type = string, iops = number, volume_size = number, throughput = number, dynamic_iops = optional(set(object({ base_size = number, resource = string, size_per_resource_unit = number })), []), dynamic_volume_size = optional(set(object({ base_size = number, resource = string, size_per_resource_unit = number })), []) })) | [] | no |
| should_tag_volumes | (Optional, Default: false) Specify whether volume resources will be tagged with Virtual Node Group tags or Ocean tags. | bool | false | no |
| is_aggressive_scale_down_enabled | (Optional, Default: false) When set to true, enables the Aggressive Scale Down feature. This allows nodes to be promptly scaled down by the Ocean Autoscaler as soon as they become eligible, without any waiting period. | bool | false | no |
| health_check_unhealthy_duration_before_replacement | The amount of time, in seconds, an existing instance should remain active after becoming unhealthy. After the set timeout, the instance will be replaced. The minimum value allowed is 60, and it must be a multiple of 60. | number | 120 | no |
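
Several of the inputs above interact: filters cannot be combined with whitelist or blacklist, and the per-unit headroom inputs describe manual headroom. The sketch below shows one possible combination under stated assumptions. The filter values are illustrative rather than recommended, and every attribute of the filters object is set (with null for unused ones) on the assumption that the object type listed above does not mark them optional. It also assumes that manual headroom units take effect when autoscale_is_auto_config is false.

```hcl
module "ocean-aws-k8s" {
  source = "spotinst/ocean-aws-k8s/spotinst"

  # Required inputs (placeholder values).
  cluster_name                = "Sample-EKS"
  region                      = "us-west-2"
  subnet_ids                  = ["subnet-12345678"]
  security_groups             = ["sg-123456789"]
  worker_instance_profile_arn = "arn:aws:iam::123456789012:instance-profile/example"

  # Instance type filters; do not set whitelist or blacklist alongside these.
  filters = {
    architectures           = ["x86_64"]
    categories              = null
    disk_types              = null
    exclude_families        = null
    exclude_metal           = true
    hypervisor              = null
    include_families        = null
    is_ena_supported        = null
    max_gpu                 = null
    min_gpu                 = null
    max_memory_gib          = null
    max_network_performance = null
    max_vcpu                = 16
    min_enis                = null
    min_memory_gib          = null
    min_network_performance = null
    min_vcpu                = 2
    root_device_types       = null
    virtualization_types    = null
  }

  # Manual headroom: 2 units, each 1 vCPU (1000 millicores) and 2048 MB.
  autoscale_is_auto_config = false
  cpu_per_unit             = 1000
  memory_per_unit          = 2048
  num_of_unit              = 2
}
```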

Outputs

| Name | Description |
|------|-------------|
| ocean_controller_id | The Ocean controller ID |
| ocean_id | The Ocean cluster ID |
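
To expose these values from a root configuration, the module outputs can be re-exported. A minimal sketch, assuming the module block is labeled "ocean-aws-k8s" as in the Usage section:

```hcl
# Re-export the module outputs from the root module.
output "ocean_id" {
  description = "The Ocean cluster ID"
  value       = module.ocean-aws-k8s.ocean_id
}

output "ocean_controller_id" {
  description = "The Ocean controller ID"
  value       = module.ocean-aws-k8s.ocean_controller_id
}
```

After `terraform apply`, running `terraform output ocean_id` prints the cluster ID.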