
OpenShift 4 on Exoscale

:warning: WIP: This is still a work in progress and will change!

This repository provides a Terraform module to provision the infrastructure for an OpenShift 4 cluster on Exoscale.

Please see the VSHN OCP4 on Exoscale install how-to for a step-by-step installation guide.

Overview

The Terraform module in this repository provisions all the infrastructure which is required to setup an OpenShift 4 cluster on Exoscale using UPI (User-provisioned infrastructure).

The module manages the VMs (including their Ignition or cloud-init config), DNS zone and records, security groups, and floating IPs for a highly-available OpenShift 4 cluster.

By default, the module will provision all the VMs with public IPs (the default on Exoscale), and restricts access to the cluster VMs using Exoscale's security group mechanism. Out of the box, all the cluster VMs (which use RedHat CoreOS) are reachable over SSH for debugging purposes using a SSH key which is provided during provisioning.
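
For instance, the public key could be passed to the module when it is instantiated. The variable name ssh_key below is an assumption made for illustration only; check the module's variables.tf for the actual input name.

# Sketch only: "ssh_key" is an assumed variable name, not confirmed by this README
module "cluster" {
  // Remaining config for module omitted

  ssh_key = file(pathexpand("~/.ssh/id_ed25519.pub"))
}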

The module expects that a suitable RHCOS VM template is available in the Exoscale organisation and region in which the cluster is getting deployed.
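
As an illustration, such a template could be looked up with the Exoscale provider's exoscale_compute_template data source; the zone and template name below are placeholders, and the module itself may instead take the template name or ID as an input variable.

# Sketch only: zone and template name are placeholders
data "exoscale_compute_template" "rhcos" {
  zone   = "ch-gva-2"
  name   = "rhcos-4.7"
  filter = "mine" # only consider templates registered in this organisation
}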

The module also provisions a pair of load balancer VMs. The module uses vshn-lbaas-exoscale to provision the LBs.

Module input variables

The module provides variables to configure, among other things, the cluster's base domain, ID and name, the number and size of the cluster VMs, additional worker groups, and the optional private network mode described below.

The cluster's domain is constructed from the provided base domain, cluster id and cluster name. If a cluster name is provided, the cluster domain is set to <cluster name>.<base domain>. Otherwise, the cluster domain is set to <cluster id>.<base domain>.
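
As a sketch, assuming the corresponding input variables are named base_domain, cluster_id and cluster_name (the actual names may differ):

# Sketch only: variable names are assumptions for illustration
module "cluster" {
  // Remaining config for module omitted

  base_domain  = "example.com"
  cluster_id   = "c-green-1234"
  cluster_name = "mycluster"

  // Resulting cluster domain: mycluster.example.com
  // Without cluster_name it would be: c-green-1234.example.com
}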

Configuring additional worker groups

Please note that you cannot use the names "master", "infra", "worker" or "storage" for additional worker groups. These names are prohibited to ensure that there are no collisions between the generated node names of different worker groups.

As the examples below show, the attributes disk_size, state and affinity_group_ids for entries in additional_worker_groups are optional. If these attributes are not given, the nodes are deployed with disk_size = var.root_disk_size, state = "Running" and affinity_group_ids = []. A sketch that sets these optional attributes explicitly follows the examples.

To configure an additional worker group named "cpu1" with 3 instances of type "CPU-huge", the following input can be given:

# File main.tf
module "cluster" {
  // Remaining config for module omitted

  additional_worker_groups = {
    "cpu1": {
      size: "CPU-huge"
      count: 3
    }
  }
}

To configure an additional worker group named "storage1" with 3 instances of type "Storage-huge" and 5120 GB of total disk size per instance (120 GB root disk + 5000 GB data disk), the following input can be given:

# File main.tf
module "cluster" {
  // Remaining config for module omitted

  additional_worker_groups = {
    "storage1": {
      size: "Storage-huge"
      count: 3
      data_disk_size: 5000
    }
  }
}
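
For completeness, the following sketch sets the optional attributes explicitly; the values are chosen for illustration only and the group name "batch1" is arbitrary:

# File main.tf
module "cluster" {
  // Remaining config for module omitted

  additional_worker_groups = {
    "batch1": {
      size: "CPU-huge"
      count: 2
      // Optional attributes; the defaults described above apply when omitted
      disk_size: 200
      state: "Stopped"
      affinity_group_ids: []
    }
  }
}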

Required credentials

VSHN service dependencies

Since the module manages a VSHN-specific Puppet configuration for the LB VMs, it needs access to some https://www.vshn.ch[VSHN] infrastructure.

Using the module outside VSHN

If you're interested in a version of the module which doesn't include VSHN-managed LBs, you can check out the standalone MVP LB configuration in commit 172e2a0.

:warning: Please note that we're not actively developing the MVP LB configuration at the moment.

Optional features

Private network

:warning: This mode is less polished than the default mode and we're currently not actively working on improving this mode.

Optionally, the OpenShift 4 cluster VMs can be provisioned solely in an Exoscale managed private network. To use this variation, set the module variable use_privnet to true. If required, you can change the CIDR of the private network by setting the variable privnet_cidr.
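
For example (the CIDR value is only an illustration):

# File main.tf
module "cluster" {
  // Remaining config for module omitted

  use_privnet  = true
  privnet_cidr = "172.18.200.0/24"
}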

When deploying the RHCOS VMs with a private network only, the VMs must first be provisioned in Stopped state, and then powered on in a subsequent apply step. Otherwise, the initial Ignition config run fails because the Ignition API is not reachable early enough in the boot process, as the network interface is also configured by Ignition in this scenario. This can be achieved by running the following sequence of terraform apply steps. The example assumes that the LBs and bootstrap node have been provisioned correctly already and that we're now provisioning the OCP4 master VMs.

for state in "Stopped" "Running" "Running"; do
  cat >override.tf <<EOF
module "cluster" {
  bootstrap_count = 1
  infra_count     = 0
  worker_count    = 0
  master_state    = "${state}"
}
EOF
  terraform apply
done

Note: the second terraform apply with state = "Running" may not be required in all cases, but it acts as a safeguard in case the creation of DNS records fails during the first terraform apply with state = "Running".