wellcomecollection / platform-infrastructure

:building_construction: Infrastructure for the Wellcome Digital Platform
MIT License

Lifecycle management for APM traces #408

Closed (jamieparkinson closed 9 months ago)

jamieparkinson commented 9 months ago

What's changing and why?

We sometimes see out-of-storage warnings for the logging cluster, and I usually fix them by going in and deleting some old APM trace indices by hand. This is a bad idea.

This PR makes the computer delete the indices for us. The exact configuration will probably need tweaking in future; I've been quite conservative to start with, so as not to throw away too much potentially useful data. It follows the approach documented here: https://www.elastic.co/guide/en/apm/guide/current/ilm-how-to.html

The component templates already exist (they're managed by Fleet), so I think I will have to import them into the Terraform state when I come to apply this.
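As a rough sketch of what that import might look like: this assumes Terraform 1.5 or later (for `import` blocks) and assumes the provider's import ID has the form `<cluster_uuid>/<template_name>`. The `<cluster_uuid>` placeholder is not from this PR and would need to be replaced with the logging cluster's real UUID.

```hcl
# Sketch only: adopt the existing Fleet-managed component templates into state
# instead of creating them. The ID format is an assumption; check it against
# the elasticstack provider docs before applying.
import {
  to = elasticstack_elasticsearch_component_template.apm_traces_managed_custom
  id = "<cluster_uuid>/traces-apm@custom"
}

import {
  to = elasticstack_elasticsearch_component_template.apm_traces_rum_managed_custom
  id = "<cluster_uuid>/traces-apm.rum@custom"
}
```

Once the imports have gone through, the plan would be expected to show these two resources as updated in place rather than created.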

terraform plan diff

  # elasticstack_elasticsearch_component_template.apm_traces_managed_custom will be created
  + resource "elasticstack_elasticsearch_component_template" "apm_traces_managed_custom" {
      + id   = (known after apply)
      + name = "traces-apm@custom"

      + template {
          + settings = jsonencode(
                {
                  + lifecycle = {
                      + name = "weco-traces-apm"
                    }
                }
            )
        }
    }

  # elasticstack_elasticsearch_component_template.apm_traces_rum_managed_custom will be created
  + resource "elasticstack_elasticsearch_component_template" "apm_traces_rum_managed_custom" {
      + id   = (known after apply)
      + name = "traces-apm.rum@custom"

      + template {
          + settings = jsonencode(
                {
                  + lifecycle = {
                      + name = "weco-traces-apm-rum"
                    }
                }
            )
        }
    }

  # elasticstack_elasticsearch_index_lifecycle.apm_traces will be created
  + resource "elasticstack_elasticsearch_index_lifecycle" "apm_traces" {
      + id            = (known after apply)
      + modified_date = (known after apply)
      + name          = "weco-traces-apm"

      + delete {
          + min_age = "10d"
        }

      + hot {
          + min_age = (known after apply)

          + rollover {
              + max_age  = "30d"
              + max_size = "50gb"
            }
        }
    }

  # elasticstack_elasticsearch_index_lifecycle.apm_traces_rum will be created
  + resource "elasticstack_elasticsearch_index_lifecycle" "apm_traces_rum" {
      + id            = (known after apply)
      + modified_date = (known after apply)
      + name          = "weco-traces-apm-rum"

      + delete {
          + min_age = "10d"
        }

      + hot {
          + min_age = (known after apply)

          + rollover {
              + max_age  = "30d"
              + max_size = "50gb"
            }
        }
    }
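
For reference, the plan above would correspond to configuration roughly like the following. This is a reconstruction from the plan output, not the PR's actual source, and it only shows the non-RUM pair; the RUM resources would follow the same pattern with the `weco-traces-apm-rum` and `traces-apm.rum@custom` names.

```hcl
# Sketch reconstructed from the plan output; the real source may differ.
resource "elasticstack_elasticsearch_index_lifecycle" "apm_traces" {
  name = "weco-traces-apm"

  hot {
    # Roll over to a new backing index after 30 days or 50 GB,
    # whichever comes first.
    rollover {
      max_age  = "30d"
      max_size = "50gb"
    }
  }

  # With a rollover action, the delete phase's min_age is counted from
  # rollover, so an index can persist for up to about 30d + 10d.
  # The plan shows no nested action block inside this phase, so none is
  # included here; worth checking against the provider docs whether one
  # is needed.
  delete {
    min_age = "10d"
  }
}

resource "elasticstack_elasticsearch_component_template" "apm_traces_managed_custom" {
  name = "traces-apm@custom"

  template {
    # Point the Fleet-managed data stream at the custom ILM policy.
    settings = jsonencode({
      lifecycle = {
        name = elasticstack_elasticsearch_index_lifecycle.apm_traces.name
      }
    })
  }
}
```

Referencing the policy through the resource attribute rather than repeating the string is a stylistic choice here: it lets Terraform order the policy before the template that names it.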