Use datastreams to store metrics

So far we have used a single index to store k6 metrics. However, datastreams are preferable for a couple of reasons, one of them being the ability to define a retention period via an ILM policy. We should therefore move away from the single index and create a datastream instead. Storing data in a plain index will no longer be supported.

Datastream Details

Name: metrics-k6-default. The namespace (here: default) can be overridden via a new setting, K6_ELASTICSEARCH_DATASTREAM_NAMESPACE. The existing setting K6_ELASTICSEARCH_INDEX_NAME will be removed. It will not be possible to override the entire datastream name, because that would increase complexity significantly: we would need to make sure that the index template matches the chosen datastream name. Note: We might allow overriding the entire datastream name in the future if and only if the datastream is managed by the user.
ILM policy: We specify a default policy without a retention period but allow users to override it.
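
The naming rule above can be sketched as follows. This is only an illustration in Python, not the extension's actual code; the function name datastream_name and passing the environment as a mapping are assumptions made for the example.

```python
import os

# Fixed part of the name; it has to match the index template pattern "metrics-k6-*".
DATASTREAM_PREFIX = "metrics-k6"
DEFAULT_NAMESPACE = "default"


def datastream_name(env=None):
    """Derive the datastream name from the namespace setting.

    `env` is a mapping of environment variables (hypothetical parameter,
    defaults to os.environ). Only the namespace part is configurable.
    """
    env = os.environ if env is None else env
    namespace = env.get("K6_ELASTICSEARCH_DATASTREAM_NAMESPACE", "").strip()
    if not namespace:
        # Unset or blank: fall back to the default namespace.
        namespace = DEFAULT_NAMESPACE
    return f"{DATASTREAM_PREFIX}-{namespace}"
```

Because only the suffix varies, every possible name still matches the metrics-k6-* pattern of the managed index template.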

These calls will be issued internally:

PUT /_ilm/policy/metrics-k6
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "min_docs": 1
          },
          "set_priority": {
            "priority": 100
          },
          "readonly": {}
        }
      }
    },
    "_meta": {
      "description": "default policy for k6 metrics",
      "managed": true,
      "version": 1
    }
  }
}

PUT /_component_template/metrics-k6
{
  "template": {
    "settings": {
      "index": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "auto_expand_replicas": "0-1",
        "codec": "best_compression"
      }
    },
    "mappings": {
      "_meta": {
        "index-template-version": 1,
        "managed": true
      },
      "date_detection": false,
      "dynamic_templates": [
        {
          "strings": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ],
      "_source": {
        "enabled": true
      },
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "Value": {
          "type": "double"
        }
      }
    }
  },
  "version": 1
}

Note: Previously, the timestamp field was called Time. We can rename it in a reindex script.

PUT /_component_template/metrics-k6-ilm
{
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "metrics-k6"
        }
      }
    }
  },
  "_meta": {
    "index-template-version": 1,
    "managed": true
  },
  "version": 1
}

PUT /_index_template/metrics-k6
{
  "index_patterns": [
    "metrics-k6-*"
  ],
  "data_stream": {},
  "composed_of": [
    "metrics-k6",
    "metrics-k6-ilm",
    "metrics-k6-ilm@custom"
  ],
  "ignore_missing_component_templates": [
    "metrics-k6-ilm@custom"
  ],
  "priority": 100,
  "_meta": {
    "description": "index template for k6 metrics",
    "managed": true
  },
  "version": 1
}
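
As an illustration of how the optional metrics-k6-ilm@custom component template could be used to add a retention period, here is a hedged sketch in Python that only builds the request bodies. It relies on Elasticsearch merging component templates in the order listed in composed_of, so a later entry overrides the lifecycle name set by metrics-k6-ilm. The policy name metrics-k6-retention, the 30-day retention, and all function names are hypothetical and not part of this proposal.

```python
import json

# Hypothetical user-chosen values; neither is defined by the proposal.
CUSTOM_POLICY = "metrics-k6-retention"
RETENTION = "30d"


def custom_policy_body(retention=RETENTION):
    """Body for PUT /_ilm/policy/<name>: roll over, then delete after `retention`."""
    return {
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_primary_shard_size": "50gb"}}},
                "delete": {"min_age": retention, "actions": {"delete": {}}},
            }
        }
    }


def custom_component_template_body(policy_name=CUSTOM_POLICY):
    """Body for PUT /_component_template/metrics-k6-ilm@custom."""
    return {
        "template": {
            "settings": {"index": {"lifecycle": {"name": policy_name}}}
        }
    }


def console_request(method, path, body):
    """Render a request in the console style used in this issue."""
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


if __name__ == "__main__":
    print(console_request("PUT", f"/_ilm/policy/{CUSTOM_POLICY}", custom_policy_body()))
    print(console_request("PUT", "/_component_template/metrics-k6-ilm@custom",
                          custom_component_template_body()))
```

Since metrics-k6-ilm@custom is listed in ignore_missing_component_templates, users who are happy with the default policy do not need to create it.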

Behavior for existing installations

If the k6-metrics index exists, we can issue a warning that the index pattern has changed. This is best effort only and won't catch cases where users have overridden the index name.

Migration

We won't automatically migrate data but can provide a reindex and cleanup script that users can execute if required.
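
A reindex request for such a script might look roughly like the sketch below, which builds the body for POST /_reindex in Python. It assumes the legacy index uses the default name k6-metrics and that the target is metrics-k6-default; the function name reindex_body is hypothetical. The script renames Time to @timestamp as noted earlier.

```python
LEGACY_INDEX = "k6-metrics"              # default legacy index name (assumed not overridden)
TARGET_DATASTREAM = "metrics-k6-default"  # default datastream name from this proposal

# Painless script: move the old "Time" field to "@timestamp" if it is present.
RENAME_SCRIPT = (
    "if (ctx._source.containsKey('Time')) {"
    " ctx._source['@timestamp'] = ctx._source.remove('Time');"
    " }"
)


def reindex_body(source_index=LEGACY_INDEX, dest=TARGET_DATASTREAM):
    """Build the body for POST /_reindex from the legacy index into the datastream."""
    return {
        "source": {"index": source_index},
        # Data streams are append-only, so reindexing into one needs op_type "create".
        "dest": {"index": dest, "op_type": "create"},
        "script": {"lang": "painless", "source": RENAME_SCRIPT},
    }
```

Users who overrode K6_ELASTICSEARCH_INDEX_NAME would pass their own index name as source_index. The cleanup step would then be deleting the legacy index once the copied documents have been verified.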

Permissions

We might need to adapt the initial permission check, as the output extension needs to create a datastream and the associated ILM policy. Finally, we should make this process optional, as advanced users might want to create the datastream themselves and tighten the cluster permissions of the k6 user to allow only write access. This behavior will be controlled by the flag K6_ELASTICSEARCH_AUTOCREATE_DATASTREAM, which is true by default. If it is set to false, the output extension assumes that the datastream is already set up properly (without any further checks).
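
The effect of the flag can be sketched as follows. This is a Python illustration under assumptions, not the extension's implementation: the accepted spellings of true and false, the function names, and the representation of requests as (method, path) pairs are all made up for the example. The request order reflects that the index template can only be created once the component templates it composes exist.

```python
import os

# Assumed spellings; the real extension may accept a different set.
TRUE_VALUES = {"1", "t", "true"}
FALSE_VALUES = {"0", "f", "false"}

# Setup calls from this proposal, in dependency order.
SETUP_REQUESTS = [
    ("PUT", "/_ilm/policy/metrics-k6"),
    ("PUT", "/_component_template/metrics-k6"),
    ("PUT", "/_component_template/metrics-k6-ilm"),
    ("PUT", "/_index_template/metrics-k6"),
]


def autocreate_enabled(env=None):
    """Read K6_ELASTICSEARCH_AUTOCREATE_DATASTREAM; an unset value means true."""
    env = os.environ if env is None else env
    raw = env.get("K6_ELASTICSEARCH_AUTOCREATE_DATASTREAM")
    if raw is None or raw.strip() == "":
        return True
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"invalid boolean for K6_ELASTICSEARCH_AUTOCREATE_DATASTREAM: {raw!r}")


def planned_setup_requests(env=None):
    """Return the setup calls to issue, or nothing if the user manages the datastream."""
    return list(SETUP_REQUESTS) if autocreate_enabled(env) else []
```

With the flag set to false no setup calls are planned, so the k6 user would only need write access to the datastream.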

elastic / xk6-output-elasticsearch