grafana / xk6-output-prometheus-remote

k6 extension to output real-time test metrics using Prometheus Remote Write.
GNU Affero General Public License v3.0

K6 Remote write took 5.1603032s while flush period is 1s. Some samples may be dropped. #36

Closed perrinj3 closed 2 years ago

perrinj3 commented 2 years ago

Brief summary

When using the K6 Prometheus Remote Write Extension, samples are being dropped under loads of 700 TPS. No custom tags are being used, and Prometheus doesn't appear to be under CPU or memory stress. The same problem occurs if we remote write to Mimir.

k6 version

k6 = v0.38.0 extension = v0.0.2

OS

Windows

Docker version and image (if applicable)

No response

Steps to reproduce the problem

Simple K6 test using the K6 prometheus remote write extension.

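import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  // Placeholder URL from the report; substitute a real endpoint.
  http.get('http://simple apache endpoint');
  // sleep() takes seconds, so this pauses 0.1 ms between iterations.
  sleep(0.0001);
}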
Test options (JSON):
{
  "stages": [
    {
      "duration": "10s",
      "target": 5
    },
    {
      "duration": "600s",
      "target": 5
    },
    {
      "duration": "10s",
      "target": 1
    }
  ],
  "noConnectionReuse": true,
  "userAgent": "MyK6UserAgentString/1.0"
}

Expected behaviour

Samples should be written out within the 1s flush period at the tested TPS. We would like to run error-free at 1200 TPS.

Actual behaviour

WARN[0432] Remote write took 5.1603032s while flush period is 1s. Some samples may be dropped. nts=150005
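
For a sense of scale, here is a rough back-of-envelope sketch of how many raw samples k6 might generate per flush period at the rates mentioned above. It assumes that "TPS" means HTTP requests per second (the script makes one request per iteration) and that each request emits one sample for each of k6's built-in HTTP metrics. The helper name `estimateSamplesPerFlush` is hypothetical, and the estimate ignores per-iteration metrics and anything the extension does internally.

```javascript
// k6's built-in HTTP metrics; the assumption is that each request emits one sample per metric.
const HTTP_METRICS_PER_REQUEST = [
  'http_reqs',
  'http_req_duration',
  'http_req_blocked',
  'http_req_connecting',
  'http_req_tls_handshaking',
  'http_req_sending',
  'http_req_waiting',
  'http_req_receiving',
  'http_req_failed',
];

// Hypothetical helper: estimate the raw samples produced during one flush period.
function estimateSamplesPerFlush(requestsPerSecond, flushPeriodSeconds) {
  return requestsPerSecond * flushPeriodSeconds * HTTP_METRICS_PER_REQUEST.length;
}

console.log(estimateSamplesPerFlush(700, 1));  // 6300
console.log(estimateSamplesPerFlush(1200, 1)); // 10800
```

Under these assumptions, each 1s flush has to convert and send several thousand samples, each carrying its own tag set, before the next flush is due.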

na-- commented 2 years ago

This issue was originally reported in the main k6 repo, but it seems to be for https://github.com/grafana/xk6-output-prometheus-remote, so I'll move it there.

codebien commented 2 years ago

Hi @perrinj3, we are already tracking this issue in https://github.com/grafana/xk6-output-prometheus-remote/issues/10.

Once we merge https://github.com/grafana/xk6-output-prometheus-remote/pull/38, it should help reduce this load.

Unfortunately, with the current implementation, skipping tags as suggested here may be the only option for reducing the amount of data to deliver.
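
One possible reading of "skipping tags" is k6's `systemTags` option, which limits the system tags attached to each sample. The sketch below adds it to the options from the report. The helper `withSystemTags` and the choice of tags to keep are illustrative assumptions, and the thread does not confirm whether this is enough to avoid the warning.

```javascript
// Options from the report, written as a plain object.
const baseOptions = {
  stages: [
    { duration: '10s', target: 5 },
    { duration: '600s', target: 5 },
    { duration: '10s', target: 1 },
  ],
  noConnectionReuse: true,
  userAgent: 'MyK6UserAgentString/1.0',
};

// Hypothetical helper: return a copy of the options that keeps only the listed system tags.
function withSystemTags(options, tagsToKeep) {
  return { ...options, systemTags: [...tagsToKeep] };
}

// Assumed subset; which tags are safe to drop depends on the dashboards and queries in use.
const options = withSystemTags(baseOptions, ['status', 'method', 'name']);

console.log(JSON.stringify(options.systemTags));
```

In a k6 script, the resulting object would be exported as `options`; the same `systemTags` key can also be set in a JSON options file like the one in the report.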

We are actively working on the k6 side to resolve the root cause, so that we will be able to provide a better-aggregated view of the different metrics.

I'm closing this issue; feel free to re-open it or add more observations directly in https://github.com/grafana/xk6-output-prometheus-remote/issues/10.