Enhanced option for TaskType.command

Description

There exists a "command" task type which executes a URL command.

However, we have a use case of checking core status on all data nodes (instead of a randomly picked node) periodically (instead of once only)

Therefore we will want such task be:

Repeatable
With concurrency/rate/duration configurations and retry/interruption mechanism like other rate controlled tasks
Able to issue GET requests to multiple nodes with optional node type control (data/os/qa etc), instead of 1 randomly picked node in the existing command task type
Collect stats based on http response code, report via the conventional xml result file
Grafana support, report metrics dynamically with finer grain grouping (status code, endpoint url)

Solution

We will keep the existing "command" task type as is for backward compatibility while introducing a new task type org.apache.solr.benchmarks.task.UrlCommandTask that extends the org.apache.solr.benchmarks.task.AbstractTask. For our core status use case, the task type configuration looks like:

    "core-status": {
      "task-by-class": {
        "task-class": "org.apache.solr.benchmarks.task.UrlCommandTask",
        "name": "Periodically call core status",
        "params" : {
          "command" : "${SOLRURL}admin/cores?action=STATUS",
          "node-role": "DATA"
        },
        "min-threads": 1,
        "max-threads": 1,
        "rpm": 12,
        "duration-secs" : 3600
      }
    },

It retains the same url building logic as the existing "command" type (re-use the code with refactoring), but instead of resolving for a single url (of a random node), it resolves to a list of URL based on the cluster state and optional node-type param.

The new class also uses PrometheusExportManager.registerHistogram to report to grafana with command_url and http_status_code with metrics name solr_bench_url_command_duration_bucket

For example, on Grafana with this config:

histogram_quantile(.5, sum by (cluster, le, command_url, http_status_code) (rate(solr_bench_url_command_duration_bucket{}[$window] ) ) )

Produces real time graph :

fullstorydev / solr-bench

Enhanced option for TaskType.command #102

Description

Solution