perosb closed this issue 2 weeks ago
Yeah, I doubt this is a solr operator bug. But have you tried running the same version of Solr and the Solr Prometheus exporter?
I believe this is an issue with the solr-prometheus-exporter. I've tried both the same version as Solr and the latest version. In my case, at least, it seems to be duplicating the series for the /admin/ping handler:
solr_metrics_core_time_seconds_total{category="ADMIN",handler="/admin/ping",core="my-cool-core",collection="my-cool-collection",shard="shard1",replica="replica_n1",base_url="http://solr-qa-solrcloud-0.solr-qa-solrcloud-headless.solr-qa:8983/solr",} 6.4896774849E7
solr_metrics_core_time_seconds_total{category="ADMIN",handler="/admin/ping",core="my-cool-core",collection="my-cool-collection",shard="shard1",replica="replica_n1",base_url="http://solr-qa-solrcloud-0.solr-qa-solrcloud-headless.solr-qa:8983/solr",} 0.0
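To check whether other handlers are affected too, the exporter output can be scanned for series that appear more than once with an identical label set. The following is only a minimal sketch: it assumes the simplified `name{labels} value` line shape shown above (no timestamps, no escaped quotes in label values), and the function name `find_duplicate_series` is hypothetical. The sample text shortens the label set from the lines above, and its third line is made up for contrast.

```python
import re
from collections import defaultdict

# Simplified matcher for `name{labels} value` lines. This is an assumption
# that covers the samples above, not the full Prometheus text format.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def find_duplicate_series(text):
    """Return {(metric_name, labels): [values]} for series emitted more than once."""
    seen = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # samples without labels are ignored in this sketch
        # Sort the label pairs so label order does not affect the comparison.
        labels = tuple(sorted(LABEL_RE.findall(match.group("labels"))))
        seen[(match.group("name"), labels)].append(float(match.group("value")))
    return {key: values for key, values in seen.items() if len(values) > 1}


# Shortened label sets; the /admin/system line is hypothetical.
sample = """\
solr_metrics_core_time_seconds_total{category="ADMIN",handler="/admin/ping",core="my-cool-core",} 6.4896774849E7
solr_metrics_core_time_seconds_total{category="ADMIN",handler="/admin/ping",core="my-cool-core",} 0.0
solr_metrics_core_time_seconds_total{category="ADMIN",handler="/admin/system",core="my-cool-core",} 1.5
"""

if __name__ == "__main__":
    for (name, labels), values in find_duplicate_series(sample).items():
        print(name, dict(labels), values)
```

To run it against a live exporter, save the scraped `/metrics` output to a file and pass its contents to `find_duplicate_series` instead of `sample`.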
That is really strange. Maybe post to the Solr users mailing list and see if anyone can help?
Can you share the prometheus-exporter config file you are using?
cat /opt/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
<?xml version="1.0" encoding="UTF-8" ?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<config>
<!--
Templates to help reduce jq boilerplate used by many metrics in this config;
mainly intended for metrics that don't require a bunch of jq magic to work and are mostly boilerplate.
A regex with named groups is used to match template references to template + vars using the basic pattern:
$jq:<TEMPLATE>( <UNIQUE>, <KEYSELECTOR>, <METRIC>, <TYPE> )
For instance,
$jq:core(requests_total, endswith(".requestTimes"), count, COUNTER)
TEMPLATE = core
UNIQUE = requests_total (unique suffix for this metric, results in a metric named "solr_metrics_core_requests_total")
KEYSELECTOR = endswith(".requestTimes") (filter to select the specific key for this metric)
METRIC = count
TYPE = COUNTER
Some templates may have a default type, so you can omit that from your template reference, such as:
$jq:core(requests_total, endswith(".requestTimes"), count)
Uses the defaultType=COUNTER as many uses of the core template are counts.
If a template reference omits the metric, then the unique suffix is used, for instance:
$jq:core-query(1minRate, endswith(".distrib.requestTimes"))
Creates a GAUGE metric (default type) named "solr_metrics_core_query_1minRate" using the 1minRate value from the selected JSON object.
Add templates as needed, three metrics using the same structure feels about right as the threshold for creating a new template.
-->
<jq-templates>
<template name="core-query" defaultType="GAUGE">
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | {KEYSELECTOR} | select (.value | type == "object") as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[1] as $handler |
select($category | startswith("QUERY")) |
select($handler | startswith("/")) |
{METRIC} as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_query_{UNIQUE}",
type: "{TYPE}",
help: "See: https://lucene.apache.org/solr/guide/performance-statistics-reference.html",
label_names: ["category", "searchHandler", "core"],
label_values: [$category, $handler, $core],
value: $value
}
else
{
name: "solr_metrics_core_query_{UNIQUE}",
type: "{TYPE}",
help: "See: https://lucene.apache.org/solr/guide/performance-statistics-reference.html",
label_names: ["category", "searchHandler", "core", "collection", "shard", "replica"],
label_values: [$category, $handler, $core, $collection, $shard, $replica],
value: $value
}
end
</template>
<template name="core" defaultType="COUNTER">
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | {KEYSELECTOR} as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[1] as $handler |
select($handler | startswith("/")) |
{METRIC} as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_{UNIQUE}",
type: "{TYPE}",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core"],
label_values: [$category, $handler, $core],
value: $value
}
else
{
name: "solr_metrics_core_{UNIQUE}",
type: "{TYPE}",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core", "collection", "shard", "replica"],
label_values: [$category, $handler, $core, $collection, $shard, $replica],
value: $value
}
end
</template>
<template name="update-handler" defaultType="COUNTER">
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | {KEYSELECTOR} as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[1] as $handler |
{METRIC} as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_{UNIQUE}",
type: "{TYPE}",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core"],
label_values: [$category, $handler, $core],
value: $value
}
else
{
name: "solr_metrics_core_{UNIQUE}",
type: "{TYPE}",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core", "collection", "shard", "replica"],
label_values: [$category, $handler, $core, $collection, $shard, $replica],
value: $value
}
end
</template>
<template name="node" defaultType="COUNTER">
.metrics["solr.node"] | to_entries | .[] | {KEYSELECTOR} as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[1] as $handler |
{METRIC} as $value |
{
name : "solr_metrics_node_{UNIQUE}",
type : "{TYPE}",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["category", "handler"],
label_values : [$category, $handler],
value : $value
}
</template>
<template name="cache-searcher" defaultType="GAUGE">
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key | startswith("CACHE.searcher.")) | select (.key | endswith("documentCache") or endswith("fieldValueCache") or endswith("filterCache") or endswith("perSegFilter") or endswith("queryResultCache")) as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[2] as $type |
$object.value | to_entries | .[] | {KEYSELECTOR} as $target |
$target.key as $item |
{METRIC} as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_searcher_{UNIQUE}",
type: "{TYPE}",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "type", "item"],
label_values: [$category, $core, $type, $item],
value: $value
}
else
{
name: "solr_metrics_core_searcher_{UNIQUE}",
type: "{TYPE}",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "collection", "shard", "replica", "type", "item"],
label_values: [$category, $core, $collection, $shard, $replica, $type, $item],
value: $value
}
end
</template>
<template name="node-thread-pool" defaultType="COUNTER">
.metrics["solr.node"] | to_entries | .[] | select(.key | contains(".threadPool.")) | {KEYSELECTOR} as $object |
$object.key | split(".") as $key_items |
$key_items | length as $label_len |
$key_items[0] as $category |
(if $label_len >= 5 then $key_items[1] else "" end) as $handler |
(if $label_len >= 5 then $key_items[3] else $key_items[2] end) as $executor |
{METRIC} as $value |
{
name : "solr_metrics_node_thread_pool_{UNIQUE}",
type : "{TYPE}",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["category", "handler", "executor"],
label_values : [$category, $handler, $executor],
value : $value
}
</template>
<template name="jvm-item" defaultType="GAUGE">
.metrics["solr.jvm"] | to_entries | .[] | {KEYSELECTOR} as $object |
$object.key | split(".") | last as $item |
{METRIC} as $value |
{
name : "solr_metrics_jvm_{UNIQUE}",
type : "{TYPE}",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["item"],
label_values : [$item],
value : $value
}
</template>
</jq-templates>
<rules>
<ping>
<lst name="request">
<lst name="query">
<str name="path">/admin/ping</str>
</lst>
<arr name="jsonQueries">
<str>
. as $object | $object |
(if $object.status == "OK" then 1.0 else 0.0 end) as $value |
{
name : "solr_ping",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/ping.html",
label_names : [],
label_values : [],
value : $value
}
</str>
</arr>
</lst>
</ping>
<metrics>
<lst name="request">
<lst name="query">
<str name="path">/admin/metrics</str>
<lst name="params">
<!--
trim some of these expressions as needed if you don't care about
a particular group of metrics.
-->
<str name="expr">solr\.jetty:.*DefaultHandler.*</str>
<str name="expr">solr\.jvm:.*</str>
<str name="expr">solr\.node:.*</str>
<str name="expr">solr\.overseer:.*</str>
<str name="expr">solr\.core\..*:QUERY\..*</str>
<str name="expr">solr\.core\..*:ADMIN\..*</str>
<str name="expr">solr\.core\..*:CACHE\..*</str>
<str name="expr">solr\.core\..*:UPDATE\.updateHandler\..*</str>
<str name="expr">solr\.core\..*:CORE\.fs\..*</str>
<str name="expr">solr\.core\..*:HIGHLIGHTER\..*</str>
<str name="expr">solr\.core\..*:INDEX\..*</str>
<str name="expr">solr\.core\..*:REPLICATION\.replication\..*</str>
<str name="expr">solr\.core\..*:SEARCHER\.searcher\..*</str>
<!-- Alternative expressions, which are much stricter but still provide
enough data to populate the default dashboard.
These expressions omit many unused properties of the complex metrics,
and also skip whole groups of rarely used metrics: core ADMIN, REPLICATION,
HIGHLIGHTER, and selects only the most common QUERY handlers.
In order to use these expressions remove the default list of expressions
above and the START / END lines below. -->
<!-- === START ===
<str name="expr">solr\.jetty:.*\.DefaultHandler\.(dispatches|.*-requests|.*xx-responses):count</str>
<str name="expr">solr\.jvm:(buffers|gc).*</str>
<str name="expr">solr\.jvm:memory\.(heap|non-heap|pools)\.*\.usage</str>
<str name="expr">solr\.jvm:memory\.total</str>
<str name="expr">solr\.jvm:os\..*(FileDescriptorCount|Load.*|Size|processCpuTime)</str>
<str name="expr">solr\.jvm:threads\..*count</str>
<str name="expr">solr\.node:CONTAINER\.(cores|fs).*</str>
<str name="expr">solr\.core\..*:CORE\.fs\..*Space</str>
<str name="expr">solr\.core\..*:INDEX\.sizeInBytes</str>
<str name="expr">solr\.core\..*:QUERY\./(select|get|export|stream|query|graph|sql)\..*requestTimes:(count|1minRate|5minRate|median_ms|meanRate|p75_ms|p95_ms|p99_ms)</str>
<str name="expr">solr\.core\..*:QUERY\./(select|get|export|stream|query|graph|sql)\.totalTime</str>
<str name="expr">solr\.core\..*:QUERY\./(select|get|export|stream|query|graph|sql)\..*rrors:(count|1minRate)</str>
<str name="expr">solr\.core\..*:SEARCHER\.searcher\..*Doc.*</str>
<str name="expr">solr\.core\..*:UPDATE\.updateHandler\..*</str>
<str name="expr">solr\.core\..*:CACHE\..*</str>
=== END === -->
</lst>
</lst>
<arr name="jsonQueries">
<!--
jetty metrics
-->
<str>
.metrics["solr.jetty"] | to_entries | .[] | select(.key | startswith("org.eclipse.jetty.server.handler.DefaultHandler")) | select(.key | endswith("xx-responses")) as $object |
$object.key | split(".") | last | split("-") | first as $status |
$object.value.count as $value |
{
name : "solr_metrics_jetty_response_total",
type : "COUNTER",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["status"],
label_values : [$status],
value : $value
}
</str>
<str>
.metrics["solr.jetty"] | to_entries | .[] | select(.key | startswith("org.eclipse.jetty.server.handler.DefaultHandler.")) | select(.key | endswith("-requests")) | select (.value | type == "object") as $object |
$object.key | split(".") | last | split("-") | first as $method |
$object.value.count as $value |
{
name : "solr_metrics_jetty_requests_total",
type : "COUNTER",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["method"],
label_values : [$method],
value : $value
}
</str>
<str>
.metrics["solr.jetty"] | to_entries | .[] | select(.key == "org.eclipse.jetty.server.handler.DefaultHandler.dispatches") as $object |
$object.value.count as $value |
{
name : "solr_metrics_jetty_dispatches_total",
type : "COUNTER",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : [],
label_values : [],
value : $value
}
</str>
<!--
jvm metrics
-->
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key | startswith("buffers.")) | select(.key | endswith(".Count")) as $object |
$object.key | split(".")[1] as $pool |
$object.value as $value |
{
name : "solr_metrics_jvm_buffers",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["pool"],
label_values : [$pool],
value : $value
}
</str>
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key | startswith("buffers.")) | select(.key | (endswith(".MemoryUsed") or endswith(".TotalCapacity"))) as $object |
$object.key | split(".")[1] as $pool |
$object.key | split(".") | last as $item |
$object.value as $value |
{
name : "solr_metrics_jvm_buffers_bytes",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["pool", "item"],
label_values : [$pool, $item],
value : $value
}
</str>
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key | startswith("gc.")) | select(.key | endswith(".count")) as $object |
$object.key | split(".")[1] as $item |
$object.value as $value |
{
name : "solr_metrics_jvm_gc_total",
type : "COUNTER",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["item"],
label_values : [$item],
value : $value
}
</str>
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key | startswith("gc.")) | select(.key | endswith(".time")) as $object |
$object.key | split(".")[1] as $item |
($object.value / 1000) as $value |
{
name : "solr_metrics_jvm_gc_seconds_total",
type : "COUNTER",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["item"],
label_values : [$item],
value : $value
}
</str>
<str>
$jq:jvm-item(memory_heap_bytes,
select(.key | startswith("memory.heap.")) | select(.key | endswith(".usage") | not),
object.value)
</str>
<str>
$jq:jvm-item(memory_non_heap_bytes,
select(.key | startswith("memory.non-heap.")) | select(.key | endswith(".usage") | not),
object.value)
</str>
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key | startswith("memory.pools.")) | select(.key | endswith(".usage") | not) as $object |
$object.key | split(".")[2] as $space |
$object.key | split(".") | last as $item |
$object.value as $value |
{
name : "solr_metrics_jvm_memory_pools_bytes",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["space", "item"],
label_values : [$space, $item],
value : $value
}
</str>
<str>
$jq:jvm-item(memory_bytes, select(.key | startswith("memory.total.")), object.value)
</str>
<str>
$jq:jvm-item(os_memory_bytes,
select(.key == "os.committedVirtualMemorySize" or .key == "os.freePhysicalMemorySize" or .key == "os.freeSwapSpaceSize" or .key =="os.totalPhysicalMemorySize" or .key == "os.totalSwapSpaceSize"),
object.value)
</str>
<str>
$jq:jvm-item(os_file_descriptors, select(.key == "os.maxFileDescriptorCount" or .key == "os.openFileDescriptorCount"), object.value)
</str>
<str>
$jq:jvm-item(os_cpu_load, select(.key == "os.processCpuLoad" or .key == "os.systemCpuLoad"), object.value)
</str>
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key == "os.processCpuTime") as $object |
($object.value / 1000.0) as $value |
{
name : "solr_metrics_jvm_os_cpu_time_seconds",
type : "COUNTER",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["item"],
label_values : ["processCpuTime"],
value : $value
}
</str>
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key == "os.systemLoadAverage") as $object |
$object.value as $value |
{
name : "solr_metrics_jvm_os_load_average",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["item"],
label_values : ["systemLoadAverage"],
value : $value
}
</str>
<str>
.metrics["solr.jvm"] | to_entries | .[] | select(.key | startswith("threads.")) | select(.key | endswith(".count")) as $object |
$object.key | split(".")[1] as $item |
$object.value as $value |
{
name : "solr_metrics_jvm_threads",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["item"],
label_values : [$item],
value : $value
}
</str>
<!--
overseer metrics
-->
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.overseer")) as $object |
$object.value as $value | $value | to_entries | .[] |
select(.key | startswith("queue.") and endswith("collectionWorkQueueSize")) as $object |
$object.value as $value |
{
name : "solr_metrics_overseer_collectionWorkQueueSize",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : [],
label_values : [],
value : $value
}
</str>
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.overseer")) as $object |
$object.value as $value | $value | to_entries | .[] |
select(.key | startswith("queue.") and endswith("stateUpdateQueueSize")) as $object |
$object.value as $value |
{
name : "solr_metrics_overseer_stateUpdateQueueSize",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : [],
label_values : [],
value : $value
}
</str>
<!--
node metrics
-->
<str>
$jq:node(client_errors_total, select(.key | endswith(".clientErrors")), count)
</str>
<str>
$jq:node(errors_total, select(.key | endswith(".errors")), count)
</str>
<str>
$jq:node(requests_total, select(.key | endswith(".local.requestTimes")), count)
</str>
<str>
$jq:node(server_errors_total, select(.key | endswith(".serverErrors")), count)
</str>
<str>
$jq:node(timeouts_total, select(.key | endswith(".timeouts")), count)
</str>
<str>
$jq:node(time_seconds_total, select(.key | endswith(".local.totalTime")), ($object.value / 1000))
</str>
<str>
.metrics["solr.node"] | to_entries | .[] | select(.key | startswith("CONTAINER.cores.")) as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[2] as $item |
$object.value as $value |
{
name : "solr_metrics_node_cores",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["category", "item"],
label_values : [$category, $item],
value : $value
}
</str>
<str>
.metrics["solr.node"] | to_entries | .[] | select(.key | startswith("CONTAINER.fs.coreRoot.")) | select(.key | endswith(".totalSpace") or endswith(".usableSpace")) as $object |
$object.key | split(".") as $key_items |
$key_items | length as $label_len |
$key_items[0] as $category |
$key_items[3] as $item |
$object.value as $value |
{
name : "solr_metrics_node_core_root_fs_bytes",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["category", "item"],
label_values : [$category, $item],
value : $value
}
</str>
<str>
$jq:node-thread-pool(completed_total, select(.key | endswith(".completed")), count)
</str>
<str>
$jq:node-thread-pool(running, select(.key | endswith(".running")), object.value, GAUGE)
</str>
<str>
$jq:node-thread-pool(submitted_total, select(.key | endswith(".submitted")), count)
</str>
<str>
.metrics["solr.node"] | to_entries | .[] | select(.key | endswith("Connections")) as $object |
$object.key | split(".") as $key_items |
$key_items | length as $label_len |
$key_items[0] as $category |
$key_items[1] as $handler |
$key_items[2] as $item |
$object.value as $value |
{
name : "solr_metrics_node_connections",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names : ["category", "handler", "item"],
label_values : [$category, $handler, $item],
value : $value
}
</str>
<!--
Query related core metrics; see jq-templates for details on the core-query template used below
-->
<str>
$jq:core-query(errors_1minRate, select(.key | endswith(".errors")), 1minRate)
</str>
<str>
$jq:core-query(client_errors_1minRate, select(.key | endswith(".clientErrors")), 1minRate)
</str>
<str>
$jq:core-query(1minRate, select(.key | endswith(".distrib.requestTimes")), 1minRate)
</str>
<str>
$jq:core-query(5minRate, select(.key | endswith(".distrib.requestTimes")), 5minRate)
</str>
<str>
$jq:core-query(median_ms, select(.key | endswith(".distrib.requestTimes")), median_ms)
</str>
<str>
$jq:core-query(p75_ms, select(.key | endswith(".distrib.requestTimes")), p75_ms)
</str>
<str>
$jq:core-query(p95_ms, select(.key | endswith(".distrib.requestTimes")), p95_ms)
</str>
<str>
$jq:core-query(p99_ms, select(.key | endswith(".distrib.requestTimes")), p99_ms)
</str>
<str>
$jq:core-query(mean_rate, select(.key | endswith(".distrib.requestTimes")), meanRate)
</str>
<!-- Local (non-distrib) query metrics -->
<str>
$jq:core-query(local_1minRate, select(.key | endswith(".local.requestTimes")), 1minRate)
</str>
<str>
$jq:core-query(local_5minRate, select(.key | endswith(".local.requestTimes")), 5minRate)
</str>
<str>
$jq:core-query(local_median_ms, select(.key | endswith(".local.requestTimes")), median_ms)
</str>
<str>
$jq:core-query(local_p75_ms, select(.key | endswith(".local.requestTimes")), p75_ms)
</str>
<str>
$jq:core-query(local_p95_ms, select(.key | endswith(".local.requestTimes")), p95_ms)
</str>
<str>
$jq:core-query(local_p99_ms, select(.key | endswith(".local.requestTimes")), p99_ms)
</str>
<str>
$jq:core-query(local_mean_rate, select(.key | endswith(".local.requestTimes")), meanRate)
</str>
<str>
$jq:core-query(local_count, select(.key | endswith(".local.requestTimes")), count, COUNTER)
</str>
<!-- core metrics other than query -->
<str>
$jq:core(client_errors_total, select(.key | endswith(".clientErrors")), count)
</str>
<str>
$jq:core(errors_total, select(.key | endswith(".errors")) | select (.value | type == "object"), count)
</str>
<str>
$jq:core(requests_total, select(.key | endswith(".requestTimes")) | select (.value | type == "object"), count)
</str>
<str>
$jq:core(server_errors_total, select(.key | endswith(".serverErrors")) | select (.value | type == "object"), count)
</str>
<str>
$jq:core(timeouts_total, select(.key | endswith(".timeouts")) | select (.value | type == "object"), count)
</str>
<str>
$jq:core(time_seconds_total, select(.key | endswith(".totalTime")), ($object.value / 1000))
</str>
<str>
.metrics | to_entries | .[] | select (.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key == "CACHE.core.fieldCache") as $object |
$object.key | split(".")[0] as $category |
$object.value.entries_count as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_field_cache_total",
type: "COUNTER",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core"],
label_values: [$category, $core],
value: $value
}
else
{
name: "solr_metrics_core_field_cache_total",
type: "COUNTER",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "collection", "shard", "replica"],
label_values: [$category, $core, $collection, $shard, $replica],
value: $value
}
end
</str>
<str>
$jq:update-handler(update_handler_adds, select(.key == "UPDATE.updateHandler.adds"), object.value, GAUGE)
</str>
<str>
$jq:update-handler(update_handler_auto_commits_total, select(.key == "UPDATE.updateHandler.autoCommits"), object.value)
</str>
<str>
$jq:update-handler(update_handler_commits_total, select(.key == "UPDATE.updateHandler.commits"), count)
</str>
<str>
$jq:update-handler(update_handler_adds_total, select(.key == "UPDATE.updateHandler.cumulativeAdds"), count)
</str>
<str>
$jq:update-handler(update_handler_deletes_by_id_total, select(.key == "UPDATE.updateHandler.cumulativeDeletesById"), count)
</str>
<str>
$jq:update-handler(update_handler_deletes_by_query_total, select(.key == "UPDATE.updateHandler.cumulativeDeletesByQuery"), count)
</str>
<str>
$jq:update-handler(update_handler_errors_total, select(.key == "UPDATE.updateHandler.cumulativeErrors"), count)
</str>
<str>
$jq:update-handler(update_handler_deletes_by_id, select(.key == "UPDATE.updateHandler.deletesById"), object.value, GAUGE)
</str>
<str>
$jq:update-handler(update_handler_deletes_by_query, select(.key == "UPDATE.updateHandler.deletesByQuery"), object.value, GAUGE)
</str>
<str>
$jq:update-handler(update_handler_pending_docs, select(.key == "UPDATE.updateHandler.docsPending"), object.value, GAUGE)
</str>
<str>
$jq:update-handler(update_handler_errors, select(.key == "UPDATE.updateHandler.errors"), object.value, GAUGE)
</str>
<str>
$jq:update-handler(update_handler_expunge_deletes_total, select(.key == "UPDATE.updateHandler.expungeDeletes"), count)
</str>
<str>
$jq:update-handler(update_handler_merges_total, select(.key == "UPDATE.updateHandler.merges"), count)
</str>
<str>
$jq:update-handler(update_handler_optimizes_total, select(.key == "UPDATE.updateHandler.optimizes"), count)
</str>
<str>
$jq:update-handler(update_handler_rollbacks_total, select(.key == "UPDATE.updateHandler.rollbacks"), count)
</str>
<str>
$jq:update-handler(update_handler_soft_auto_commits_total, select(.key == "UPDATE.updateHandler.softAutoCommits"), object.value)
</str>
<str>
$jq:update-handler(update_handler_splits_total, select(.key == "UPDATE.updateHandler.splits"), count)
</str>
<str>
$jq:cache-searcher(cache, select(.key == "lookups" or .key == "hits" or .key == "size" or .key == "evictions" or .key == "inserts"), $target.value)
</str>
<str>
$jq:cache-searcher(cache_ratio, select(.key == "hitratio"), $target.value)
</str>
<str>
$jq:cache-searcher(warmup_time_seconds, select(.key == "warmupTime"), ($target.value / 1000))
</str>
<str>
$jq:cache-searcher(cumulative_cache_total,
select(.key == "cumulative_lookups" or .key == "cumulative_hits" or .key == "cumulative_evictions" or .key == "cumulative_inserts"),
$target.value,
COUNTER)
</str>
<str>
$jq:cache-searcher(cumulative_cache_ratio, select(.key == "cumulative_hitratio"), $target.value)
</str>
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key | startswith("CORE.fs.")) | select (.key | endswith(".totalSpace") or endswith(".usableSpace")) as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[2] as $item |
$object.value as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_fs_bytes",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "item"],
label_values: [$category, $core, $item],
value: $value
}
else
{
name: "solr_metrics_core_fs_bytes",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "collection", "shard", "replica", "item"],
label_values: [$category, $core, $collection, $shard, $replica, $item],
value: $value
}
end
</str>
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key | startswith("HIGHLIGHTER.")) | select (.key | endswith(".requests")) as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[1] as $name |
$object.key | split(".")[2] as $item |
$object.value as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_highlighter_request_total",
type: "COUNTER",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "name", "item"],
label_values: [$category, $core, $name, $item],
value: $value
}
else
{
name: "solr_metrics_core_highlighter_request_total",
type: "COUNTER",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "collection", "shard", "replica", "name", "item"],
label_values: [$category, $core, $collection, $shard, $replica, $name, $item],
value: $value
}
end
</str>
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key == "INDEX.sizeInBytes") as $object |
$object.key | split(".")[0] as $category |
$object.value as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_index_size_bytes",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core"],
label_values: [$category, $core],
value: $value
}
else
{
name: "solr_metrics_core_index_size_bytes",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "collection", "shard", "replica"],
label_values: [$category, $core, $collection, $shard, $replica],
value: $value
}
end
</str>
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key == "REPLICATION./replication.isMaster") as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[1] as $handler |
(if $object.value == true then 1.0 else 0.0 end) as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_replication_master",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core"],
label_values: [$category, $handler, $core],
value: $value
}
else
{
name: "solr_metrics_core_replication_master",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core", "collection", "shard", "replica"],
label_values: [$category, $handler, $core, $collection, $shard, $replica],
value: $value
}
end
</str>
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key == "REPLICATION./replication.isSlave") as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[1] as $handler |
(if $object.value == true then 1.0 else 0.0 end) as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_replication_slave",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core"],
label_values: [$category, $handler, $core],
value: $value
}
else
{
name: "solr_metrics_core_replication_slave",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "handler", "core", "collection", "shard", "replica"],
label_values: [$category, $handler, $core, $collection, $shard, $replica],
value: $value
}
end
</str>
<str>
.metrics | to_entries | .[] | select(.key | startswith("solr.core.")) as $parent |
$parent.key | split(".") as $parent_key_items |
$parent_key_items | length as $parent_key_item_len |
(if $parent_key_item_len == 3 then $parent_key_items[2] else "" end) as $core |
(if $parent_key_item_len == 5 then $parent_key_items[2] else "" end) as $collection |
(if $parent_key_item_len == 5 then $parent_key_items[3] else "" end) as $shard |
(if $parent_key_item_len == 5 then $parent_key_items[4] else "" end) as $replica |
(if $parent_key_item_len == 5 then ($collection + "_" + $shard + "_" + $replica) else $core end) as $core |
$parent.value | to_entries | .[] | select(.key == "SEARCHER.searcher.deletedDocs" or .key == "SEARCHER.searcher.maxDoc" or .key == "SEARCHER.searcher.numDocs") as $object |
$object.key | split(".")[0] as $category |
$object.key | split(".")[2] as $item |
$object.value as $value |
if $parent_key_item_len == 3 then
{
name: "solr_metrics_core_searcher_documents",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "item"],
label_values: [$category, $core, $item],
value: $value
}
else
{
name: "solr_metrics_core_searcher_documents",
type: "GAUGE",
help: "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
label_names: ["category", "core", "collection", "shard", "replica", "item"],
label_values: [$category, $core, $collection, $shard, $replica, $item],
value: $value
}
end
</str>
</arr>
</lst>
</metrics>
<collections>
<lst name="request">
<lst name="query">
<str name="path">/admin/collections</str>
<lst name="params">
<str name="action">CLUSTERSTATUS</str>
</lst>
</lst>
<arr name="jsonQueries">
<str>
.cluster.live_nodes | length as $value|
{
name : "solr_collections_live_nodes",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/collections-api.html#clusterstatus",
label_names : [],
label_values : [],
value : $value
}
</str>
<str>
.cluster.collections | to_entries | .[] | . as $object |
$object.key as $collection |
$object.value.pullReplicas | tonumber as $value |
{
name : "solr_collections_pull_replicas",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/collections-api.html#clusterstatus",
label_names : ["collection"],
label_values : [$collection],
value : $value
}
</str>
<str>
.cluster.collections | to_entries | .[] | . as $object |
$object.key as $collection |
$object.value.nrtReplicas | tonumber as $value |
{
name : "solr_collections_nrt_replicas",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/collections-api.html#clusterstatus",
label_names : ["collection"],
label_values : [$collection],
value : $value
}
</str>
<str>
.cluster.collections | to_entries | .[] | . as $object |
$object.key as $collection |
$object.value.tlogReplicas | tonumber as $value |
{
name : "solr_collections_tlog_replicas",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/collections-api.html#clusterstatus",
label_names : ["collection"],
label_values : [$collection],
value : $value
}
</str>
<str>
.cluster.collections | to_entries | .[] | . as $object |
$object.key as $collection |
$object.value.shards | to_entries | .[] | . as $shard_obj |
$shard_obj.key as $shard |
(if $shard_obj.value.state == "active" then 1.0 else 0.0 end) as $value |
{
name : "solr_collections_shard_state",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/collections-api.html#clusterstatus",
label_names : ["collection","shard"],
label_values : [$collection,$shard],
value : $value
}
</str>
<str>
.cluster.collections | to_entries | .[] | . as $object |
$object.key as $collection |
$object.value.shards | to_entries | .[] | . as $shard_obj |
$shard_obj.key as $shard |
$shard_obj.value.replicas | to_entries | .[] | . as $replica_obj |
$replica_obj.key as $replica_name |
$replica_obj.value.core as $core |
$core[$collection + "_" + $shard + "_" | length:] as $replica |
$replica_obj.value.base_url as $base_url |
$replica_obj.value.node_name as $node_name |
$replica_obj.value.type as $type |
$replica_obj.value.state as $state |
(if $state == "active" then 1.0 else 0.0 end) as $value |
{
name : "solr_collections_replica_state",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/collections-api.html#clusterstatus",
label_names : ["collection", "shard", "replica", "replica_name", "core", "base_url", "node_name", "type", "state"],
label_values : [$collection, $shard, $replica, $replica_name, $core, $base_url, $node_name, $type, $state],
value : $value
}
</str>
<str>
.cluster.collections | to_entries | .[] | . as $object |
$object.key as $collection |
$object.value.shards | to_entries | .[] | . as $shard_obj |
$shard_obj.key as $shard |
$shard_obj.value.replicas | to_entries | .[] | . as $replica_obj |
$replica_obj.key as $replica_name |
$replica_obj.value.core as $core |
$core[$collection + "_" + $shard + "_" | length:] as $replica |
$replica_obj.value.base_url as $base_url |
$replica_obj.value.node_name as $node_name |
$replica_obj.value.type as $type |
(if $replica_obj.value.leader == "true" then 1.0 else 0.0 end) as $value |
{
name : "solr_collections_shard_leader",
type : "GAUGE",
help : "See following URL: https://lucene.apache.org/solr/guide/collections-api.html#clusterstatus",
label_names : ["collection", "shard", "replica", "replica_name", "core", "base_url", "node_name", "type"],
label_values : [$collection, $shard, $replica, $replica_name, $core, $base_url, $node_name, $type],
value : $value
}
</str>
</arr>
</lst>
<lst name="request">
<lst name="query">
<str name="path">/admin/zookeeper/status</str>
</lst>
<arr name="jsonQueries">
<str>
.zkStatus.ensembleSize as $value |
.zkStatus.mode as $mode |
{
name : "solr_zookeeper_ensemble_size",
type : "GAUGE",
help : "See following URL: https://solr.apache.org/guide/cloud-screens.html#zk-status-view",
label_names : [],
label_values : [],
value : $value
}
</str>
<str>
.zkStatus.details[] as $object |
$object.host as $host |
$object.ok as $ok |
(if $object.clientPort != null and $ok then 1.0 else 0.0 end) as $value |
{
name : "solr_zookeeper_nodestatus",
type : "GAUGE",
help : "See following URL: https://solr.apache.org/guide/cloud-screens.html#zk-status-view",
label_names : ["host"],
label_values : [$host],
value : $value
}
</str>
<str>
.zkStatus.status as $statusText |
(if $statusText == "green" then 1.0 else 0.0 end) as $value |
{
name : "solr_zookeeper_status",
type : "GAUGE",
help : "See following URL: https://solr.apache.org/guide/cloud-screens.html#zk-status-view",
label_names : ["status"],
label_values : [$statusText],
value : $value
}
</str>
</arr>
</lst>
</collections>
<!--
<search>
<lst name="request">
<lst name="query">
<str name="collection">collection1</str>
<str name="path">/select</str>
<lst name="params">
<str name="q">*:*</str>
<str name="start">0</str>
<str name="rows">0</str>
<str name="json.facet">
{
category: {
type: terms,
field: cat
}
}
</str>
</lst>
</lst>
<arr name="jsonQueries">
<str>
.facets.category.buckets[] as $object |
$object.val as $term |
$object.count as $value |
{
name : "solr_facets_category",
type : "GAUGE",
help : "Category facets",
label_names : ["term"],
label_values : [$term],
value : $value
}
</str>
</arr>
</lst>
</search>
-->
</rules>
</config>
This should be the default config that ships with 8.11.3; I saw the same behavior with 9.x as well.
Did a bit of digging, and I think the cause is that the metrics API in Solr differs between 8 and 9.
Solr 8:
"ADMIN./admin/ping.totalTime":4869095628, "ADMIN./admin/ping.distrib.totalTime":2035581611, "ADMIN./admin/ping.local.totalTime":0,
Solr 9:
"ADMIN./admin/ping.totalTime":3739370399, "ADMIN./admin/ping.totalTime":3994744966,
I think the Prometheus exporter is scraping Solr 8's API but doesn't append a label for distrib. Solr 9 had a change that removed distrib, which more or less fixed the problem there. I haven't actually tested this, I only dug around, but the config for Solr 8's Prometheus exporter probably needs to be changed to remove the duplicate metric. I'd give it a shot if I were you, or maybe delete the <str name="expr">solr\.core\..*:ADMIN\..*</str> line so the exporter stops outputting that metric.
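To illustrate the hypothesis above, here is a minimal Python sketch. It assumes the exporter derives the category and handler labels from the first two dot-separated segments of the metric key (the same `split(".")[0]` / `split(".")[1]` pattern visible in the config) and selects keys by their `.totalTime` suffix. The suffix selection is an assumption for illustration, not something confirmed in this thread.

```python
from collections import Counter

# Metric keys quoted above for Solr 8.
solr8_keys = {
    "ADMIN./admin/ping.totalTime": 4869095628,
    "ADMIN./admin/ping.distrib.totalTime": 2035581611,
    "ADMIN./admin/ping.local.totalTime": 0,
}


def labels_for(key):
    """Mimic split(".")[0] as $category and split(".")[1] as $handler."""
    parts = key.split(".")
    return (parts[0], parts[1])


# Count how many keys collapse onto the same (category, handler) label pair.
label_counts = Counter(
    labels_for(key) for key in solr8_keys if key.endswith(".totalTime")
)
duplicated = {labels: n for labels, n in label_counts.items() if n > 1}
print(duplicated)
```

Under these assumptions all three keys end up with the same label pair, because the distrib/local segment is never turned into a label.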
Thanks for digging into this. I removed the ADMIN parts from the exporter config and those duplicates are gone; however, duplicates still occur for the more important QUERY metrics. Those metrics are more interesting and should probably stay.
Got it. What if you try updating the expression to omit distrib and local?
Try using this expression instead to grab all metrics except the admin/ping ones containing distrib and local:
solr\.core\..*:ADMIN\.\/admin\/ping\.(?!distrib)(?!local).*
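A rough way to sanity-check that expression outside Solr is to run its metric-name part against the Solr 8 keys quoted earlier. This is only a sketch: it uses Python's `re` rather than Solr's Java regex engine (both support negative lookahead), it assumes the whole metric name has to match, and it leaves out the `solr\.core\..*:` registry prefix.

```python
import re

# Metric-name part of the suggested expression (registry prefix omitted).
PING_PATTERN = re.compile(r"ADMIN\.\/admin\/ping\.(?!distrib)(?!local).*")

# Keys quoted earlier in the thread for Solr 8.
solr8_keys = [
    "ADMIN./admin/ping.totalTime",
    "ADMIN./admin/ping.distrib.totalTime",
    "ADMIN./admin/ping.local.totalTime",
]

# Keep only keys whose segment after the handler is neither distrib nor local.
kept = [key for key in solr8_keys if PING_PATTERN.fullmatch(key)]
print(kept)
```

With these inputs only the plain `ADMIN./admin/ping.totalTime` key survives the filter.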
Thanks for the help @mlbiscoc.
I changed the exporter config and activated the alternative expressions, which are much stricter but still provide enough data to populate the default dashboard.
Then I added (?!distrib)(?!local) as you said:
<str name="expr">solr\.core\..*:QUERY\./(select|get|export|stream|query|graph|sql)\.(?!distrib)(?!local).*requestTimes:(count|1minRate|5minRate|median_ms|meanRate|p75_ms|p95_ms|p99_ms)</str>
<str name="expr">solr\.core\..*:QUERY\./(select|get|export|stream|query|graph|sql)\.totalTime</str>
<str name="expr">solr\.core\..*:QUERY\./(select|get|export|stream|query|graph|sql)\.(?!distrib)(?!local).*rrors:(count!1minRate)</str>
Edit: Just noticed I'm not getting any Query metrics tho :scream: Need to revisit this later.... :cry:
Same issue. Did you find a solution in the end?
No, not yet, but I assume the idea of excluding distrib|local should work.
I might have figured out how to fix the issue @perosb.
The alternative expressions exclude the metrics we are interested in, so I had to modify the default pattern, from this:
<str name="expr">solr\.core\..*:QUERY\..*</str>
To this:
<str name="expr">solr\.core\..*:QUERY\.[^.]*\.(?!distrib|local).*</str>
And it's working as @mlbiscoc was expecting.
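For anyone who wants to see what the modified pattern changes, here is a small Python sketch comparing the metric-name part of the default and the modified expression. The QUERY key names are assumptions modeled on the ADMIN keys quoted earlier in the thread, and Python's `re` stands in for Solr's Java regex engine.

```python
import re

# Metric-name parts of the two expressions (registry prefix omitted).
DEFAULT = re.compile(r"QUERY\..*")
MODIFIED = re.compile(r"QUERY\.[^.]*\.(?!distrib|local).*")

# Hypothetical keys, modeled on the ADMIN keys shown for Solr 8.
candidate_keys = [
    "QUERY./select.requestTimes",
    "QUERY./select.distrib.requestTimes",
    "QUERY./select.local.requestTimes",
    "QUERY./select.totalTime",
]


def kept_by(pattern):
    """Return the keys that the pattern matches in full."""
    return [key for key in candidate_keys if pattern.fullmatch(key)]


print(kept_by(DEFAULT))
print(kept_by(MODIFIED))
```

The `[^.]*` segment consumes exactly the handler name (it cannot cross a dot), so the lookahead is evaluated right where `distrib` or `local` would appear.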
FYI: this concerns solr_metrics_core_requests_total and solr_metrics_core_time_seconds_total. The pattern also drops the _local_ metrics such as solr_metrics_core_query_local_median_ms or solr_metrics_core_query_local_p99_ms (which are not used on our side); if you must keep them, you'll want to adapt the regex.
Thanks for your help Matthias
Solr 8.11 seems to be producing duplicate metrics, which are flooding the logs and triggering alerts.
This only happens when using the latest Prometheus, v2.52.0. Related: https://github.com/prometheus/prometheus/issues/14089
This doesn't seem to happen with Solr 9. It still happens when running Solr 8.11 with solr-exporter image tag 9.x.
This is the debug log with 1 collection and the related duplicates:
Disclaimer: this is probably a Solr bug rather than an operator bug?
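A quick way to check an exporter scrape for duplicates like these, as a Python sketch. It assumes the plain Prometheus text format without timestamps, so the sample value is whatever follows the last space on a line. The sample lines are shortened from the output quoted in this thread, and the third line is made up purely for contrast.

```python
from collections import Counter


def duplicate_series(exposition_text):
    """Return series (metric name plus label set) that appear more than once."""
    seen = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Everything before the last space identifies the series.
        series, _, _value = line.rpartition(" ")
        seen[series] += 1
    return [series for series, count in seen.items() if count > 1]


sample = """\
solr_metrics_core_time_seconds_total{category="ADMIN",handler="/admin/ping",core="my-cool-core",} 6.4896774849E7
solr_metrics_core_time_seconds_total{category="ADMIN",handler="/admin/ping",core="my-cool-core",} 0.0
solr_metrics_core_index_size_bytes{category="INDEX",core="my-cool-core",} 1024.0
"""

duplicates = duplicate_series(sample)
print(duplicates)
```

Feeding it the body of the exporter's metrics endpoint should list every series that is emitted more than once with an identical label set.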