grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki

logql: Metric aggregations don't work on labels extracted by logfmt parser if using parameters #11334

Open wbh1 opened 10 months ago

wbh1 commented 10 months ago

Describe the bug
When using the logfmt parser with parameters (e.g. | logfmt status, method="request_method") in metric queries, the extracted fields show up correctly in the logs sample when aggregating by a different field, but they cannot be used for aggregations, and filtering on them is inconsistent.

For example:

sum by (method) (
  rate({component="example"} | logfmt method="request_method", status | __error__ = `` | status != `` [1m])
)

That query returns no data because it chokes on the status != `` filter.

If I remove the status != `` filter (or change it to something like status!="404"), the query runs but returns incorrect results with only one series ({method=""}):

sum by (method) (
  rate({component="example"} | logfmt method="request_method", status | __error__ = `` [1m])
)

However, if I remove the parameters and instead use label_format, the query will succeed with the correct results (even with the status filter still in place):

sum by (method) (
  rate({component="example"} | logfmt | label_format method=request_method | __error__ = `` | status != `` [1m])
)

To Reproduce
Steps to reproduce the behavior:

  1. Run a metrics query using logfmt to extract specific fields
  2. Attempt to sum by an extracted field
  3. Query returns no data

Expected behavior
I expect to be able to aggregate by labels extracted by the logfmt parser, even when the labels to extract are specified via parameters.
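
Concretely, I expect both query forms shown above to return the same per-method series; today only the label_format version does:

sum by (method) (
  rate({component="example"} | logfmt method="request_method", status | __error__ = `` | status != `` [1m])
)

sum by (method) (
  rate({component="example"} | logfmt | label_format method=request_method | __error__ = `` | status != `` [1m])
)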

Environment:

Screenshots, Promtail config, or terminal output: N/A

wbh1 commented 4 days ago

We are on Loki 2.9.10 now and this is still an issue. Not sure if it's solved in Loki v3.

Unfortunately, after moving to the TSDB index, we've observed significant performance penalties when using label_format in metric queries (we're still investigating how that's related). In many cases, queries using label_format take >20x longer to complete over log volumes of ~15-20GB.
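
For context, the label_format form we've been comparing against looks roughly like this (a sketch based on the alternative queries below, not the exact query we benchmarked):

sum by (instance, datacenter, cluster, logicalcluster, environment, method, status, rgw_status)(
  rate(
    {component="ceph", instance=~"myserver.+", path="/var/log/nginx/access.log"}
      | logfmt
      | label_format method=request_method, rgw_status=us_statuses
      | __error__="" [1m]
  )
)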


For anyone else coming to this issue: I've found two alternatives to using label_format that are roughly equivalent to each other in performance. Both complete in ~9-11s over ~15GB of logs with a [1m] range selector, although the line_format approach seems to be slightly faster on average (and is more readable).

Using line_format and another logfmt stage

This feels dirty, but you can use line_format to effectively rewrite the log line (still in logfmt format) with only the K/V pairs you want, then run that rewritten line through the logfmt parser again.

sum by (instance, datacenter, cluster, logicalcluster, environment, method, status, rgw_status)(
  rate(
    {component="ceph", instance=~"myserver.+", path="/var/log/nginx/access.log"}
      | logfmt status, request_method, us_statuses
      | __error__=""
      | line_format `status="{{ .status }}" method="{{ .request_method }}" rgw_status="{{ .us_statuses }}"`
      | logfmt [1m]
  )
)

Using label_replace

This feels less hacky, but is far less readable.

sum by (instance, datacenter, cluster, logicalcluster, environment, method, status, rgw_status)(
  label_replace(
    label_replace(
      rate(
        {component="ceph", instance=~"myserver.+", path="/var/log/nginx/access.log"}
          | logfmt status, request_method, us_statuses
          | __error__="" [1m]
      ),
      "method",
      "$1",
      "request_method",
      "(.*)"
    ),
    "rgw_status",
    "$1",
    "us_statuses",
    "(.*)"
  )
)