grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Query-frontend always shows "index_bloom_filter_ratio" and "index_post_bloom_filter_chunks" equal to zero #13798

Open pavelchalyk opened 1 month ago

pavelchalyk commented 1 month ago

Describe the bug The query-frontend logs per-query performance stats, and the bloom-filter related fields ("index_bloom_filter_ratio" and "index_post_bloom_filter_chunks") are always zero, even though the bloom-gateway metrics show that filtering is happening. Running a query like {job="gcplog"} |= ip("10.16.0.0/16") I can see

in the bloom-gateway logs something like:

loki-dev-loki-distributed-bloom-gateway-0 {"blocks_fetch_time":"81.608µs","blocks_processed":1,"caller":"spanlogger.go:109","chunks_filtered":6,"chunks_remaining":0,"chunks_requested":6,"component":"bloom-gateway","duration":"61.951991ms","filter_ratio":1,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"6.148µs","processing_time":"47.554683ms","queue_time":"14.309552ms","series_filtered":1,"series_requested":1,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.466415927Z","user":"fake"}
loki-dev-loki-distributed-bloom-gateway-0 {"blocks_fetch_time":"46.053µs","blocks_processed":1,"caller":"spanlogger.go:109","chunks_filtered":0,"chunks_remaining":12,"chunks_requested":12,"component":"bloom-gateway","duration":"662.03µs","filter_ratio":0,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"4.011µs","processing_time":"518.834µs","queue_time":"93.132µs","series_filtered":0,"series_requested":1,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.486778237Z","user":"fake"}
loki-dev-loki-distributed-bloom-gateway-2 {"blocks_fetch_time":"111.993µs","blocks_processed":1,"caller":"spanlogger.go:109","chunks_filtered":11,"chunks_remaining":6,"chunks_requested":17,"component":"bloom-gateway","duration":"123.760052ms","filter_ratio":0.6470588235294118,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"8.221µs","processing_time":"123.508542ms","queue_time":"131.296µs","series_filtered":2,"series_requested":3,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.475721159Z","user":"fake"}
loki-dev-loki-distributed-bloom-gateway-2 {"blocks_fetch_time":"89.976µs","blocks_processed":1,"caller":"spanlogger.go:109","chunks_filtered":0,"chunks_remaining":10,"chunks_requested":10,"component":"bloom-gateway","duration":"73.907195ms","filter_ratio":0,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"5.525µs","processing_time":"62.083538ms","queue_time":"11.728156ms","series_filtered":0,"series_requested":1,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.485571902Z","user":"fake"}
loki-dev-loki-distributed-bloom-gateway-2 {"blocks_fetch_time":"89.976µs","blocks_processed":1,"caller":"spanlogger.go:109","chunks_filtered":0,"chunks_remaining":8,"chunks_requested":8,"component":"bloom-gateway","duration":"74.815977ms","filter_ratio":0,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"5.237µs","processing_time":"62.083538ms","queue_time":"12.637226ms","series_filtered":0,"series_requested":1,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.485566033Z","user":"fake"}
loki-dev-loki-distributed-bloom-gateway-2 {"blocks_fetch_time":"33.006µs","blocks_processed":1,"caller":"spanlogger.go:109","chunks_filtered":3,"chunks_remaining":0,"chunks_requested":3,"component":"bloom-gateway","duration":"83.120799ms","filter_ratio":1,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"3.78µs","processing_time":"83.01064ms","queue_time":"73.373µs","series_filtered":1,"series_requested":1,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.485632516Z","user":"fake"}
loki-dev-loki-distributed-bloom-gateway-2 {"blocks_fetch_time":"89.976µs","blocks_processed":1,"caller":"spanlogger.go:109","chunks_filtered":4,"chunks_remaining":1,"chunks_requested":5,"component":"bloom-gateway","duration":"81.048271ms","filter_ratio":0.8,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"3.778µs","processing_time":"62.083538ms","queue_time":"18.870979ms","series_filtered":1,"series_requested":2,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.485729344Z","user":"fake"}
loki-dev-loki-distributed-bloom-gateway-2 {"blocks_fetch_time":"89.976µs","blocks_processed":0,"caller":"spanlogger.go:109","chunks_filtered":0,"chunks_remaining":1,"chunks_requested":1,"component":"bloom-gateway","duration":"73.574885ms","filter_ratio":0,"filters":1,"level":"info","metas_fetch_time":"0s","msg":"stats-report","org_id":"fake","post_processing_time":"1.084µs","processing_time":"62.083538ms","queue_time":"11.400287ms","series_filtered":0,"series_requested":1,"status":"success","tasks":1,"ts":"2024-08-08T08:29:41.485728002Z","user":"fake"}

so here filter_ratio varies from 0 to 1
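
For reference, here is a small Python sketch (not part of Loki, just a way to aggregate the stats-report lines above) that sums the per-task numbers. For the lines shown it gives 24 of 62 chunks filtered, i.e. an overall ratio of about 0.39, clearly non-zero:

import json
import sys

# Aggregate bloom-gateway "stats-report" log lines (pod-name prefix
# followed by a JSON object, as in the output above) read from stdin.
requested = filtered = 0
for line in sys.stdin:
    start = line.find("{")
    if start < 0:
        continue
    try:
        rec = json.loads(line[start:])
    except ValueError:
        continue
    if rec.get("component") == "bloom-gateway" and rec.get("msg") == "stats-report":
        requested += rec.get("chunks_requested", 0)
        filtered += rec.get("chunks_filtered", 0)

if requested:
    print(f"chunks_requested={requested} chunks_filtered={filtered} "
          f"overall_ratio={filtered / requested:.2f}")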

and in the index-gateway logs I can see:

loki-dev-loki-distributed-index-gateway-1 {"caller":"gateway.go:260","component":"index-gateway","filtered":75,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.486222275Z","unfiltered":79}
loki-dev-loki-distributed-index-gateway-1 {"caller":"gateway.go:260","component":"index-gateway","filtered":34,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.486236042Z","unfiltered":34}
loki-dev-loki-distributed-index-gateway-1 {"caller":"gateway.go:260","component":"index-gateway","filtered":106,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.486252529Z","unfiltered":115}
loki-dev-loki-distributed-index-gateway-0 {"caller":"gateway.go:260","component":"index-gateway","filtered":91,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.466689071Z","unfiltered":97}
loki-dev-loki-distributed-index-gateway-0 {"caller":"gateway.go:260","component":"index-gateway","filtered":6,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.476306481Z","unfiltered":18}
loki-dev-loki-distributed-index-gateway-0 {"caller":"gateway.go:260","component":"index-gateway","filtered":92,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.486175664Z","unfiltered":98}
loki-dev-loki-distributed-index-gateway-0 {"caller":"gateway.go:260","component":"index-gateway","filtered":75,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.486185999Z","unfiltered":79}
loki-dev-loki-distributed-index-gateway-0 {"caller":"gateway.go:260","component":"index-gateway","filtered":115,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.486194439Z","unfiltered":125}
loki-dev-loki-distributed-index-gateway-0 {"caller":"gateway.go:260","component":"index-gateway","filtered":106,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.486242105Z","unfiltered":115}
loki-dev-loki-distributed-index-gateway-2 {"caller":"gateway.go:260","component":"index-gateway","filtered":12,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.489059901Z","unfiltered":12}
loki-dev-loki-distributed-index-gateway-2 {"caller":"gateway.go:260","component":"index-gateway","filtered":52,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.516302077Z","unfiltered":52}
loki-dev-loki-distributed-index-gateway-2 {"caller":"gateway.go:260","component":"index-gateway","filtered":52,"level":"info","msg":"return filtered chunk refs","ts":"2024-08-08T08:29:41.517974777Z","unfiltered":52}
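
Aggregating the index-gateway lines the same way (again just an illustrative snippet; I am assuming "unfiltered"/"filtered" are the chunk-ref counts before and after bloom filtering) shows that chunk refs are being dropped at this stage too: for the lines above, 876 before vs. 816 after.

import json
import sys

# Sum the "return filtered chunk refs" lines from the index-gateway.
before = after = 0
for line in sys.stdin:
    start = line.find("{")
    if start < 0:
        continue
    rec = json.loads(line[start:])
    if rec.get("msg") == "return filtered chunk refs":
        before += rec.get("unfiltered", 0)
        after += rec.get("filtered", 0)

print(f"chunk refs: before={before} after={after} removed={before - after}")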

In Grafana for this request I can see the filtering ratio (screenshot attached).

BUT in query-frontend stats I receive:

{
    "cache_chunk_bytes_fetched": 565966149,
    "cache_chunk_bytes_stored": 2374431723,
    "cache_chunk_download_time": "3.054977896s",
    "cache_chunk_hit": 324,
    "cache_chunk_req": 1650,
    "cache_index_download_time": "0s",
    "cache_index_hit": 0,
    "cache_index_req": 0,
    "cache_result_download_time": "3.811181ms",
    "cache_result_hit": 0,
    "cache_result_query_length_served": "0s",
    "cache_result_req": 3,
    "cache_stats_results_download_time": "938.715µs",
    "cache_stats_results_hit": 1,
    "cache_stats_results_req": 1,
    "cache_volume_results_download_time": "0s",
    "cache_volume_results_hit": 0,
    "cache_volume_results_req": 0,
    "caller": "metrics.go:217",
    "chunk_refs_fetch_time": "5.617988151s",
    "component": "frontend",
    "congestion_control_latency": "0s",
    "disable_pipeline_wrappers": "false",
    "duration": "1m32.10305443s",
    "end_delta": "28h1m13.381650377s",
    **"index_bloom_filter_ratio": "0.00",
    "index_post_bloom_filter_chunks": 0,**
    "index_shard_resolver_duration": "0s",
    "index_total_chunks": 0,
    "ingester_chunk_compressed_bytes": "0B",
    "ingester_chunk_decompressed_bytes": "0B",
    "ingester_chunk_downloaded": 0,
    "ingester_chunk_head_bytes": "0B",
    "ingester_chunk_matches": 0,
    "ingester_chunk_refs": 0,
    "ingester_post_filter_lines": 0,
    "ingester_requests": 0,
    "latency": "slow",
    "length": "15m0s",
    "level": "info",
    "limit": 1000,
    "lines_per_second": 69282,
    "org_id": "fake",
    "pipeline_wrapper_filtered_lines": 0,
    "post_filter_lines": 27328,
    "query": "{job=\"gcplog\"} |= ip(\"10.16.0.0/16\")",
    "query_hash": 3586219546,
    "query_referenced_structured_metadata": false,
    "query_type": "filter",
    "queue_time": "150.620996ms",
    "range_type": "range",
    "returned_lines": 0,
    "shards": 32,
    "splits": 1,
    "start_delta": "28h16m13.38165023s",
    "status": "200",
    "step": "200ms",
    "store_chunks_download_time": "1m4.154568429s",
    "throughput": "139MB",
    "total_bytes": "13GB",
    "total_bytes_structured_metadata": "75MB",
    "total_entries": 1000,
    "total_lines": 6381172,
    "ts": "2024-08-08T08:31:13.381664584Z"
}

and here I always see

"index_bloom_filter_ratio": "0.00",
"index_post_bloom_filter_chunks": 0,

I guess this is a query-frontend bug, because judging by the index-gateway and bloom-gateway logs the bloom-filter configuration itself is working fine.
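
To illustrate the suspected failure mode (this is not Loki's actual code, just a sketch, and I am assuming index_bloom_filter_ratio is meant to be the fraction of chunks removed by bloom filtering): index_total_chunks is also 0 in the stats above, so if the gateway-side bloom stats are never merged into the per-query stats context, the frontend can only ever report zeros, no matter how much the gateways actually filtered.

def bloom_filter_ratio(total_chunks: int, post_filter_chunks: int) -> float:
    """Hypothetical derivation of index_bloom_filter_ratio: the fraction
    of chunk refs removed by bloom filtering (assumption, not Loki code)."""
    if total_chunks == 0:
        return 0.0
    return (total_chunks - post_filter_chunks) / total_chunks

# With the gateway-side numbers from the logs above propagated into the
# query stats, the frontend would report something non-zero:
print(f"{bloom_filter_ratio(876, 816):.2f}")   # 0.07

# But the frontend stats show index_total_chunks == 0 and
# index_post_bloom_filter_chunks == 0, which yields exactly the
# observed "0.00":
print(f"{bloom_filter_ratio(0, 0):.2f}")       # 0.00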

To Reproduce Steps to reproduce the behavior:

  1. Deploy Loki (3.1.0) in distributed mode
  2. Enable bloom filters
  3. Query: {your_label="blabla"} |= "any_long_string_with_a_lot_of_characters"
  4. Check query-frontend stats for this query in its logs

Expected behavior Query-frontend stats should show the actual bloom filter ratio and the number of chunks left to process after filtering.

Environment:

Screenshots, Promtail config, or terminal output: attached in the problem description above.

evenhh commented 3 weeks ago

The same thing happened to me