grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki

No blooms created #12751

Open rlex opened 2 months ago

rlex commented 2 months ago

Describe the bug: Cannot get blooms to work. The components (bloom-gateway, bloom-compactor) start, but no blooms get created.

To Reproduce: I'm trying to get the bloom gateway / bloom compactor to work. I set up Loki in distributed mode from the new Helm chart (version 6.3.4, migrated from loki-distributed), enabled the bloom compactor and bloom gateway pods, and added configuration to the Loki config according to the docs:

            bloom_compactor:
              enabled: true
            bloom_gateway:
              enabled: true
              client:
                addresses: dnssrvnoa+_grpc._tcp.{{`{{ template "loki.fullname" $ }}`}}-bloom-gateway-headless.{{`{{ $.Release.Namespace }}`}}.svc.cluster.local
            limits_config:
              bloom_gateway_enable_filtering: true
              bloom_compactor_enable_compaction: true 

Btw, there seems to be a small error in the docs: it states you need to point the gateway client to addresses: dnssrvnoa+_bloom-gateway-grpc._tcp.bloom-gateway-headless..svc.cluster.local, but the Helm chart names the port just "grpc", so the proper line should be addresses: dnssrvnoa+_grpc._tcp.bloom-gateway-headless..svc.cluster.local?

Anyway, after all this setup it seems no blooms are generated, as the log states:

level=debug ts=2024-04-23T09:17:55.470493459Z caller=blockscache.go:439 component=bloomstore msg="evict expired entries" curr_size_bytes=0 soft_limit_bytes=1073741824 hard_limit_bytes=2147483648

Any pointers on what I did wrong?

Just in case, here is the schema:

            schema_config:
              configs:
                - from: "2023-06-07"
                  store: tsdb
                  object_store: s3
                  schema: v12
                  index:
                    prefix: tsdb_index_
                    period: 24h
                - from: "2024-04-22"
                  store: tsdb
                  object_store: s3
                  schema: v13
                  index:
                    prefix: index_
                    period: 24h

and some additional bloom tuning (just for testing, I reduced the max sizes):

            storage_config:
              bloom_shipper:
                blocks_cache:
                  soft_limit: 1GiB
                  hard_limit: 2GiB

Expected behavior: I should see bloom usage in the bloomstore log lines.

Environment:

Screenshots, Promtail config, or terminal output: I also tried enabling tracing, but I don't see any traces from the bloom components.

mzupan commented 2 months ago

I found you need to add the config options here for it to work in SSD/distributed mode, if you don't touch loki.config at all:

    structuredConfig:
      bloom_compactor:
        enabled: true

      bloom_gateway:
        enabled: true
        client:
          addresses: dns+loki-backend-headless.loki.svc.cluster.local:9095

rlex commented 2 months ago

Yes, I added that block under loki.structuredConfig, so it's present in the generated ConfigMap.

rlex commented 2 months ago

The only activity related to blooms I see in the metrics is

loki_bloom_gateway_inflight_tasks_count

which is constantly rising.

anosulchik commented 2 months ago

I have exactly the same problem with Loki configured in simple scalable (SSD) mode, where the bloom-compactor and bloom-gateway run as part of the backend component. There are no signs that the compactor actually creates blooms - the /var/loki/data/blooms directory in the pod is always empty. We have a steady stream of events into Loki and TSDB chunks are created just fine, but no bloom chunks. How can I debug this?

rlex commented 2 months ago

@anosulchik what is your storage schema version btw? v12? v13?

anosulchik commented 2 months ago

@rlex I had to bump it up to v13 since I'm upgrading to loki 3 and it doesn't run with v12.

anosulchik commented 2 months ago

I'm starting to think that the bloom compactor's retention is disabled by default, so maybe it builds blooms in memory but doesn't persist them, or something like that. On the other hand, there are no signs that any blooms were ever created in my setup according to the /metrics values.
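
For reference, turning retention on would look roughly like this - purely a sketch, and the key names are my assumption based on the -bloom-compactor.retention.* flag naming, so verify against the configuration reference. As far as I understand, retention only controls expiring old bloom blocks, not whether new ones get persisted:

bloom_compactor:
  # Assumed key names mirroring the -bloom-compactor.retention.* flags; check the config docs.
  retention:
    enabled: true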

mxssl commented 2 months ago

The same problem on chart 6.3.4, distributed mode

chaudum commented 2 months ago

[!NOTE] Bloom filters are an experimental feature and are subject to breaking changes.

Hi @rlex

Thanks for your efforts and trying out bloom filters! To start with, I would focus on getting bloom filters built by the bloom compactor.

Bloom compactors generate the bloom blocks. These are containers for the bloom filters of multiple streams, for a single day and a single tenant. Bloom generation runs at an interval of -bloom-compactor.compaction-interval (default 6h). Compactors start with the oldest data at -bloom-compactor.max-table-offset and generate blocks up to -bloom-compactor.min-table-offset.
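
For reference, the equivalent YAML would look roughly like this - a sketch only, assuming the keys under bloom_compactor mirror those CLI flags; see the configuration reference for the exact names and defaults:

bloom_compactor:
  enabled: true
  # -bloom-compactor.compaction-interval: how often a compaction run is started
  compaction_interval: 6h
  # -bloom-compactor.max-table-offset / -bloom-compactor.min-table-offset:
  # the range of daily index tables (offsets from the newest table) to build blooms for
  max_table_offset: 2
  min_table_offset: 1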

When the compactor is running and you don't see any error logs from the process, you should start seeing bloom blocks and bloom metas in your object storage (S3):

bloom/{table}/{tenant}/metas/{minFp}-{maxFp}-{checksum}.json
bloom/{table}/{tenant}/blocks/{minFp}-{maxFp}/{from}-{through}-{checksum}.tar.gz

If that is not the case, can you provide a comprehensive set of logs from the bloom compactor at debug level?

wiardvanrij commented 2 months ago

Hey @chaudum, since I've been in a similar situation, let me share my perspective.

Firstly, I understand your point about it being an experimental feature. We're simply eager to explore and make the most of it, much like Grafana seems to be keen on promoting the feature. All in the same boat! :)

I think the primary issue here is observability. While there are plenty of metrics available, none of them seem to indicate any active progress. I've checked the /bloom directory and monitored pod activity (CPU and memory usage), and it's doing "something" - but none of the metrics or queries reflect that. This lack of visibility is a real hurdle; it's difficult to assess how well it's performing and whether it's keeping up with demand. It looks like it's not doing anything, which might be false (at least in my case ;)).

I've made calculations based on 4 MB/s per CPU core, and I've intentionally sized my compaction process well beyond that, to handle ten times my usual ingest rate. However, despite this, none of my queries appear to be using the bloom logic (even though they're suitable for it), and I'm struggling to understand why.

So, maybe out of scope here, but purely offering some suggestions/feedback:

p.s. small nit:

-bloom-compactor.compaction-interval (default 6h).

Isn't this 10m?

wiardvanrij commented 2 months ago

One 'fix' (or better yet: read the config and set it right) was to change this setting: https://grafana.com/docs/loki/latest/configure/#storage_config

bloom_shipper:
  # Working directory to store downloaded bloom blocks. Supports multiple
  # directories, separated by comma.
  # CLI flag: -bloom.shipper.working-directory
  [working_directory: <string> | default = "/data/blooms"]

The Helm chart defaults to /var/loki/blooms, so changing the working directory there fixed some of my issues.
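
In other words, something along these lines - just a sketch, with the path taken from the Helm chart default mentioned above:

storage_config:
  bloom_shipper:
    # Point the shipper at the directory the Helm chart actually provisions
    # (/var/loki/blooms) instead of the documented /data/blooms default.
    working_directory: /var/loki/blooms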

chaudum commented 2 months ago

@wiardvanrij That is very valuable feedback; observability of the bloom generation process is indeed not great, and we aim to improve that. I can also already say that the model of generating blocks will change in the future, since we want to get rid of the ring and replace it with a job-based model that assigns tasks to "worker compactors".

At the moment, we use the avg/quantiles of loki_bloomcompactor_progress:

## avg
sum(loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"})
/
sum(count(loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"}))

## p90
quantile(
    0.9, 
    sum by (pod) (
        loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"}
    )
)

## p10
quantile(
    0.1, 
    sum by (pod) (
        loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"}
    )
)

We also use a query that compares the rate of data ingested (stored in object storage) with the rate of data processed by the bloom compactor and the CPU cores used, to determine the CPU cores required to keep up with ingest.

# Conceptually, the formula is:
# (bytes_ingested * space_amplification / bloom_bytes_processed_per_core)

sum(rate(loki_distributor_bytes_received_total{cluster="ops-us-east-0", namespace="loki-ops"}[$__rate_interval]))
*
(
  3 * # replication factor
  sum(
    1 - 
    sum(rate(loki_chunk_store_deduped_chunks_total{cluster="ops-us-east-0", namespace="loki-ops"}[$__rate_interval]))
    /
    sum(rate(loki_ingester_chunks_flushed_total{cluster="ops-us-east-0", namespace="loki-ops"}[$__rate_interval]))
  )
)
/
(
    sum(rate(loki_bloomcompactor_chunk_series_size_sum{cluster="ops-us-east-0", namespace="loki-ops", container="bloom-compactor"}[$__rate_interval]))
    /
    sum(rate(container_cpu_usage_seconds_total{cluster="ops-us-east-0", namespace="loki-ops", container="bloom-compactor"}[$__rate_interval]))
)

salvacorts commented 2 months ago

@wiardvanrij we just merged https://github.com/grafana/loki/pull/12855, which adds some dashboards for both the bloom compactor and the bloom gateway.

As @chaudum mentioned, we will make significant changes to how compactors operate in the upcoming months and will likely change these dashboards, but hopefully you and other folks running the feature will find them useful.

mac133k commented 1 month ago

Can someone please clarify whether bloom filters are supposed to work with storage schema versions lower than v13?

mzupan commented 1 month ago

Can someone please clarify whether bloom filters are supposed to work with storage schema versions lower than v13?

No, you need v13.

algo7 commented 3 weeks ago

Quoting @chaudum above:

Bloom generation runs at an interval of -bloom-compactor.compaction-interval (default 6h).

Seems like the default Bloom compaction interval is 10m instead of 6h: https://grafana.com/docs/loki/latest/configure/#bloom_compactor

rdharani19 commented 3 weeks ago

We are also experiencing the same issue. We switched from the loki-distributed chart to the new official loki chart with the deployment mode set to distributed and enabled bloom filters. However, we are not seeing any blooms being created. We don't see any errors either (v13 schema, Loki 3).

@wiardvanrij what tweaks did you make to the path to make it work?

lystor commented 1 week ago

Same problem on grafana/loki:3.0.0. Blooms don't work in SSD mode.