grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

No blooms created when running Loki with Helm chart #12751

Open rlex opened 7 months ago

rlex commented 7 months ago

Describe the bug: Cannot get blooms to work. The components (bloom-gateway, bloom-compactor) start, but no blooms get created.

To Reproduce: I'm trying to get the bloom gateway / compactor to work. I set up Loki in distributed mode from the new Helm chart (version 6.3.4, migrated from loki-distributed), enabled the bloom and gateway pods, and added configuration to the Loki config according to the docs:

            bloom_compactor:
              enabled: true
            bloom_gateway:
              enabled: true
              client:
                addresses: dnssrvnoa+_grpc._tcp.{{`{{ template "loki.fullname" $ }}`}}-bloom-gateway-headless.{{`{{ $.Release.Namespace }}`}}.svc.cluster.local
            limits_config:
              bloom_gateway_enable_filtering: true
              bloom_compactor_enable_compaction: true 

By the way, there seems to be a small error in the docs: they state that you need to point the gateway client to addresses: dnssrvnoa+_bloom-gateway-grpc._tcp.bloom-gateway-headless..svc.cluster.local, but the Helm chart names the port just "grpc", so the proper line should be addresses: dnssrvnoa+_grpc._tcp.bloom-gateway-headless..svc.cluster.local ?

Anyway, after all that setup it seems no blooms get generated. The log states level=debug ts=2024-04-23T09:17:55.470493459Z caller=blockscache.go:439 component=bloomstore msg="evict expired entries" curr_size_bytes=0 soft_limit_bytes=1073741824 hard_limit_bytes=2147483648

Any pointers on what I did wrong?

Just in case, here is the schema:

            schema_config:
              configs:
                - from: "2023-06-07"
                  store: tsdb
                  object_store: s3
                  schema: v12
                  index:
                    prefix: tsdb_index_
                    period: 24h
                - from: "2024-04-22"
                  store: tsdb
                  object_store: s3
                  schema: v13
                  index:
                    prefix: index_
                    period: 24h

and some additional bloom tuning (just for testing, I reduced the max cache sizes):

            storage_config:
              bloom_shipper:
                blocks_cache:
                  soft_limit: 1GiB
                  hard_limit: 2GiB

Expected behavior: I should see bloom usage in the bloomstore log lines.

Environment:

Screenshots, Promtail config, or terminal output: I also tried enabling tracing, but I don't see any traces from the bloom components.

mzupan commented 7 months ago

I found that you need to add the config options below for it to work in SSD/distributed mode if you don't touch loki.config at all:

    structuredConfig:
      bloom_compactor:
        enabled: true

      bloom_gateway:
        enabled: true
        client:
          addresses: dns+loki-backend-headless.loki.svc.cluster.local:9095

rlex commented 7 months ago

Yes, I added that block in loki.structuredConfig, so it's present in the generated ConfigMap.

rlex commented 7 months ago

The only bloom-related activity I see in the metrics is

loki_bloom_gateway_inflight_tasks_count

constantly rising.

anosulchik commented 7 months ago

I have exactly the same problem with Loki configured in simple scalable (SSD) mode, where the bloom-compactor and bloom-gateway run in the backend component. There are no signs that the compactor actually creates blooms: the /var/loki/data/blooms directory in the pod is always empty. We have a steady stream of events into Loki and TSDB chunks are created just fine, but no bloom chunks. How can I debug it?

rlex commented 7 months ago

@anosulchik what is your storage schema version btw? v12? v13?

anosulchik commented 7 months ago

@rlex I had to bump it up to v13 since I'm upgrading to loki 3 and it doesn't run with v12.

anosulchik commented 7 months ago

I'm starting to think that the bloom compactor's retention is disabled by default, so maybe it forms blooms in memory but doesn't persist them, or something like that. On the other hand, there are no signs in the /metrics values that any blooms were ever created in my setup.

mxssl commented 7 months ago

The same problem on chart 6.3.4, distributed mode

chaudum commented 7 months ago

[!NOTE] Bloom filters are an experimental feature and are subject to breaking changes.

Hi @rlex

Thanks for your efforts and for trying out bloom filters! To start with, I would focus on getting bloom filters built by the bloom compactor.

Bloom compactors generate the bloom blocks. These are containers for the bloom filters of multiple streams for a single day and a single tenant. Bloom generation runs at an interval of -bloom-compactor.compaction-interval (default 6h). The compactors start with the oldest data at -bloom-compactor.max-table-offset and generate blocks up to -bloom-compactor.min-table-offset.
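
For reference, a minimal sketch of what those settings might look like in the Loki YAML config, assuming the keys mirror the CLI flags above (the values are illustrative, not necessarily the defaults):

bloom_compactor:
  enabled: true
  # how often bloom generation is planned and run (-bloom-compactor.compaction-interval)
  compaction_interval: 6h
  # newest day table to build blooms for (-bloom-compactor.min-table-offset)
  min_table_offset: 1
  # oldest day table to build blooms for (-bloom-compactor.max-table-offset)
  max_table_offset: 2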

When the compactor is running and you don't see any error logs from the process, you should start seeing bloom blocks and bloom metas in your object storage (S3):

bloom/{table}/{tenant}/metas/{minFp}-{maxFp}-{checksum}.json
bloom/{table}/{tenant}/blocks/{minFp}-{maxFp}/{from}-{through}-{checksum}.tar.gz

If that is not the case, can you provide a comprehensive set of debug-level logs from the bloom compactor?
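
One way to get debug-level logs in the Helm deployment (a sketch, assuming overrides go through loki.structuredConfig as elsewhere in this thread) is to raise the server log level, which corresponds to the -log.level flag:

loki:
  structuredConfig:
    server:
      # emit debug logs from all Loki components, including the bloom compactor
      log_level: debug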

wiardvanrij commented 7 months ago

Hey @chaudum, since I've been in a similar situation, let me share my perspective.

Firstly, I understand your point about it being an experimental feature. We're simply eager to explore and make the most of it, much like Grafana seems to be keen on promoting the feature. All in the same boat! :)

I think the primary issue here is observability. While there are plenty of metrics available, none of them seem to indicate any active progress. I've checked the /bloom directory and monitored pod activity (CPU and memory usage), and it's doing "something", but none of the metrics or queries indicate that. This lack of visibility is a real hurdle; it's difficult to assess how well it's performing in terms of keeping up with demand. It looks like it's not doing anything, which might be 'false' (at least in my case ;) )

I've made calculations based on 4MB/s per CPU core, and I've intentionally pushed my compaction process beyond its usual limits to handle ten times my usual ingest rate. However, despite this, none of my queries appear to be utilizing the bloom logic (even though they're suitable for it), and I'm struggling to understand why.

So, maybe out of scope here, but purely to give some suggestions/feedback:

p.s. small nit:

-bloom-compactor.compaction-interval (default 6h).

Isn't this 10m?

wiardvanrij commented 7 months ago

One 'fix' (or better yet, read the config and set it right) was to change this setting from https://grafana.com/docs/loki/latest/configure/#storage_config:

bloom_shipper:
  # Working directory to store downloaded bloom blocks. Supports multiple
  # directories, separated by comma.
  # CLI flag: -bloom.shipper.working-directory
  [working_directory: <string> | default = "/data/blooms"]

The Helm chart defaults to /var/loki/blooms, so changing the working directory there fixed some of my issues.
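
For example, one way to apply that override through the Helm values (a sketch, assuming loki.structuredConfig is used for overrides and that /var/loki is the chart's writable data mount):

loki:
  structuredConfig:
    storage_config:
      bloom_shipper:
        # point the shipper at a directory under the chart's data volume
        working_directory: /var/loki/blooms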

chaudum commented 7 months ago

@wiardvanrij That is very valuable feedback; observability of the bloom generation process is indeed not great, and we aim to improve that. I can also say already that the model of generating blocks will change in the future, since we want to get rid of the ring and replace it with a job-based model that assigns tasks to "worker compactors".

At the moment, we use the avg/quantiles of loki_bloomcompactor_progress:

## avg
sum(loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"})
/
sum(count(loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"}))

## p90
quantile(
    0.9, 
    sum by (pod) (
        loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"}
    )
)

## p10
quantile(
    0.1, 
    sum by (pod) (
        loki_bloomcompactor_progress{cluster=~"$cluster", job=~"$namespace/bloom-compactor"}
    )
)

We also use a query that compares the rate of data ingested (stored in object storage) with the rate of data processed by the bloom compactor and the CPU cores used, to determine the number of CPU cores required to not fall behind ingest.

# Conceptually, the formula is:
# (bytes_ingested * space_amplification / bloom_bytes_processed_per_core)

sum(rate(loki_distributor_bytes_received_total{cluster="ops-us-east-0", namespace="loki-ops"}[$__rate_interval]))
*
(
  3 * # replication factor
  sum(
    1 - 
    sum(rate(loki_chunk_store_deduped_chunks_total{cluster="ops-us-east-0", namespace="loki-ops"}[$__rate_interval]))
    /
    sum(rate(loki_ingester_chunks_flushed_total{cluster="ops-us-east-0", namespace="loki-ops"}[$__rate_interval]))
  )
)
/
(
    sum(rate(loki_bloomcompactor_chunk_series_size_sum{cluster="ops-us-east-0", namespace="loki-ops", container="bloom-compactor"}[$__rate_interval]))
    /
    sum(rate(container_cpu_usage_seconds_total{cluster="ops-us-east-0", namespace="loki-ops", container="bloom-compactor"}[$__rate_interval]))
)
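
To make the conceptual formula concrete with purely hypothetical numbers: with a replication factor of 3 and half of the flushed chunks deduplicated, the space amplification is 3 × (1 − 0.5) = 1.5; at an ingest rate of 100 MB/s and roughly 4 MB/s of bloom data processed per core, you would need about 100 × 1.5 / 4 ≈ 38 CPU cores of bloom compactor capacity to keep up.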

salvacorts commented 7 months ago

@wiardvanrij we just merged https://github.com/grafana/loki/pull/12855, adding some dashboards for both the bloom compactor and gateways.

As @chaudum mentioned, we will make significant changes to how compactors operate in the upcoming months and will likely change these dashboards, but hopefully you and other folks running the feature find them useful.

mac133k commented 6 months ago

Can someone please clarify whether bloom filters are supposed to work with storage schema versions lower than v13?

mzupan commented 6 months ago

Can someone please clarify whether bloom filters are supposed to work with storage schema versions lower than v13?

No, you need v13.

algo7 commented 5 months ago

Bloom generation runs at an interval of -bloom-compactor.compaction-interval (default 6h).

Seems like the default Bloom compaction interval is 10m instead of 6h: https://grafana.com/docs/loki/latest/configure/#bloom_compactor

rdharani19 commented 5 months ago

We are also experiencing the same issue. We switched from the loki-distributed chart to the new official loki chart with the mode set to distributed and enabled bloom filters. However, we are not seeing any blooms being created. We don't see any errors either (v13 and Loki 3).

@wiardvanrij what tweaks did you make to the path to make it work?

lystor commented 5 months ago

Same problem on grafana/loki:3.0.0. Blooms don't work in SSD mode.

diranged commented 4 months ago

Chiming in here that this thread was helpful. I think I have the bloom-compactor working, but as near as I can tell the bloom-gateways are never used. I never see a single TCP connection go to them, even though I have the bloom_gateway configuration set up properly and I have verified that the SRV records indeed point to the gateways.

QuentinBisson commented 4 months ago

I got them running in SSD mode using the following config in the Loki Helm chart:

loki:
  structuredConfig:
    bloom_compactor:
      enabled: true
      retention:
        enabled: true
    bloom_gateway:
      enabled: true
      client:
        addresses: dns+loki-backend-headless.loki.svc.cluster.local:9095
  limits_config:
    bloom_gateway_enable_filtering: true
    bloom_compactor_enable_compaction: true

chaudum commented 4 months ago

[!NOTE] Bloom filters are a feature that can accelerate certain types of queries for large-scale (>150 TB/month) log volumes. The reason for that threshold is that the feature also comes with a relatively high cost for building and querying the bloom filters.

I think this GitHub issue mixes two separate issues:

  1. The initial problem of this issue: the bloom compactors not building any blocks in microservice (distributed) mode even when supposedly configured correctly. This is more an observability issue of the bloom compactors.
  2. Default Helm chart settings not working for bloom compactors and bloom gateways.

From what I can see, both issues have been solved by one comment or another in this thread.

Regarding support for bloom building and querying in the Helm chart, in both SSD and distributed mode: during development of the bloom components the main focus has been on making them work in microservice mode outside of the Helm bundling (we use jsonnet for that). As much as we would love to see them working out of the box in the Helm chart, it's often a matter of bandwidth and priorities. Thankfully @QuentinBisson stepped up and provided a pull request for the Helm configuration: https://github.com/grafana/loki/pull/13556

A reason why we have not yet provided a Helm chart with bloom filters for SSD mode is that if you are running SSD (or single binary), you likely do not have the problem that bloom filters aim to solve: ingesting hundreds of TB/month and performing mostly needle-in-the-haystack queries across very broad label selectors. Not saying that it's not possible, though.

It's great, however, that the community wants to try out new features out of curiosity!

What I do want, though, is to manage expectations. Since building and querying blooms is both CPU- and memory-intensive, it can and will interfere with other components when run together in the backend target in SSD mode.

I also read from the comments in this thread that we need to improve the documentation of the bloom filter feature. I hope that once development stabilises, it will also become easier to improve the documentation. PRs of any size are always welcome.

chaudum commented 4 months ago

One other thing I explicitly left out of the last comment, because it deserves separate attention:

In the past weeks, we have been working on improving the bloom filter building process. The bloom compactor is being replaced by the bloom planner and bloom builder components. That allowed us to remove the complex and hard-to-operate ring from the compactor and replace it with a task queue and worker setup, where the workers can be scaled independently of the task creation.

The configuration block is named bloom_build:

bloom_build:
  enabled: true|false
  planner:
    ...
  builder:
    ...

Run Loki with the targets -target=bloom-planner and -target=bloom-builder. Unfortunately, there is not much more documentation available yet.