grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Internal `__stream_shard__` label is visible to user #13095

Open z0rc opened 1 month ago

z0rc commented 1 month ago

Describe the bug

Automatic stream sharding (https://grafana.com/docs/loki/latest/operations/automatic-stream-sharding/) is enabled by default. It creates a __stream_shard__ label that is visible to users when searching.

To Reproduce

Steps to reproduce the behavior:

  1. Run Loki 3.0.0 with Grafana v10.4.1.
  2. Open Grafana Explore and start building a query.
  3. Observe that __stream_shard__ is offered in label suggestions. It is also present in the search result fields.

Expected behavior

__stream_shard__ shouldn't be visible to users; it provides zero value when searching.

Environment:

Screenshots, Promtail config, or terminal output

[Screenshots: "Screenshot 2024-05-31 at 16 50 37", "Screenshot 2024-06-01 at 16 52 26"]

DylanGuedes commented 1 month ago

Thank you for your report.

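We had this question before; the tl;dr is that we weren't strongly opinionated about keeping the label, but having it can be useful. Examples:

- If you would like to know how big a given stream is
- If you would like to reduce your search space to have a faster query

Hiding it makes the feature more obscure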

z0rc commented 1 month ago

> If you would like to know how big a given stream is

Stream shard count (number?) is a kinda obscure metric for this. Can it be represented as a proper Prometheus metric?

> If you would like to reduce your search space to have a faster query

For me it reads not as "faster", but "incomplete" at best, "incorrect" at worst. Aren't there better options to make search faster, like bloom filters?

> Hiding it makes the feature more obscure

But keeping it in place makes this visually inconsistent and overloaded with information that isn't needed.

Promtail documentation has this:

> Labels starting with `__` (two underscores) are internal labels. They usually come from dynamic sources like service discovery. Once relabeling is done, they are removed from the label set. To persist internal labels so they’re sent to Grafana Loki, rename them so they don’t start with `__`.

Which I think is applicable to __stream_shard__ too.

I understand your position from a development perspective, but as an end user I still consider this label noise, which I want to reduce.

DylanGuedes commented 1 month ago

>> If you would like to know how big a given stream is
>
> Stream shard count (number?) is a kinda obscure metric for this. Can it be represented as a proper Prometheus metric?

Hardly, because it would have infinite cardinality.

>> If you would like to reduce your search space to have a faster query
>
> For me it reads not as "faster", but "incomplete" at best, "incorrect" at worst. Aren't there better options to make search faster, like bloom filters?

Yes. I wasn't saying it was an ideal solution for that use case; but yeah, I've used it myself a few times. "Faster" wasn't the best word, but one use case I faced was quickly working around a timeout. I tried different time ranges, gave up and just forced a specific stream shard, which did the work.
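
The workaround described here, pinning a query to one stream shard, can be sketched as follows. This is only an illustration under stated assumptions: the `app="checkout"` selector, the shard value `"0"`, and the base URL `http://localhost:3100` are hypothetical, and the helper names are not part of Loki. The `/loki/api/v1/query_range` path and its `query`, `start`, `end`, and `limit` parameters come from Loki's HTTP API. The code only builds the request URL; it does not send it.

```python
from urllib.parse import urlencode


def escape_label_value(value: str) -> str:
    """Escape backslashes and double quotes for a LogQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_sharded_selector(labels: dict[str, str], shard: str) -> str:
    """Return a LogQL stream selector restricted to one __stream_shard__ value."""
    # The shard matcher is appended after the caller's own label matchers.
    matchers = {**labels, "__stream_shard__": shard}
    parts = [f'{name}="{escape_label_value(value)}"' for name, value in matchers.items()]
    return "{" + ", ".join(parts) + "}"


def build_query_range_url(base_url: str, selector: str, start_ns: int, end_ns: int, limit: int = 100) -> str:
    """Build (but do not send) a query_range request URL for the selector."""
    params = urlencode({"query": selector, "start": start_ns, "end": end_ns, "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"


# Hypothetical stream labels, shard value, and time range, for illustration only.
selector = build_sharded_selector({"app": "checkout"}, "0")
print(selector)  # {app="checkout", __stream_shard__="0"}
url = build_query_range_url("http://localhost:3100", selector, 1717000000000000000, 1717003600000000000)
print(url)
```

Because the selector matches a single shard, the result covers only part of the stream's data, which is the "incomplete" concern raised in the quoted reply.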

>> Hiding it makes the feature more obscure
>
> But keeping it in place makes this visually inconsistent and overloaded with information that isn't needed.
>
> Promtail documentation has this:
>
>> Labels starting with `__` (two underscores) are internal labels. They usually come from dynamic sources like service discovery. Once relabeling is done, they are removed from the label set. To persist internal labels so they’re sent to Grafana Loki, rename them so they don’t start with `__`.
>
> Which I think is applicable to __stream_shard__ too.
>
> I understand your position from a development perspective, but as an end user I still consider this label noise, which I want to reduce.

That is super fair, thanks for bringing it up. As I said in my previous message, we weren't strongly opinionated about keeping it in the past, but we just left it as we didn't find strong opposition either. Now that 3.0 is released with this enabled by default, we might see more and more users against it though. I'll add it to my backlog to revive the discussion and maybe hide it in the near future.
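
Until a change like that lands, a client that lists labels could hide internal ones itself. A minimal sketch, assuming the client treats every label name starting with two underscores as internal (the Promtail convention quoted above) and receives a response shaped like the `data` array returned by Loki's `/loki/api/v1/labels` endpoint; the function name and sample payload are hypothetical:

```python
import json

# Assumption: any label name with this prefix is treated as internal.
INTERNAL_PREFIX = "__"


def visible_labels(labels_response: str) -> list[str]:
    """Return label names from a labels response, dropping internal ones."""
    payload = json.loads(labels_response)
    return [name for name in payload.get("data", []) if not name.startswith(INTERNAL_PREFIX)]


# Hypothetical response body, for illustration only.
sample = '{"status": "success", "data": ["__stream_shard__", "app", "namespace"]}'
print(visible_labels(sample))  # ['app', 'namespace']
```

This only affects what the client displays; the label stays on the stored streams and can still be used in a query by anyone who types it.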