grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Wildcard for X-Scope-OrgID when querying #7356

Open noamApps opened 2 years ago

noamApps commented 2 years ago

Is your feature request related to a problem? Please describe.

Currently, multi-tenancy in Loki forces users to keep the X-Scope-OrgID up to date manually. For example, when using Grafana, each time a new tenant is introduced, users must either:

In my specific use case, I used multi-tenancy in the first place because I couldn't inject a label on the fly through the Push API URI. VictoriaMetrics, for example, allows this behavior with the extra_labels URI parameter available in the metrics push endpoint. (Regardless of this feature request, I think the ability to inject labels on the receiving side through URI parameters is an important feature, as some pieces of information just can't be delegated to client-side agents for various reasons, mainly lack of context and security.)

I think that in use cases such as edge computing, dynamic environments, etc., it is not feasible to keep track of the X-Scope-OrgID on the querying side.

Describe the solution you'd like

When configuring Loki as a datasource or passing the X-Scope-OrgID header, I would like to be able to set it to *, which would allow any possible tenant to be used in the __tenant_id__ parameter when querying Loki.
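For context, here is a minimal sketch of what the querying side has to assemble today, assuming Loki's multi-tenant query mode is enabled so that several tenant IDs can be joined with `|` in one X-Scope-OrgID header. The base URL, tenant names, and the `build_query_request` helper are hypothetical; the function only builds the URL and headers and sends nothing.

```python
from urllib.parse import urlencode


def build_query_request(base_url, tenants, logql, limit=100):
    """Return (url, headers) for a Loki range query that spans several tenants."""
    if not tenants:
        raise ValueError("at least one tenant ID is required")
    params = urlencode({"query": logql, "limit": limit})
    url = f"{base_url}/loki/api/v1/query_range?{params}"
    # Every tenant must be listed explicitly; this is the list a "*" would replace.
    headers = {"X-Scope-OrgID": "|".join(tenants)}
    return url, headers


# Hypothetical endpoint and tenant names, for illustration only.
url, headers = build_query_request(
    "http://loki.example:3100",
    ["tenant-a", "tenant-b"],
    '{__tenant_id__="tenant-a", app="api"}',
)
print(headers["X-Scope-OrgID"])
```

The explicit `tenants` list is the part that has to be kept in sync by hand whenever a tenant appears.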

Describe alternatives you've considered

The only alternative I came up with is setting up a webhook that Grafana reports to whenever a new tenant is recognized (using the other datasources I have). But this is a complicated workaround and is Grafana-specific.

reda-mattar commented 1 year ago

Is there any news on this? It is also a major requirement on my side, with a tenant list growing each week.

noamApps commented 1 year ago

Hey @reda-mattar and everyone else interested in the solution, we ended up building our own standalone middleware that converts tenants to labels. It's open source as well: https://github.com/groundcover-com/loki-proxy

We use it and it solved the issue for us. Check it out; we'd be happy to hear whether it is useful for others as well.

johnbuluba commented 1 year ago

Hello,

We would be very interested in this feature. Are there any plans for supporting wildcards in the X-Scope-OrgID header?

winem commented 1 year ago

Hi,

this would be a priceless feature for our logging and would make us far more flexible. Both would be useful: X-Scope-OrgID set to *, as well as support for trailing wildcards such as somePrefix*.

I'm just wondering if there are any security concerns or if there is any historical reason why wildcards aren't supported yet. If not, it should be ready to be picked up by any volunteer.

honganan commented 1 year ago

Maybe we could list the object storage directory to fetch all possible tenants and then filter them by the X-Scope-OrgID wildcard. In that case, the frontend could use the wildcard string directly to identify a queue. I am not sure whether there are any side effects. Discussion is welcome.
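A rough sketch of that idea, under the loud assumption that every top-level prefix in the bucket is a tenant ID (which may not hold for a real Loki bucket). The object keys are passed in as a plain list rather than fetched from storage, and `tenants_from_keys` and `expand_wildcard` are hypothetical names, not Loki functions.

```python
from fnmatch import fnmatchcase


def tenants_from_keys(object_keys):
    """Collect distinct top-level prefixes, treating each one as a tenant ID."""
    tenants = set()
    for key in object_keys:
        head, sep, _ = key.partition("/")
        if sep and head:  # skip keys that sit at the bucket root
            tenants.add(head)
    return sorted(tenants)


def expand_wildcard(pattern, known_tenants):
    """Expand a shell-style pattern such as '*' or 'somePrefix*'.

    fnmatchcase also interprets '?' and '[...]', so a real implementation
    might want a stricter matcher.
    """
    return [t for t in known_tenants if fnmatchcase(t, pattern)]


# Hypothetical object keys standing in for a bucket listing.
keys = ["team-a/chunk1", "team-a/chunk2", "team-b/chunk9", "ops/chunk3", "README"]
known = tenants_from_keys(keys)
print(expand_wildcard("team-*", known))  # ['team-a', 'team-b']
print(expand_wildcard("*", known))       # ['ops', 'team-a', 'team-b']
```

The expensive part is producing `keys`: without a tenant index, the listing has to walk the bucket.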

jeschkies commented 1 year ago

Well, Grafana Enterprise Logs does support * as the tenant and infers the tenants from the authentication. There's probably little chance that we are going to open source it.

> set it to *, which will allow to put any tenant possible in __tenant_id__ parameter when querying Loki

The issue with that is the "any tenant possible". Do you mean any tenant in Loki? I think this heavily depends on your authentication layer.

thmshmm commented 1 year ago

One use case for this could be to have tenants for the purpose of applying specific limits while still allowing one data source to access all tenants without having to maintain a long list of all known tenants. In that case, '*' would mean any tenant that Loki is aware of.

jeschkies commented 1 year ago

> In that case it '*' would mean any tenant that Loki is aware of.

The issue is that Loki has no database on the tenants. The tenant ID can be anything any time.

winem commented 1 year ago

> > In that case it '*' would mean any tenant that Loki is aware of.
>
> The issue is that Loki has no database on the tenants. The tenant ID can be anything any time.

Thanks for that reply! That's super helpful to understand the actual challenge.

On another topic, we thought about running a proxy as a Loki sidecar that handles authentication for external requests. The proxy would basically take the credentials and set the X-Scope-OrgID header based on them. So having a special set of credentials that is allowed to see all logs of all tenants could be an alternative solution, but it is limited by the maximum header size if we're talking about hundreds or thousands of tenants in large environments. I'll see if I can come up with a generic approach that could help the community as well.
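The core of such a proxy could look roughly like this. It is only a sketch: the token table, the 8192-byte limit, and the `org_id_header_for` helper are all assumptions made for illustration, and a real proxy would verify credentials properly instead of looking them up in a dict.

```python
# Assumed header size limit; the real value depends on the proxies and servers in the path.
MAX_HEADER_BYTES = 8192

# Hypothetical mapping from credentials to the tenants they may read.
ACCESS = {
    "dev-token": ["dev"],
    "admin-token": ["dev", "ops", "audit"],
}


def org_id_header_for(token, access=ACCESS, max_bytes=MAX_HEADER_BYTES):
    """Translate credentials into the X-Scope-OrgID header for the upstream request."""
    tenants = access.get(token)
    if not tenants:
        raise PermissionError("unknown credentials")
    value = "|".join(tenants)
    if len(value.encode("utf-8")) > max_bytes:
        # This is the limitation that shows up with hundreds or thousands of tenants.
        raise ValueError("tenant list does not fit into one header")
    return {"X-Scope-OrgID": value}


print(org_id_header_for("admin-token"))  # {'X-Scope-OrgID': 'dev|ops|audit'}
```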

I guess the proxy handling the authentication and setting the header is similar to how it works for Grafana Enterprise Logs, @jeschkies ?

thmshmm commented 1 year ago

The proxy could make multiple requests containing groups/single tenants to mitigate the header length restriction and merge the responses. That would have some overhead.
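As a sketch of that batching idea: split the tenant list into groups whose joined header value stays under a byte limit, query each group, and merge the results by timestamp. The limit, the helper names, and the simplified `(timestamp, line)` entries are assumptions; real Loki responses are JSON streams and would need parsing first.

```python
import heapq


def batch_tenants(tenants, max_bytes):
    """Group tenants so that '|'.join(group) never exceeds max_bytes."""
    batches, current, size = [], [], 0
    for tenant in tenants:
        t_len = len(tenant.encode("utf-8"))
        if t_len > max_bytes:
            raise ValueError(f"tenant ID {tenant!r} alone exceeds the limit")
        added = t_len + (1 if current else 0)  # +1 for the '|' separator
        if current and size + added > max_bytes:
            batches.append(current)
            current, size = [tenant], t_len
        else:
            current.append(tenant)
            size += added
    if current:
        batches.append(current)
    return batches


def merge_entries(per_batch_entries):
    """Merge per-batch lists of (timestamp, line), each already sorted ascending."""
    return list(heapq.merge(*per_batch_entries))


print(batch_tenants(["aa", "bb", "cc"], max_bytes=5))  # [['aa', 'bb'], ['cc']]
print(merge_entries([[(1, "a"), (3, "c")], [(2, "b")]]))
```

The overhead mentioned above is visible here: one upstream request per batch, plus the merge step in the proxy.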

jeschkies commented 1 year ago

> I guess the proxy handling the authentication and setting the header is similar to how it works for Grafana Enterprise Logs, @jeschkies ?

Grafana Enterprise Logs is an extended build of Loki that introduces an HTTP layer that does exactly what your proxy does. However, by embedding everything we don't run into the header length issue.

Your proxy solution seems right. Could you merge some tenants into one?

winem commented 1 year ago

Hi @jeschkies, merging them is (as far as I know now) not an option because we really just use tenants when we want to ensure that the logs are visible to a specific audience (i.e. developers, administrators, ...) only and most of them should be able to edit and create dashboards. So it's mandatory to separate the logs in the datasource already.

But I guess this is too far off-topic for this thread. Please feel free to reach out to me (Marcel Weinberg) in Slack. I'd appreciate it!

diranged commented 3 months ago

So just chiming in with some support here: in our environment we want to separate out the logs so that each cluster is its own tenant, and we have dozens of clusters. For our developers' sake, though, we do not want them to have to know the tenant ahead of time, nor of course do we want to manage hundreds of datasources.

It seems quite sane to offer a * option for this header that allows Loki to just search all the tenants. Is there a technical reason why this might be a challenge, or is it purely philosophical?

TheMatrix97 commented 3 months ago

In my honest opinion this should be implemented. There are some use cases where you might want to give a particular administrator full access through a datasource configured with a simple *. I'm not sure about the technical implications of this change, though.

lczupryn-tibco commented 1 month ago

+1

jeschkies commented 1 month ago

> Is there a technical reason why this might be a challenge, or is it purely philosophical?

@diranged, yes. As funny as it sounds, Loki has no index of all tenants. IIRC a tenant is just a prefix, so Loki would have to scan a whole bucket.