elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Configure `search.allow_expensive_queries` per role #53607

Open matriv opened 4 years ago

matriv commented 4 years ago

It would be a nice feature to be able to configure search.allow_expensive_queries per role. For example, this would let superusers run expensive queries while normal users would still not be permitted to do so.
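For context, the setting is cluster-wide today. A minimal sketch of the request that toggles it through the cluster update settings API is shown here; the helper name `build_expensive_queries_request` and the `localhost:9200` URL are assumptions for illustration, and the request is only built, not sent:

```python
import json
import urllib.request


def build_expensive_queries_request(base_url: str, allowed: bool) -> urllib.request.Request:
    """Build (but do not send) a cluster settings update for the global flag."""
    body = {"persistent": {"search.allow_expensive_queries": allowed}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical local cluster URL. The flag applies to every user and role at once,
# which is the limitation this issue is about.
req = build_expensive_queries_request("http://localhost:9200", allowed=False)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Because the flag is a single cluster setting, there is no place in this request to say "for these roles only".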

elasticmachine commented 4 years ago

Pinging @elastic/es-search (:Search/Search)

elasticmachine commented 4 years ago

Pinging @elastic/es-security (:Security/Security)

bytebilly commented 4 years ago

I talked to @matriv and @giladgal to get more context around this request, reporting here so we can discuss further.

The concern is that Security and Observability use cases have a mandatory requirement to run expensive queries for some of their tasks, forcing the new restriction to stay disabled since the setting is global. The proposal would make it possible to grant "expensive" privileges to those tasks and to limit other access.

I would avoid an ad-hoc solution that does not solve the underlying problem (per-user resource quota limits), unless there are strong implications that make this an exception, for example a strong impact on Solutions use cases.

Moving cluster settings into privileges, or using privileges to limit resource usage, would require a more generic approach since it would apply to many similar cases, with a correspondingly higher technical effort.

Could you help define whether Solutions would be able to use the current feature as a cluster setting or not? This is a key point for me.

@sorantis @MikePaquette you may want to chime in here and add your point of view.

giladgal commented 4 years ago

To pick a concrete example: assuming many security and observability use cases must use options like wildcard queries or script score queries (or other queries mentioned in #51385), would this feature become more useful for security and observability if we enabled allow_expensive_queries per role? Alternatively, would finer-grained control be useful for security and observability (see #52337)? @sorantis @MikePaquette

tvernum commented 4 years ago

My concern is that there seems to be a potential mismatch between the proposed solution (per role) and the purpose it is meant to serve (supporting security & observability).

Most security & observability searches happen in real time inside Kibana. I am working on the assumption that the needed "expensive queries" fall into that category. If that's the case, then the queries run in the context of the user who is logged in to Kibana. So it means that if SIEM needs expensive queries, then every user who uses SIEM needs to be allowed to run expensive queries, all the time. Since the proposal is a per-role flag, it would mean it follows a user around in every context - so if you can run expensive queries in SIEM you can also run expensive queries in Kibana Dev Tools.

It sounds more like we need something that is per-app, not per-user/role. So you can run expensive queries inside SIEM but not inside Dev Tools.

Perhaps we've misunderstood the proposal, and if so please let us know, but it doesn't sound like a per-role flag actually helps here.

Which is not to say that per-user resource limits are a bad thing - it's entirely reasonable to have a "power users" flag that would allow some people to run expensive queries and not others, but I don't think such a flag helps SIEM/Observability very much.

giladgal commented 4 years ago

Having the limitation per user and per app makes sense to me - these are complementary, not contradictory. I would think that in some situations only some of the users will get access to some of the expensive queries, so RBAC can be beneficial.

bytebilly commented 4 years ago

Here are some thoughts after @giladgal and I discussed this topic a bit more.

Per-user limits: they are good if you want to limit "new users", and remove the limit when they are more experienced and you trust they will not kill the system with expensive queries.

The problem with per-user limits is that we don't expect users of a specific app to become knowledgeable about the problems of expensive queries, even after they get onboarded. For example, SIEM users should be proficient in threat hunting using the SIEM interface, but they don't need to know or focus on which queries bring the data there. You also cannot forbid new security specialists from using the SIEM interface on their first day, as this is a mandatory part of their job.

This would make it necessary to grant the expensive query capability to all SIEM users. If we consider a standard use case, most of the users will be tied to a specific solution, and so they will need the permission to run expensive queries to do their job. Users that don't fall into "solution users" will be data and system admins, and those are not good candidates to be limited, as we may expect they know what they are doing.

In conclusion, every user in the system will have the permission to run expensive queries, making the feature not really useful.

Per-app limits: they are good if you want to allow expensive queries in a "controlled" environment only (e.g. SIEM), where queries are not arbitrary and they are unlikely to harm the system.

In this case we can allow queries coming from solutions apps (like SIEM), as they don't allow users to unintentionally kill the system. If the user performs arbitrary expensive queries, for example via the dev console or directly against the ES API endpoint, they will be denied. The problem here is that Elasticsearch needs a way to know which app is performing the query. If the app uses the user's credentials, this information is not available, and the check cannot be done because Elasticsearch will see the request as coming from the user.

We discussed the following approaches, which require more investigation.

Allow a fine-grained selection of expensive query types that can be allowed/forbidden (see the list here). This will work only if apps (like SIEM) use a few types, so we can block everything else. Complexity increases, and it would not fully solve the problem as users would be able to use those types of expensive queries from other places like dev tools.
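As a rough illustration of the fine-grained idea, here is a minimal sketch of a check that scans a query DSL body for expensive query types and compares them against an allowlist. The set below is an illustrative subset only (wildcard and script_score are the types mentioned earlier in this thread), and the function names are hypothetical, not an Elasticsearch API:

```python
from typing import Any, Set

# Illustrative subset only; the authoritative list is in the Elasticsearch docs.
EXPENSIVE_QUERY_TYPES = frozenset({"wildcard", "script_score", "regexp", "fuzzy"})


def expensive_types_used(query: Any) -> Set[str]:
    """Recursively collect expensive query type names used as keys in a query body.

    Naive by design: a field that happens to be named like a query type
    would be reported as well.
    """
    found: Set[str] = set()
    if isinstance(query, dict):
        for key, value in query.items():
            if key in EXPENSIVE_QUERY_TYPES:
                found.add(key)
            found |= expensive_types_used(value)
    elif isinstance(query, list):
        for item in query:
            found |= expensive_types_used(item)
    return found


def is_allowed(query: Any, allowed_types: Set[str]) -> bool:
    """True when every expensive type in the query is on the allowlist."""
    return expensive_types_used(query) <= allowed_types


query = {
    "bool": {
        "must": [{"wildcard": {"user.name": {"value": "*root*"}}}],
        "filter": [{"term": {"event.kind": "alert"}}],
    }
}
print(expensive_types_used(query))
print(is_allowed(query, {"wildcard"}), is_allowed(query, set()))
```

The check only looks at the shape of the query, so it cannot tell a SIEM request from the same query typed into Dev Tools, which is the weakness noted above.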

Perform a combination of per-user and per-app checks. This works if each app has its own "system user" (or API key), and all the queries are performed in a way that credentials for both the system user and the interactive user are sent to Elasticsearch. It could then validate that the request is coming from a specific app (via system user credentials) and for a specific user (via interactive user credentials). The app will be authorized to perform expensive queries with the privileges of the user, in order to guarantee proper data access.

This approach requires a very specific setup for how apps perform queries, and I suspect that it cannot be done with the current implementation and would require more thought.
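A minimal sketch of the decision logic for the combined check, assuming the request somehow carries both an app identity (proven by the app's system credentials) and the interactive user's roles. All names here (`TRUSTED_APPS`, `POWER_USER_ROLE`, `RequestContext`) are hypothetical and nothing like this exists in Elasticsearch today:

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Hypothetical: apps whose queries are considered "controlled".
TRUSTED_APPS: FrozenSet[str] = frozenset({"siem"})
# Hypothetical role that marks a user as allowed to run expensive queries anywhere.
POWER_USER_ROLE = "expensive_queries"


@dataclass(frozen=True)
class RequestContext:
    app_id: Optional[str]        # identity proven by the app's system credentials, if any
    user_roles: FrozenSet[str]   # roles of the interactive user


def may_run_expensive_query(ctx: RequestContext) -> bool:
    """Allow when the request comes through a trusted app or the user is a power user.

    Data access would still be evaluated separately against the user's own privileges.
    """
    if ctx.app_id is not None and ctx.app_id in TRUSTED_APPS:
        return True
    return POWER_USER_ROLE in ctx.user_roles


analyst_in_siem = RequestContext(app_id="siem", user_roles=frozenset({"analyst"}))
analyst_in_dev_tools = RequestContext(app_id=None, user_roles=frozenset({"analyst"}))
print(may_run_expensive_query(analyst_in_siem))
print(may_run_expensive_query(analyst_in_dev_tools))
```

The same analyst is allowed inside SIEM and denied when calling the API directly, which is the behavior the per-app argument asks for.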

The next step to better define the requirements is to check with Solutions:

  1. which kind of expensive queries they leverage
  2. which credentials are used to execute those queries

Feel free to comment/add/validate if I've missed something.

bytebilly commented 4 years ago

Looking at this problem from a different angle: if we assume that we want to limit "unintended misuses" of expensive queries only, instead of "malicious users", we can probably come to a simpler solution.

We can see if Kibana feature control can help. For example, allow/block expensive queries in Discover and Dev Tools based on the user, and allow them for everyone in SIEM.

Kibana can then add a header to the REST call to Elasticsearch to indicate whether the expensive query protection should apply to that specific request. The same approach could be used by any other client (e.g. Solutions or custom apps).

This does not prevent users from running expensive queries by adding the required header themselves, but as said before it would solve the "unintended misuse" scenario pretty well.
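A minimal sketch of how the header idea could look from both sides. The header name `X-Allow-Expensive-Queries` is made up for illustration (Elasticsearch defines no such header), and the app names and functions are hypothetical:

```python
from typing import Dict, Mapping

# Hypothetical header name; not part of any Elasticsearch API.
EXPENSIVE_HEADER = "X-Allow-Expensive-Queries"


def client_headers(app: str, user_is_trusted: bool) -> Dict[str, str]:
    """Headers a client such as Kibana might attach, based on its own feature control."""
    allow = app == "siem" or (app in {"discover", "dev_tools"} and user_is_trusted)
    return {
        "Content-Type": "application/json",
        EXPENSIVE_HEADER: "true" if allow else "false",
    }


def expensive_allowed(headers: Mapping[str, str], cluster_default: bool) -> bool:
    """Server-side view: the header, when present, overrides the cluster-wide default."""
    value = headers.get(EXPENSIVE_HEADER)
    if value is None:
        return cluster_default
    return value.lower() == "true"


print(expensive_allowed(client_headers("siem", user_is_trusted=False), cluster_default=False))
print(expensive_allowed(client_headers("dev_tools", user_is_trusted=False), cluster_default=False))
```

Nothing stops a caller from setting the header by hand, so this only guards against accidental misuse, not against a determined user.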

bytebilly commented 4 years ago

@arisonl could you take a look and give some feedback on the feasibility of the proposal on the Kibana side? We still don't know if we need to do it, but a preliminary check may be helpful.

Thanks!

mbarretta commented 3 years ago

In the context of runtime fields, per-role enablement of expensive queries (or heck, just the ability to define runtime fields at all) is an important feature in many orgs where the users are not the cluster owners.

The "unintended misuse" and "uneducated misuse" cases (i.e. someone without enough knowledge of ES and data querying in general to knowingly intend or unknowingly not intend an expensive query) are questions I often get w.r.t. just regular Kibana querying (for example, searching for *root* over the last 5 years in a +1PB cluster).

big ++ on per role permission for expensive queries.

javanna commented 3 years ago

@mbarretta are you after preventing some users from defining runtime fields, I would assume in the search request as well as in the index mappings, or preventing some users from using runtime fields in their queries/aggs etc as part of the search API? It is subtle, but leaning on role permissions for expensive queries may end up addressing the latter but not the former.

mbarretta commented 3 years ago

@javanna In general, the "uneducated misuse" folk that I'm describing will never define runtime fields using an API. It's all about shutting them out from wherever it's possible within Kibana that isn't already controlled by RBAC (index settings for example). So maybe my comment is better added to a Kibana ticket that would hide/disable things, though I would expect they'll need privileges on which to make those decisions.

droberts195 commented 3 years ago

One more related requirement is the ability to use runtime fields when searching system indices. At the moment we cannot do this, because our internal functionality would then break if somebody sets search.allow_expensive_queries: false.

With system indices there is some protection against the "expensive" aspect because we can ensure that we search indexed fields to narrow down the set of documents that runtime fields need to be calculated for to just a handful.
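A minimal sketch of that narrowing pattern as a search request body: filter on an indexed field first, then ask for a runtime field only on the few documents that remain. The field names (`job_id`, `duration_ms`, `duration_s`) and the helper are hypothetical; `runtime_mappings` and `fields` are the standard search request options:

```python
from typing import Any, Dict


def narrowed_runtime_search(job_id: str) -> Dict[str, Any]:
    """Build a search body that filters on an indexed field before using a runtime field."""
    return {
        # Runtime field defined in the request; its script runs per document at search time.
        "runtime_mappings": {
            "duration_s": {
                "type": "double",
                "script": {"source": "emit(doc['duration_ms'].value / 1000.0)"},
            }
        },
        # Indexed term filter narrows the candidate set to a handful of documents.
        "query": {"bool": {"filter": [{"term": {"job_id": job_id}}]}},
        # The runtime field is only fetched for the hits that are returned.
        "fields": ["duration_s"],
        "size": 10,
    }


body = narrowed_runtime_search("job-1")
print(body["query"])
```

The cost of the script is bounded by the filter, which is why internal searches like this are a reasonable exception to a blanket "expensive" ban.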

Many of the proposed remedies in this issue could meet this requirement, e.g. special header on the request, new privilege, per-app limits. But please can you make sure whichever one is eventually chosen also works for the use case of internal searches done from within Elasticsearch features always being able to do "expensive" queries.

kobelb commented 2 years ago

The cluster-wide search.allow_expensive_queries is causing issues for Kibana. For example, https://github.com/elastic/kibana/issues/111031. If we were to keep looking, I think we'd uncover additional Kibana features that have been broken by the introduction of this setting.

If we could allow the kibana_system user to execute expensive queries, this would solve our issues when users have enabled authc/authz. However, when users aren't using authc/authz, this will continue to break Kibana features.

bytebilly commented 2 years ago

Is there anything we can do to identify those queries and remove them from the list of what's considered "expensive"? Are those expensive (but required), or just falling in that category even if cheap?

We discussed resource limits in the past, but this seems like a compelling problem that should be solved soon.

kobelb commented 2 years ago

@bytebilly the situations that I'm aware of where Kibana uses "expensive" queries are those where Kibana uses a scripted query. It's theoretically possible for Kibana to no longer use scripted ES queries; however, Kibana would be stuck implementing the same logic in Kibana code, which would just make the operation take longer because of network latency.

heipei commented 2 years ago

Somewhat related: Is there a way to set the option per request? I'm exposing a query-string query via an HTTP API and it would be helpful if I could set the field myself for each request (because some users should be allowed to run expensive queries).

droberts195 commented 1 year ago

People who commented on this issue may also wish to comment on https://github.com/elastic/elasticsearch/issues/90898

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-security (Team:Security)

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

syepes commented 4 months ago

+1

elasticsearchmachine commented 2 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)