elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.71k stars 8.13k forks source link

Alerting feature controls #43994

Closed mikecote closed 3 years ago

mikecote commented 5 years ago

Feature controls is designed to give a role access to features within Kibana. The access to a feature can be none, read or all at this time. This would then apply to alerts (none, read or all) and actions (none, read or all).

The problem we encounter is alerting spans different applications within Kibana that themselves may be set to none. When set to none, the user shouldn't see the alerts for the disabled application.

There is an issue on the security team roadmap to allow implementing sub features (#35616). Until it's implemented, I have opened a discuss issue to see what we should do until then (hopefully in a migratable way to it, if anything).

Example

Uptime: None Stack Monitoring: All Alerts: All Actions: All

As you can see, the user shouldn't be able to see Uptime alerts. With the current implementation, they would.

Possible solutions

  1. We do nothing until #35616 is implemented. This would allow users to see and edit alerts for applications they don't have access to.

  2. Split alert saved objects by solution (Ex: uptime_alert, stack_monitoring_alert, etc). This would leave actions as a generic group but alerts would then automatically get enabled by the application whenever enabled. (Ex: turning read access to Uptime would give you read access to uptime_alert as well). The downside of this is we'll have to do the same for history, like uptime_alert_history.

  3. Other solution??

cc @elastic/kibana-stack-services @elastic/kibana-security @alexfrancoeur @peterschretlen

bmcconaghy commented 5 years ago

1 seems better to me as it would involve less ongoing work, but it also might take a while to land.

kobelb commented 5 years ago

Ignoring the technical implementation and what's currently possible, which of the following would be the ideal use experience:

1) Uptime and Stack Monitoring themselves have a sub-feature of Alerting. 2) The Alerting feature has sub-features for Uptime and Stack Monitoring. 3) Both of the above.

If we pursue option 1, when the user is granted "All" access to Uptime and Stack Monitoring, it'd make the most sense for users to automatically get the Alerting sub-feature and be able to create and view alerts for the specific feature.

Where-as with option 2, when the user is grated "All" access to Alerting, they'd be able to use Alerts in every application.

There's also an option 3 where we support both. This gets somewhat confusing, as roles and privileges in the ES model are additive. So if the user is granted "All" access to Stack Monitoring and "None" access to Alerting, they'd still be able to use Stack Monitoring alerts.

mikecote commented 5 years ago

In regards to @kobelb's suggestions, I was thinking option 1.

Could this get interesting when determining if we show the alerting app or not? Since there won't be a single privilege to look for? Unless we can do an or between privileges to give access to the alerting UI.

legrego commented 5 years ago

Could this get interesting when determining if we show the alerting app or not? Since there won't be a single privilege to look for?

Is it possible that the alerting app is its own feature, distinct from creating in-app alerts themselves? Similar to how users create various saved objects within the discover, visualize, and dashboard applications today, but there is a separate "Saved Objects Management" feature which exposes that management app to end users.

mikecote commented 5 years ago

@legrego The alerting app would be the same scenario so we could do it that way. That resolves my concern.

kobelb commented 5 years ago

I don't know if we should be mirroring the "Saved Object Management" experience entirely.

If we want for users to be able to create alerts from within Uptime when they're given the Uptime "All" privilege, I assume that these alerts themselves will then show-up within the "alerting app" itself. If we always treat "Alerting" as a sub-feature of the applications in which it's integrated, are there additional privileges which need to be granted. If so, what are they?

mikecote commented 5 years ago

I can't think of any extra privileges that would have to be granted. A new privilege to "show" the alerting UI would pretty much be it unless we can compute the capability.

Also, I guess alerting wouldn't be much of a feature for users without read access to actions πŸ€”

kobelb commented 5 years ago

I can't think of any extra privileges that would have to be granted. A new privilege to "show" the alerting UI would pretty much be it unless we can compute the capability.

We can derive this from the sub-feature privileges. If the user has any access to alerting, we can then display the alerting application.

Is there a generic "create user defined alert" which we'll want to allow, similar to being able to free-form specify the Watcher job definitions?

Also, I guess alerting wouldn't be much of a feature for users without read access to actions πŸ€”

This becomes somewhat complicated, since actions are used outside of Alerting.

mikecote commented 5 years ago

Had a quick chat with @kobelb, adding an update below.

Is there a generic "create user defined alert" which we'll want to allow, similar to being able to free-form specify the Watcher job definitions?

In the future yes with kibana expressions. Users would be able to create alerts from the alerting app. We can register a separate feature for those.

This becomes somewhat complicated, since actions are used outside of Alerting.

Providing a tooltip indicating alerts require read on actions should be sufficient.

alexfrancoeur commented 5 years ago

Is there a generic "create user defined alert" which we'll want to allow, similar to being able to free-form specify the Watcher job definitions?

As Mike mentioned above, this is functionality we will want at some point. Whether it's through expressions or even forms for simple generic alert types. Does this mean we go down the route of @kobelb's option 3? Or provide an experience similar to saved object management?

The saved object management approach sounds more applicable, especially if we end up introducing views for alert history and state. I could see a scenario (though I'm not sure how common it'd be) where users might want an overview of all alerts but admins do not want to provide CRUD permissions. Basically, a role that provides read only access to an alerting app with state / history and nothing else.

This becomes somewhat complicated, since actions are used outside of Alerting.

This will also be relevant for "Kibana Actions" (https://github.com/elastic/kibana/issues/32371) available in future dashboard panels. Dashboard end users will want the ability to add an action / interaction to a dashboard panel that sends to slack, email, etc. or fire off a webhook but may not have write access to actions.

peterschretlen commented 4 years ago

Technical implications aside, alerting is a service provided by Kibana, I think access should be controlled in the context of the consumer of that service (the app or solution). That would favour the 1st option Brandon described.

Even if sub-features were added though, it seems the problem with alerting isn't providing more granular control to the list of "privilege actions" associated with a privilege. It seems like the issue is filtering data for a certain type of saved object? I'm not clear how sub-features would address that problem.

peterschretlen commented 4 years ago

It looks like this is the current set of "privilege actions" we have for alerting:

        "saved_object:8.0.0:alert/bulk_get",
        "saved_object:8.0.0:alert/get",
        "saved_object:8.0.0:alert/find",
        "saved_object:8.0.0:alert/create",
        "saved_object:8.0.0:alert/bulk_create",
        "saved_object:8.0.0:alert/update",
        "saved_object:8.0.0:alert/delete",

Am I understanding right the problem - these privilege actions right now provide access to all alerts, and we need to figure out how to restrict access to only a subset?

legrego commented 4 years ago

Am I understanding right the problem - these privilege actions right now provide access to all alerts, and we need to figure out how to restrict access to only a subset?

Yep, that's correct. If each alert type had its own corresponding saved object type, then the existing privilege model would "just work" once sub-features are implemented. If all alerts share a common saved object type, then we'll need to figure out a way to restrict access to only a subset.

mikecote commented 4 years ago

It seems like the issue is filtering data for a certain type of saved object? I'm not clear how sub-features would address that problem.

Up until now, I was under the impression the sub-features issue would automatically filter data on an object type based on access (subType = "apm" || "uptime"). This isn't the case.

Now that I am on the same page. The issue will diverge into "what technical challenges are we facing to implement feature controls" and discuss how we should solve them.

  1. If creating a saved object type per consumer (apm, stack monitoring, uptime, etc). What developer experience can we provide? The saved object types get created before any plugin initializes.
  2. How will having a saved object type per solution impact event logging? (event logging feature controls?, event logging saved object types? etc)
  3. Is there a better alternative?

I will add more questions over time.

peterschretlen commented 4 years ago

I wonder if this needs to be solved at the Saved Objects level?

It seems like anytime we have one feature (alerting) that intersects with other features (uptime, stack monitoring, apm, ...) we are going to encounter this problem.

Maybe event/audit logging is another instance of this. Having an alerting saved object type per feature might work but feels like the wrong long term approach, especially as we add more types and more stack services features.

peterschretlen commented 4 years ago

@rudolf would be curious to hear your thoughts on this issue - some way to extend Saved Objects namespaces (or filtering) beyond just type? Wondering if this has ever come up before ( I looked through GH but there was nothing obviously similar). If the idea has merit I'll open a separate issue we can continue discussion there.

kobelb commented 4 years ago

We should be able to extend the saved-object authorization logic to allow validation using a field in addition to type, or in-place of type. This would likely fall into something the @elastic/kibana-security team would be responsible for.

Do we intend for all consumers of alerting to abide by the same "schema" and have the same attributes as each other? Or is there perhaps additional justification for using discrete saved-object-types, but allowing these to be dynamically specified by the alerting plugin and its registered consumers?

rudolf commented 4 years ago

@peterschretlen I'm not aware of a similar requirement coming up in the past.

Telemetry is somewhat similar but doesn't require the special authorization needs of actions. Here we use the saved object per consumer model but there is a specific requirement that the "schema" be customizable per consumer.

It seems to me the discussion boils down to do we want (a) saved objects types per consumer or (b) one actions saved object type for all actions. If (b) we need to discuss how the security team can make the authorization model sufficiently powerful. I don't see a strong argument for going with either one, so if authorization seems easier in (a) that seems like the desired way to go.

In my own head I got a bit tripped up by saved object "ownership", the dashboard plugin owns the dashboard type and associated data. But this really is only because the dashboard plugin was the one who declared this mapping. If another plugin created a dashboard_action type that data wouldn't belong to the dashboard plugin, it would belong to the plugin that created that mapping.

Assuming alerting doesn't have the need for custom schemas per solution, we would want actions to "own" all actions saved objects. All this means is in the context of solution (a) is the actions plugin would be responsible for creating the saved object mappings (and possibly future migrations) required to use this plugin.

I haven't investigated this in depth, so there might be some technical challenges, but in the New Platform I imagine that Plugins would "register" Saved Object mappings during setup lifecycle instead of statically defining all mappings upfront. This would allow the actions plugin to register new Saved Object types/mappings on behalf of action consumers. So in the dashboard plugin's setup() lifecycle, it declares it's intent to use actions, the actions plugin would then "register" a 'dashboard_action' Saved Object type/mapping.

I think this would solve the developer experience problem @mikecote brought up, a consumer just has to call the right actions API's, no need to define mappings or schemas. These types are fully owned/controlled by the actions plugin.

Are there big advantages in going with option (b) that I'm missing?

@mikecote Can you explain what you mean with the following, what is event logging in this context?:

How will having a saved object type per solution impact event logging?

mikecote commented 4 years ago

@kobelb

Do we intend for all consumers of alerting to abide by the same "schema" and have the same attributes as each other?

Yes. Where the attributes start differentiating is within alertTypeParams attribute that is specific per alert type. This attribute is marked as enabled: false within Elasticsearch so the mappings within aren't defined. (Note: one plugin can define multiple alert types). By having everything within alertTypeParams we do lose the searchability within it, but have no requirement at this time.

@rudolf

Just changing terminology from actions to alerts. Action saved objects won't be restricted by application and will also be managed separately (by admins for example).

I haven't investigated this in depth, so there might be some technical challenges, but in the New Platform I imagine that Plugins would "register" Saved Object mappings during setup lifecycle instead of statically defining all mappings upfront.

We may have a chicken and egg problem with this if ever the alerting plugin "setup" creates / registers the mappings. The alerting "setup" function is called before the plugins that dependent on it. Only at the other plugins "setup" we would be able to register mappings and the alerting plugin would have to expose its mappings on "start" for it to possibly work?

Are there big advantages in going with option (b) that I'm missing?

This is more a disadvantage of option (a) but I can see the index.mapping.total_fields.limit (1000 default fields) eventually getting hit. I'm not sure how big deal this may be. If we have 13 fields in the alerting mappings and 8 fields for event logs X 10 solutions using alerting, we'd be adding 210 fields ((13 + 8) * 10) to the .kibana index. Again not sure if this is really a problem or not but wanted to take note.

This is also assuming we repeat object types for even logs (apm-alerting-event-log, uptime-alerting-event-log, etc) in order to follow the same privilege model as alerts.

Can you explain what you mean with the following, what is event logging in this context?:

How will having a saved object type per solution impact event logging?

This is something @pmuellr is working on to capture events from task manager, alerting and actions plugin. It's similar to audit but captures different kind of information (when a task started, finished, when an alert type executed, when an alert type failed to execute, etc). You can find the WIP here: #45081.

kobelb commented 4 years ago

Yes. Where the attributes start differentiating is within alertTypeParams attribute that is specific per alert type. This attribute is marked as enabled: false within Elasticsearch so the mappings within aren't defined. (Note: one plugin can define multiple alert types). By having everything within alertTypeParams we do lose the searchability within it, but have no requirement at this time.

This sounds like a different "schema", and we're just working around this by not mapping it in Elasticesearch. Would we have one to many alert-types per "consumer of alerting"?

mikecote commented 4 years ago

Would we have one to many alert-types per "consumer of alerting"?

Yes. Though they can be shared across applications as well. (Ex: SIEM using an ML alert type that would call SIEM actions). Down the road we'll try to provide some alert types out of the box (threshold alert type to begin).

peterschretlen commented 4 years ago

Aside from the chicken-egg problem @mikecote described ( alerting needing the mappings before dependent plugins can create them), and potential for large mappings …. are there any drawbacks to having a large number of saved object types? Does it add any complexity to the alerting plugin, having to be aware of or manage those types? Does this fall in the bounds of proper use of Saved Object types or are we abusing it.

If we take the extreme of one saved object type per alert type, it sounds like might have to double that for history/audit log (for similar security reasons). Say we have 100 alert types, that would give us 200 saved object types. We won't get there overnight, but across solutions that's a reasonable number to expect.

kobelb commented 4 years ago

I think that's a great question to ask, @peterschretlen. I think it'd be good if we came up with general guidelines for when we should be using a different saved-object type, while ignoring the complications that it introduces from the security perspective. We can definitely add the ability to secure sub-sets of saved-objects with the same type; however, it is a reasonable amount of work.

rudolf commented 4 years ago

@mikecote

We may have a chicken and egg problem with this if ever the alerting plugin "setup" creates / registers the mappings. The alerting "setup" function is called before the plugins that dependent on it. Only at the other plugins "setup" we would be able to register mappings and the alerting plugin would have to expose its mappings on "start" for it to possibly work?

When a plugin exposes an API, other plugins can keep on using that API in their relevant lifecycle. So if plugin A exposes an API method registerNewType(name: string) which uses Core's Saved Object Mapping API, plugin B can call registerNewType() in it's own setup lifecycle. So by exposing API's plugins can indirectly continue to do setup work even after they return from their setup function.

I think this dynamism is really valuable in some cases, but the downside is it becomes really hard to track where a mapping comes from, you can no longer search the code base for static mappings.json files.

@peterschretlen I agree that exploding the field count might have some undesired consequences. We will probably have to benchmark the performance. We shouldn't go over ES's 1000 field default limit, but assuming that the Kibana index doesn't have more than an order of magnitude of 10^2 or 10^4 documents performance might be acceptable.

It's hard for me to comment on the "proper" use of Saved Object types without knowing the history that led us to the current design. I suspect having a single index for operational simplicity has been the biggest design goal above performance or ES data-modelling best practices.

@epixa I think your input would be valuable in this discussion.

epixa commented 4 years ago

I have a proposal that I ran by @mikecote, @pmuellr, and @peterschretlen that we think is promising:

App-specific alerting controls

What it would look like

Managing app-specific alerts (e.g. apm, infrastructure, uptime) would be determined by the feature controls of that app. For example, if a user has write access to uptime, then they have access to create or delete uptime alerts.

Managing alerts within an app will be done in that app. For example, there may be "tell me when this service is down" link within the uptime app for creating a site-down alert, which would open up a creation wizard in a flyout panel without leaving the uptime app, and then would be replaced with a "manage alert" link that would open basic edit/delete functionality in a flyout without leaving uptime.

The general alerting UI, which will have more robust controls and provide greater insight (charts, event log) into alerting stuff, will grant access to all alert types within the current space. This would be controlled with a separate alerting feature control (perhaps "alerting management"?).

How it would work

We make no changes to the way authorization works today.

The general alerting UI would be controlled by feature controls tied to the alert saved object type just like most other apps today.

Plugins register alerts in bulk via an alerts.registerAlerts API, which returns an insecure alerting client that is tied to those alert types that were just registered. To expose those capabilities to their UI, they create their own alerting rest endpoints that are protected by feature controls via access: tags (e.g. POST /api/uptime/alerts).

The alerting plugin provides reusable UI components for apps to use on top of those client capabilities in their own UIs.

Upsides

Downsides

Changelog

kobelb commented 4 years ago

Plugins register alerts in bulk via an alerts.registerAlerts API, which returns an insecure alerting client that is tied to those alert types that were just registered. To expose those capabilities to their UI, they create their own alerting rest endpoints that are protected by feature controls via access: tags (e.g. POST /api/uptime/alerts).

Just to make sure I'm following this proposal fully. The proposal is for end-users to not have access to the Alerting saved-object type itself, so they won't be able to interact with the Alerting saved-objects using the standard SavedObjectsClient. Instead, Alerting will return an instance of the AlertsClient, which uses an instance of the following internally:

savedObjects.getScopedSavedObjectsClient(
  request,
  {
    excludedWrappers: ['security']
  }
)

And each of the alerting consumers then relies upon the HTTP API authorization, to restrict access to the authorized end-users?

Assuming the above is correct, the other downside this introduces is we can't have any server-side consumers of the specific AlertsClient because security isn't enforced at this level. It also precludes the ability to interact with alerts using "Saved Object Management". This is where we have the ability to copy a saved-object to a Space, is likely where we will add the ability to share a saved-object to a Space, and how import/export works.

epixa commented 4 years ago

@kobelb End-users that have access to the general alerting UI will have access to alerting saved objects directly. But end-users that are only given access to alerts in a specific app will only be able to access them through the appropriate client. I don't think this precludes interacting with alerts with saved object management if the user has general access to alerts.

Otherwise, that sounds like accurate. I updated my proposal to add the HTTP-only downside. It's possible this is solvable as well, but I'm going to hold off brainstorming about it until we get critical mass on the overall idea.

kobelb commented 4 years ago

The other complication that arises is ensuring that the instances of the AlertsClient which the alerting plugin returns are written in such a way that users are only able to create/update/delete the appropriate subset of alerts.

The way this currently works is "interesting". The security SavedObjectsClient wrapper ensures that the user is authorized to access saved-objects of a specific type and "action", so for example that the end-user can create dashboards. The SavedObjectsRepository itself then ensures that these saved-object types are completely isolated from each other, by including the type in the Elasticsearch document ID. For example, if you tried to create a dashboard with id of 1 and a visualization with id of 1, they are separate Elasticsearch documents.

Since we're now reusing the alerting saved-object type for multiple consumers, we can no longer rely on the implicit protections this document ID scheme gives us. Without additional changes, a create for a saved-object type of alert and a subType of siem with an id of 1 would conflict with a saved-object type of alert, a subType of monitoring with an if of 1. To ensure that users aren't potentially interacting with saved-objects which they aren't supposed to with all create, update and delete operations, we're forced to essentially perform a get before doing an update.

epixa commented 4 years ago

@kobelb I think we can address that conflict problem by not allowing people to specify a custom ID for their alerts, no? The existing conflict avoidance mechanism on ids exists because people were creating dashboards, index patterns, etc. long before we had any sort of centralized logic to encapsulate it and thus developed workflows that relied on custom ids. That's not the case for alerts.

So other than making it possible to disable manual id creation per type, which should be pretty simple, there shouldn't be a need for other changes to the client or repository.

kobelb commented 4 years ago

@epixa That would reduce the complexities for the create operation. However, we still have to worry about updates and deletes. A user with only access to alerting-subtype-1 could potentially update a alerting-subtype-2 saved-object if they know the ID. We're in a similar position with deletes. At a minimum, the AlertingClient will have to perform a get, ensure the saved-object is of the proper subtype, and then execute the update or the delete.

epixa commented 4 years ago

Ah yes. I agree the alerting client will need to verify types on update/delete, and on create too even though that one won't involve a pre-flight request.

peterschretlen commented 4 years ago

@epixa hinted at this in his list of downsides, but I feel like a side effect is we encourage apps to do as much as possible with alerting within their app - they'll even expose their own feature controlled alerting apis - and we discourage integration with the general alerting UI (since it's only accessible to those with access to the alerting feature).

We do want apps to have rich integrations of alerts, but I think part of that is having lightweight integrations (link/navigation) to general views like details, filtered listings, and history.

I like the fact that the proposal gives us solutions-oriented alerting from a single feature control. But to get the full experience, you'd still need the alerting feature access anyway. I'm wondering how likely someone is to use alerting in an app without the full alerting feature access. If it's unlikely then we might as well do everything through the alerting feature (the 'do-nothing' option).

alexfrancoeur commented 4 years ago

I posted this in slack but Mike asked that I add here to persist. It's somewhat related to part Peter's latest comment.

I read through the proposal that Court provided. One workflow question that comes to mind immediately is the alert details page. We had made the assumption that this is the page users would land on when clicking into an alert for the majority of users. If we end up going down this road, do we expect applications to define their own contextual alert detail views (and should it be mandatory in that case) or would we be able to re-use the alert details view in some sort of read only wrapper without providing direct access to the alert management UI?

It's possible the alert details view is just part of the "full alerting experience". If we don't foresee solutions defaulting to different detailed / contextual views, than it feels like most users using alerts would need this view and permissions to access it.

On Fri, Oct 11, 2019, 5:46 PM Peter Schretlen notifications@github.com wrote:

@epixa https://github.com/epixa hinted at this in his list of downsides, but I feel like a side effect is we encourage apps to do as much as possible with alerting within their app - they'll even expose their own feature controlled alerting apis - and we discourage integration with the general alerting UI (since it's only accessible to those with access to the alerting feature).

We do want apps to have rich integrations of alerts, but I think part of that is having lightweight integrations (link/navigation) to general views like details, filtered listings, and history.

I like the fact that the proposal gives us solutions-oriented alerting from a single feature control. But to get the full experience, you'd still need the alerting feature access anyway. I'm wondering how likely someone is to use alerting in an app without the full alerting feature access. If it's unlikely then we might as well do everything through the alerting feature (the 'do-nothing' option).

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/elastic/kibana/issues/43994?email_source=notifications&email_token=AGAD476JKL24DQXERYMIZFTQODX2TA5CNFSM4IPQG4MKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBBJLFY#issuecomment-541234583, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAD472D3WAVEFJSKF3BW53QODX2TANCNFSM4IPQG4MA .

willemdh commented 4 years ago

In my setup, Kibana has a space for each team and teams have a separate role for data access. Teams should never be able to see each other's alerts / watches, as this could contain sensitive information... (such as api tokens or Firewall logs of who surfed at what site).. All data's access controls depend on a custom field we provided, which contains the abbreviation of the team. This implies user in their Kibana Space going to Uptime or Apm will only see the data of applications relevant to them.

epixa commented 4 years ago

@alexfrancoeur Without a significant amount of additional work, we could make a lightweight alert detail flyout that solutions could embed, so when someone gets an APM alert it would bring them to the APM app and a flyout would open with the alert details. For folks that have access to the alerting app, perhaps we'd include a "Alert details" link in the flyout that would bring the user to the extended detail view in the alerts app. This may even be the preferred experience for users with access to the alert app because the most useful place to be when you get an APM alert is probably the APM app (for example).

@peterschretlen My proposal really only makes sense if you accept that most users with write access are going to be given access to the alerts UI and all alerts within a space, so users only having access to alerts in a specific app would be the exception rather than the norm. So essentially we optimize for the case where a user is given access to the alerts feature as a whole, and then we provide a basic integrated experience in the apps themselves for more light-touch alerting interactions.

If that turns out not to be the case, then we can always go down the route of sub features down the line, and that shouldn't even be a breaking change necessarily.

Honestly, if the team doesn't think the majority case is for users with write privileges to be given access to the alerts feature proper on a space by space basis, then the only compelling part of my proposal is the fact that it potentially lets us ship alerts sooner and iterate on the sub feature bits later, and at that point it might just make sense to suck it up and do sub features.

kobelb commented 4 years ago

It sounds like what we really want for Alerting is Saved-objects authorization more granular than type and method.

peterschretlen commented 4 years ago

@mikecote I was thinking about actions feature control, and what it looks like when actions start going beyond builtin notifications.

Say you have an IP blacklisting action for security or maybe a connector that scales a cluster. Anyone with access to use connectors (anyone with read privilege on the actions feature) then has the ability to fire actions with much larger side effects than a notification. At some point we may need to restrict access to certain actions.

I don't think action authorization has to be solved upfront, more something to keep in mind. This type of authorization doesn't seem to fall nicely into feature categories, perhaps more of an object-level-security use case.

mikecote commented 4 years ago

@peterschretlen good point, I have the following notes we can work from:

mikecote commented 4 years ago

POC of the solution to go forward with: https://github.com/elastic/kibana/pull/52807.

mikecote commented 4 years ago

If limiting access to some actions, the following issue will have to be considered: https://github.com/elastic/kibana/issues/63496#issuecomment-617362048.

gmmorris commented 4 years ago

Had a couple of calls with @mikecote & @kobelb to discuss this and I think we're aligned on next steps.

The general direction is going to be that we follow the initial work done in Brandon's POC https://github.com/elastic/kibana/pull/52807 in which he looked into how we could add a custom authorization check done by the AlertsClient. To achieve this we're going to break this down into several steps, each of whom will be done as it's own separate PRs and sub issues of this one (which I'll make into a meta issue).

  1. [x] First we'll address an underlying security issue which is that once we add this custom authorization in the AlertsClient we're going to encounter a potential security hole which is that you could interact with the Alert Saved Objects through the normal SavedObjectsClient without taking RBAC into account. There is currently an approach (as used in Task Manager) in which we use hidden types to prevent this, but the downside is that using hidden types doesn't allow you to use the normal SavedObjectsClient which means you lose the ability to use Spaces and Secruity middleware. To address this we'll start by looking into making it possible to create a custom SavedObjectsClient which uses a repository with access to a hidden type, which would mean we can enjoy the Security & Spaces ability of the SavedObjectsClient while still hiding the Alert type from the regular SavedObjectClient. @kobelb will help to align us with the Platform team on this. PR: https://github.com/elastic/kibana/pull/66879

  2. [x] We will make the Alert SavedObject type a hidden type so that future interaction will be made via the AlertsClient. PR: https://github.com/elastic/kibana/pull/66719

  3. [x] We will make the Action & ActionTaskParams SavedObject types hidden types so that future interaction will be made via the ActionsClient. PR: https://github.com/elastic/kibana/pull/67109

  4. [x] Next we'll follow the POC's example of exposing the checkPrivilegesDynamicallyWithRequest api in the Security plugin and add an security Actions (not to be confused with Alerting's Actions) for alerting which will allow us to add custom authorization which takes the consumer of a SavedObject into account when authorizaing. We'll align with @legrego on this, as this work will be in Security teams' code. PR: https://github.com/elastic/kibana/pull/67157

  5. [x] At this point we should be ready to actually start using custom authorization in Alerting, and we'll begin by adding use of this (as per the POC) in the Alerts Client which will add RBAC around basic CRUD operations for Alerting. This step will ensure we address some of the requirements specified above:

    • Users can only use alert types provided by the applications they have access to
    • Built in alert types are usable to anyone who has access to alerting
    • Users can only read or CRUD alerts for applications they have access to
    • We will attempt making user defined alerts owned by the visualization application / feature
  6. [x] Next we should add the same kind of authorization for Actions, and address the fact that AlertsClient bypasses RBAC by interacting with the action type directly (https://github.com/elastic/kibana/issues/67163). This step will ensure we address the 6 requirements as specified above:

    • Users can only use action types provided by the applications they have access to
    • Built in action types are usable to anyone who has access to actions
  7. [x] TheActions execute api uses the ActionExecutor which in turn uses an internal client to interact with the action SavedObject type - this should go via the ActionsClient to avoid the security hole there.

gmmorris commented 4 years ago

I've began addressing the first step: https://github.com/elastic/kibana/pull/66719 I'll likely be splitting this into 2 PRs - one for the changes to Saved Objects and one for Alerting, just to make reviewing easier, but I wanted to be sure it works as expected end to end before making changes to Platform & Security's code. :)

gmmorris commented 4 years ago

There are now two PRs:

  1. [In Review] adds support for including hidden types in saved objects client https://github.com/elastic/kibana/pull/66879
  2. [Done, but waiting for the first to be merged before opening for review] makes the Alert saved object type a hidden type https://github.com/elastic/kibana/pull/66719

And I am now picking up the next step which is to reevaluate the approach in Brandon's POC.

gmmorris commented 4 years ago

Update:

We've now merged the work that allows us to interact with hidden types (https://github.com/elastic/kibana/pull/66879) and the work that hides the alert SavedObject type (https://github.com/elastic/kibana/pull/66719).

The work that hides the action type is now in review: https://github.com/elastic/kibana/pull/67109

And work has began to actually provide RBAC for the alert saved object types: https://github.com/elastic/kibana/pull/67157

We've also identified a potential security hole in the current AlertsClient as it interacts with the action saved object type directly and it should happen through the ActionsClient so that it handles RBAC for actions: https://github.com/elastic/kibana/issues/67163

There's a comment above which tracks the different steps: https://github.com/elastic/kibana/issues/43994#issuecomment-628746015

gmmorris commented 4 years ago

Had a quick call with @legrego to figure out the moving parts to adding Alerting/Actions specific privileges and seems we're going to need to do the following:

  1. Start by defining the shape of a privilege over alerting/actions, as per the POC (https://github.com/kobelb/kibana/blob/05d1e66ce87044d19f3fbef8b11d924e173b0fba/x-pack/plugins/features/server/feature_schema.ts)
  2. Add an Alerting/Actions security "action" (a URI that uniquely identifies the entity including its consumer etc.) as per the POC (https://github.com/kobelb/kibana/blob/05d1e66ce87044d19f3fbef8b11d924e173b0fba/x-pack/plugins/security/server/authorization/actions/alerting.ts)
  3. Add a builder that can map the privilege to what ES needs, which is missing in the POC, so follow the examples in (https://github.com/elastic/kibana/blob/master/x-pack/plugins/security/server/authorization/privileges/feature_privilege_builder/index.ts)
  4. extend the AlertsClient to use the custom checkPrivileges as per the POC (https://github.com/kobelb/kibana/blob/05d1e66ce87044d19f3fbef8b11d924e173b0fba/x-pack/legacy/plugins/alerting/server/alerts_client.ts)

This should break all the solutions as they won't have privileges anymore.

  1. Add privileges per alert / action type in the solutions.

  2. magic.

I'll probably spike this end to end as a quick way of ensuring this achieves what we want it to achieve then it'll be split into individual PRs so we can gradually introduce this and make it a safer step forward.

gmmorris commented 4 years ago

I'm back to working on this feature, and wanted to make a note about a certain change that will be included in this PR as it's a lot of work and I want to make sure it's the right place to make it.

Under the current implementation, as we do not have RBAC in place, we usr api privileges instead. The way this is implemented a feature needs to specify whether certain roles are granted api level privileges into alerting. You can see this implemented here by siem: https://github.com/elastic/kibana/blob/afbbafb0cfb556cda333d0b58dd81e8e424a5a6e/x-pack/plugins/siem/server/plugin.ts#L141

What siem are specifying here is that if a user has the all privilege for siem, then they also gain api privilege or alerting-read & alerting-all into alerting.

The new RBAC system on the other hand, is going to use traditional Kibana role privileges to align with similar features such as _saveobjects and this will be enforced by the AlertClient.

When introducing the new RBAC based privileges we can take two approaches:

  1. Keep the api level privileges and introduce RBAC along side it. This would mean you have a two gate system - you need RBAC to use alerting through solutions and through the UI, but need api privileges to use the HTTP endpoints.
  2. Remove the api level privileges and rely on the fact that the HTTP endpoints need to use the AlertsClient to do anything (this is enforced by the previous work we did to hide the SavedObject types)

Discussing this topic with @mikecote a couple of weeks ago we decided to go with the second option and remove API level privileges as it would mean a simpler privileges system and less code to maintain that achieves the same thing.

But, this has a knock on effect - where api privileges behave slightly differently than RBAC is that users who have no privileges for a certain endpoint are responded to with a 404 error, hiding the API from the user. On the flip side, RBAC based privileges as we see in SavedObjects are a little different - a 404 is only returned when an entity (in our case, an Alert) that has been requested doesn't exist. In the case of a lack of privilege for a certain operation (be it get, create or whatever) what we'll actually reply with is a 403 explicitly telling the user that they aren't authorised to perform the operation they've asked us to perform.

This change is going to require changing pretty much every test in our Alerting test suite, which makes sense, but is quite a big diff- so I wanted to flag it before I get too much of it done πŸ˜†.

There was another option - introduce a test fixture that bypasses the normal HTTP API endpoints to test the authorisation without using the api privileges which we'd use to test and introduce the new RBAC privileges system along side the api privileges. Then, once that PR is merged, produce a follow up PR which removes the api privileges and their associated tests, then that fixture could be removed and its usages in other plugins. The downside to this approach is that it means writing a lot of code solely for the purpose of this PR, which would then be removed in the followup PR. I raised this options with Mike and again, we both agreed this was redundant and we could just go with option 2, but again, I wanted to flag this early. :)

Let me know if there are any concerns - For now I'll continue to progress with option 2 in mind.

kobelb commented 4 years ago

@gmmorris I agree with the decision that you and @mikecote came to. I still think that option 2 is the "right solution", even though it does require numerous tests to be updated...

pmuellr commented 4 years ago

Yeah, option 2 seems right. I was just wondering this weekend when we were going to be returning some 403's instead of the 404's we currently return, because it seemed to me that it wasn't right that we were returning 404's in all those cases.

gmmorris commented 4 years ago

I've made quite a bit of progress on this (PR https://github.com/elastic/kibana/pull/67157) and not that I think I'm done ironing out edge cases I think we've found a good model that allows solutions to express privileges in a simple way.

I thought it might be worth expressing the model in terms of the various controls we've had to provide solutions utilising alerting.

Assuming each solution that wants to add alerting needs to be able to control the following:

Here is how we address these different controls. Each solution can add a list of AlertTypes under their read and all keys of the privileges object when defining their feature, like so:

features.registerFeature({
      id: 'alertsExample',
      name: 'alertsExample',
      app: [],
      privileges: {
        all: {
          alerting: {
            all: ['example.always-firing', 'example.people-in-space'],
          },
        },
        read: {
          alerting: {
            read: ['example.always-firing', 'example.people-in-space'],
          },
        },
      },
    });

What this specifies, in detail, is that:

  1. If a user has been granted the all privilege to the alertsExample feature, then they are also granted all privileges to the example.always-firing and example.people-in-space AlertTypes under the alertsExample consumer
  2. If a user has been granted the read privilege to the alertsExample feature, then they are also granted read privileges to the example.always-firing and example.people-in-space AlertTypes under the alertsExample consumer

That means that an all user will be bale to, for example, create an example.always-firing alert with the alertsExample as consumer. This will also automatically grant them the right to create an example.always-firing alert from within alerts management where alerts is the consumer (we special case Alerts management). What this doesn't grant the user is the ability to create an example.always-firing alert under any other consumer. For that the specific consumer will have to grant the user explicit rights through their privilege system.

For example, if Uptime wanted to allow users to create an example.people-in-space alert inside of the Uptime solution, then they will have to do the following:

features.registerFeature({
      id: 'uptime',
      name: 'Uptime',
      app: [],
      privileges: {
        all: {
          alerting: {
            all: ['xpack.uptime.alerts.actionGroups.tls', 'example.people-in-space'],
          },
        },
        read: {
          alerting: {
            read: ['xpack.uptime.alerts.actionGroups.tls', 'example.people-in-space'],
          },
        },
      },
    });

This, assuming it's added by Uptime, would grant uptime users the privilege to create both their own xpack.uptime.alerts.actionGroups.tls alert and the example.people-in-space alert with uptime as the consumer.

You might ask, will this allow any Uptime user with all privileges to create an example.people-in-space alert? No. In order to create an example.people-in-space alert the Uptime user needs both all in Uptime and in AlertsExample as we always check whether the user is privileged to execute an operation (create/enable/delete etc.) in both the alert's consumer and its producer.

The one exception to this is when the producer is alerts, which represents a built-in AlertType, in which case we only check for consumer privileges as all users are privileged to create built-in types by definition.

We were considering requiring explicit privileges to the Alerts feature, but decided against it for the time being, to align with other stack based solutions where it seems access to Kibana is enough to allow you to execute basic operations. This doesn't mean access to data is open- all alerts mimic the user that created them, so a non-privileged user will still be limited to what they're allowed to see anyway, and they will only be able to create and manage built-in types which were created from the Alerts Management, rather than any solution.

I think this model addresses all our needs for privileges around Alerts. Once this is approved and everyone is happy, we'll implement the exact same model for Actions.

kobelb commented 4 years ago

I'm struggling to think of a concrete example which requires the producer part of the authorization check. Do we have a situation already which requires this?

Also, is there any situation where plugin a produces an alert-type 1 while plugin b consumes it and we'd like for the user to be able to access the alert-type 1 when they only have access to privileges defined by plugin b? Is this supported?