apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
62.46k stars 13.73k forks source link

[SIP-51] Dashboard Level Access #10408

Closed amitmiran137 closed 3 years ago

amitmiran137 commented 4 years ago

Motivation

As a dashboard provider in an organization with many subgroups inside I need the ability manage user access to dashboards and different levels of permissions(read, write, granter, owner)

use cases

1) In our org we will have hundreds of dashboards that will be based on the same dataset Therefore there is no way to manage dashboard access for specific dashboards for specific users 2) access to dashboard metadata - as a dashboard creator I have sensitive data on the dashboard metadata (iframe, plain HTML, markdown) I want to restrict access to dashboard metadata 3) current Enterprise BI solutions offer this content-type level of permissions e.g: Tableau 4) we want to make sure that /dashboard/(/) is also enforcing access. Currently, anyone can access any dashboard by just changing the URL. Sometimes there is PII on the dashboard itself like plain HTML text or an iframe so it still exposes sensitive data on the dashboard which is problematic to some of our clients 5) Airbnb and probably other orgs are fully dependent on dataset level access - they would not handle an extra dashboard level permission 6) Need to give just dashboard viewing rights and dashboard download rights to users 7) In certain Enterprises. for example financial services, it is often required to limit the accessibility of data to certain people and to have the ability to manage this centrally. This means that users only have a limited ability to publish results in dashboards to a broader public.](https://github.com/apache/incubator-superset/issues/11198)

Proposed Change

By using the RBAC principle and linking roles directly to a dashboard we can enforce an additional layer on top of the existing access mechanism that will permit access to a dashboard if the user has access to any of the Dashboard roles

100% backward compatibility to the existing dashboard security mechanism which is based on datasets

roles linked to a dashboard would provide either read or edit access

development milestones

  1. enforce dashboard security using roles
  2. allow token-based access to /explore_json and API/v1/chart/data to allow reading datasets only by have dashboard access for read-only purpose

Rejected Alternatives

LEVEL_ACCESS_MODE= //options 'Dashboard'/ 'Dataset' this option doesn't allow both options to co-exist and prevents Dataset access based existing solution to use the new ability none

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the label #enhancement to this issue, with a confidence of 0.93. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

itsik-avidan commented 4 years ago

πŸ‘

amitmiran137 commented 4 years ago

@villebro the SIP we discussed

altef commented 4 years ago

This would be wonderful for my use-case as well

LiranBri commented 4 years ago

+1

SaharExelate commented 4 years ago

Cool!

yishaifri commented 4 years ago

yes!! cool idea

willbarrett commented 4 years ago

To comply with the current usage of Admin/Alpha/Gamma, I'd recommend only adding all_dashboard_editor_access to Admin and Alpha - Gamma is currently a much more limited user that requires explicit access to data.

bkyryliuk commented 4 years ago

Would it make sense to consider automating creation of the datasource permissions while granting access to the dashboard. E.g. user will request the access to the tables that back the dashboard & owner will be able to grant / approve those with 1 click ?

Creating dashboard permissions to the user opens up some edge-cases as there will be 2 ways to validate the access, access to dashboard that can change underlying sql & datasource access [ table, schema, database level]:

  1. owner changes the dashboard and grants access to the undesired
  2. what will be the explore from dashboard behavior for the read only users and how it will interact with the dashboard permissions ?
amitmiran137 commented 4 years ago

Hey @bkyryliuk We plan to automate creation of the datasource permissions while granting access to the dashboard externally to superset by create a matching role for each created dashboard and assigning all the relevant permissions to it(Dashboard, datasources)

I think that explore from dashboard should be available for only dashboard edit access permission and owners

hyramduke commented 4 years ago

+1

amitmiran137 commented 4 years ago

5483

mistercrunch commented 4 years ago

Something to clarify here: what does happen when the user has access to the dashboard, but doesn't have access to the underlying datasets?

If we assume that there's a need for people getting access to a dashboard, but not being able to explore the dataset(s), because say, there's some level of detail in the underlying dataset that the granter does not want to expose for some reason. We need to clarify the following topics:

amitmiran137 commented 4 years ago

@mistercrunch

Giving access to a specific dashboard can be sufficient enough for the user to view the dashboard without the complexicity of adding all underlying datasources πŸ‘ πŸš€

john-bodley commented 4 years ago

Currently only owners + admins are able to edit dashboards. I'm interested in knowing whether the current view access and ownership controls are suffice in order to provide the "viewer" and "editor" functions?

This seems like a far simpler approach as is consistent with the slice model and requires no code change.

bkyryliuk commented 4 years ago

Currently only owners + admins are able to edit dashboards. I'm interested in knowing whether the current view access and ownership controls are suffice in order to provide the "viewer" and "editor" functions?

This seems like a far simpler approach as is consistent with the slice model and requires no code change.

agree with @john-bodley proposal, it looks like current permission model supports this use case.

john-bodley commented 4 years ago

I partially misspoke earlier, currently there are no access controls explicitly at the dashboard level, it's merely a series of rules.

I do think the community needs to collectively decide whether security should be at i) the datasource level (either Superset datasource or the underlying database, schema, etc.), the ii) chart/dashboard level, or iii) a combination of both (i) and (ii). Currently it's (i) (for right or wrong) and aspects of dashboard level access could be achieved by row level access and/or dashboard specific Superset datasources. There is additional overhead with this approach, however it's simpler to grok, the access patterns are likely more secure (people could exploit dashboard level access controls), and doesn't require additional logic or development of request/approval/management flow.

amitmiran137 commented 4 years ago

Hey @john-bodley just noticed this PR merged #11024 it actually fully covers the dashboard_edit_access permission concept so I think I'm going to change the proposal to focus only on access in the general concept of it

amitmiran137 commented 4 years ago

I partially misspoke earlier, currently there are no access controls explicitly at the dashboard level, it's merely a series of rules.

I do think the community needs to collectively decide whether security should be at i) the datasource level (either Superset datasource or the underlying database, schema, etc.), the ii) chart/dashboard level, or iii) a combination of both (i) and (ii). Currently it's (i) (for right or wrong) and aspects of dashboard level access could be achieved by row level access and/or dashboard specific Superset datasources. There is additional overhead with this approach, however it's simpler to grok, the access patterns are likely more secure (people could exploit dashboard level access controls), and doesn't require additional logic or development of request/approval/management flow.

there is a IV) option which is co-existing (i) or (ii) , depending whether the feature flag is off or on

villebro commented 3 years ago

Thanks for updating the SIP @amitNielsen ! I think the current state of the proposal captures the motivations and required changes well.

If we can keep this behind a feature flag and not cause backwards incompatible changes, then I'm all for the change. By introducing the option to allow access to dashboards/charts with a signed token provided by the backend, we should be able to do this securely, without compromising leakage. I believe it should be possible to keep the necessary code changes mostly within the security manager, avoiding unnecessary additional code complexity.

I envision a flow that would look something like this: 1) User requests dashboard. If user is not permitted to access the dashboard, a 403 is returned. If user does have access, a signed token based on the user id and dashboard are returned with the bootstrap data. 2) the dashboard requests data for all charts in the dashboard, appending the token to the requests. 3) the backend verifies that the token is valid, and ensures that the chart is indeed in the dashboard. If something doesn't add up, a 403 is returned, otherwise the chart data is returned.

shashankkoppar commented 3 years ago

I would also like this! It would be handy to share dashboard level access! Since most of engineers working on this are non tech, so they wont understand if which dataset or database or stuff .

Current dashboard permissions is heavily dependent on dataset permissions. In our scenario, we have dashboards which we want to share to different clients which use same dataset, then it is difficult, since they both can see all dashboards created by this dataset. Would like to know when it is possible to have dashboard level access!

amitmiran137 commented 3 years ago

Calling for responses for the final summary of this sip anyone?

villebro commented 3 years ago

To echo @amitmiran137 's request, it would be important for the community to be vocal about the updated proposal in the SIP description. Without comments either for or against the proposal, it's very difficult to assess whether or not the current proposal is in line with how the community expects dashboard access control to work. So if this feature is something you are interested in seeing implemented, please read through the description of the SIP and post your thoughts here.

altef commented 3 years ago

This looks awesome! (Though I am admittedly biased; dashboard level permissions are pretty high on my Superset wishlist.)

ktmud commented 3 years ago

I partially misspoke earlier, currently there are no access controls explicitly at the dashboard level, it's merely a series of rules. I do think the community needs to collectively decide whether security should be at i) the datasource level (either Superset datasource or the underlying database, schema, etc.), the ii) chart/dashboard level, or iii) a combination of both (i) and (ii). Currently it's (i) (for right or wrong) and aspects of dashboard level access could be achieved by row level access and/or dashboard specific Superset datasources. There is additional overhead with this approach, however it's simpler to grok, the access patterns are likely more secure (people could exploit dashboard level access controls), and doesn't require additional logic or development of request/approval/management flow.

there is a IV) option which is co-existing (i) or (ii) , depending whether the feature flag is off or on

I'm a strong believer that feature flags should only be used for experimental features, not something we will support in the long term. If we find value in one way or the other, we should just make one of them the default and clean up the other.

In this case, I don't see why we can't add the dashboard-level access on top of the dataset access control.

rumbin commented 3 years ago

Personally, I feel that all data access should be based on the role-based access control of the underlying Data Warehouse. In many companies, Superset will not be the sole tool to access the data. So, if data access were to be controlled on a BI tool level, there would be a severe overhead of implementation the same data governance rules for each of the tools and access methods. I'd greatly prefer to focus on getting der User Impersonation (a.k.a. User Principal Push Down) feature robustly working for the supported databases. This way we would not require Dashboard Level Access Control at all: the users can access the charts depending on whether they are allowed to access the underlying tables/views. All other solutions may rip holes into the company's data governance model in my eyes. However, I am aware, that features like Caching of the chart's query results get much more complex, especially when the Data Warehouse supports column- or row-based access control.

villebro commented 3 years ago

I partially misspoke earlier, currently there are no access controls explicitly at the dashboard level, it's merely a series of rules. I do think the community needs to collectively decide whether security should be at i) the datasource level (either Superset datasource or the underlying database, schema, etc.), the ii) chart/dashboard level, or iii) a combination of both (i) and (ii). Currently it's (i) (for right or wrong) and aspects of dashboard level access could be achieved by row level access and/or dashboard specific Superset datasources. There is additional overhead with this approach, however it's simpler to grok, the access patterns are likely more secure (people could exploit dashboard level access controls), and doesn't require additional logic or development of request/approval/management flow.

there is a IV) option which is co-existing (i) or (ii) , depending whether the feature flag is off or on

I'm a strong believer that feature flags should only be used for experimental features, not something we will support in the long term. If we find value in one way or the other, we should just make one of them the default and clean up the other.

In this case, I don't see why we can't add the dashboard-level access on top of the dataset access control.

The way I see it this will be implemented as a config flag, so one can choose to use datasource and/or dashboard access control, defaulting to the current datasource option.

always-busy commented 3 years ago

good SIP

amitmiran137 commented 3 years ago

Personally, I feel that all data access should be based on the role-based access control of the underlying Data Warehouse. In many companies, Superset will not be the sole tool to access the data. So, if data access were to be controlled on a BI tool level, there would be a severe overhead of implementation the same data governance rules for each of the tools and access methods. I'd greatly prefer to focus on getting der User Impersonation (a.k.a. User Principal Push Down) feature robustly working for the supported databases. This way we would not require Dashboard Level Access Control at all: the users can access the charts depending on whether they are allowed to access the underlying tables/views. All other solutions may rip holes into the company's data governance model in my eyes. However, I am aware, that features like Caching of the chart's query results get much more complex, especially when the Data Warehouse supports column- or row-based access control.

I totally agree with the use case described above btw. but there are use-cases like internal organization data access that doesn't require any governance but just access on the end delivery to the client which are the dashboards themselves

this is why we it should be up to the organization to decide with what mechanism they want to use: either dataset_level or dashboard_level

JKHeadley commented 3 years ago

This is exactly what my company needs. A simple way to share dashboards with clients without giving them access to the underlying datasource.

amitmiran137 commented 3 years ago

If anybody still haven't voted https://lists.apache.org/x/thread.html/r91c0acde742a9a719a30938d708e945890ade944ea476dc1def61975%40%3Cdev.superset.apache.org%3E

ktmud commented 3 years ago

The way I see it this will be implemented as a config flag, so one can choose to use datasource and/or dashboard access control, defaulting to the current datasource option.

Where is this config flag set? Globally by Superset admins or at dashboard level? Can companies decide to use both modes at the same time? I.e., certain group of users can access some dashboards, but not all data in that dashboard since the data query may still go through a custom SecurityManager for access control.

amitmiran137 commented 3 years ago

This Flag can be introduced at the global level for a default behaviour and at the dashboard level for overriding the default

@ktmud

ktmud commented 3 years ago

@amitmiran137 If users can switch this flag at the dashboard level, it means at least part of the system will contain dataset-based access control. Will dashboards with LEVEL_ACCESS_MODE= 'Dashboard' enabled still subject to any dataset level access control configured? If yes, what's the benefit of having a flag? Can we just: 1). always apply the dataset level control to the charts; 2) apply the dashboard-level access control when it is configured per dashboard (could be default to the dashboard creators' own groups)?

amitmiran137 commented 3 years ago

@ktmud dashboards with LEVEL_ACCESS_MODE= 'Dashboard' enabled will not be subject to any dataset level access.

The problem with option 2 : you want to give viewing access and not ownership for managing the dashboard

How would you suggest to manage list of users have view access to a dashboard ? Isn't a list of users equals to a role ? Why create a new access model instead of leveraging the RBAC existing one

ktmud commented 3 years ago

dashboards with LEVEL_ACCESS_MODE= 'Dashboard' enabled will not be subject to any dataset level access.

This will be a major turn off for any organization who needs dataset level access control.

My 1) and 2) were not two options but two steps of one solution. Basically we keep current dataset level access control unchanged but add a new layer of dashboard access control.

In terms of actual implementation, you could still leverage the existing RBAC by adding an roles attribute to dashboards and a custom can_access_dashboard/can_edit_dashboard method to SecurityManager:

  1. Add a new column roles to the Dashboard model, which stores FAB roles that corresponds to a business unit/user group's dashboard view or edit role. We don't allow specifying view access by users as it unnecessarily complicates things.
  2. When publishing a dashboard, users choose which roles/user groups have access to this dashboard
  3. Add dashboard_access to OBJECT_SPEC_PERMISSIONS
  4. Add can_access_dashboard and can_edit_dashboard to SecurityManager which passes the right permission names and view names to can_access based on roles associated with the dashboard. E.g.
    if not dashboard.roles:
        return True
    for role in dashboard.roles:
        if self.can_access("dashboard_access", dashboard.perm_for_role(role, edit=False)):
            return True
    return False
  5. Place these checks manually in the API, just like what we do for datasources.

In short, I don't think an "access mode" switch is necessary, as current security model seems to already suffice in supporting the additional layer of role-based dashboard-level access control and the only extra work is adding a new roles column (like we already have in datasource.perm and datasource.schema_perm, except we add roles instead of perm to have the ability to enforce foreign key check.)

There could be a toggle to allow pulling query results from controlled datasource even if users don't have direct access to datasource---but that has its own level of complexity and doesn't seem to be blocking us from implementing the basic dashboard-level access control.

This is obviously a popular user demand and I'm all for addressing it, but let's make sure the final solution is as prudent as possible.

junlincc commented 3 years ago

https://github.com/apache/superset/issues/9725

rajraousb commented 2 years ago

Hi all, I have an alternative solution here: https://github.com/apache/superset/issues/17914 and https://github.com/apache/superset/issues/17913 Can you review and provide comments?

Thanks

shefaliaiyanna commented 2 years ago

How do we give access to external users via superset to view the dashboard without the localhost link. How do I make it a generic URL?