data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0

Quicksight groups mapped with data.all groups #86

Open dlpzx opened 2 years ago

dlpzx commented 2 years ago

Is your idea related to a problem? Please describe.

In the current implementation, when users start a Quicksight session they are added to a single default group called 'dataall'. All new users are added to this group, and its members have access to the whole Glue Catalog in the account.

Describe the solution you'd like

I would like to use groups in Quicksight the same way that I use teams in data.all. That means that when users start a Quicksight session, they should start it as a specific team, and in this session they should see only the data owned by or shared with that data.all team.
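
To make the idea concrete, here is a minimal sketch of what "start a session with a team" could look like on the Quicksight side. It is not the data.all implementation. The `dataall-<team>` naming convention and both function names are hypothetical, and `client` is assumed to behave like `boto3.client("quicksight")`, whose `create_group` and `create_group_membership` calls are used here.

```python
def quicksight_group_name(team: str) -> str:
    """Hypothetical naming convention: one Quicksight group per data.all team."""
    return f"dataall-{team}"


def start_team_session(client, account_id: str, team: str, username: str,
                       namespace: str = "default") -> str:
    """Ensure the team's Quicksight group exists, then add the user to it.

    `client` is assumed to behave like boto3.client("quicksight").
    Returns the name of the group the session is tied to.
    """
    group = quicksight_group_name(team)
    try:
        client.create_group(
            AwsAccountId=account_id,
            Namespace=namespace,
            GroupName=group,
            Description=f"data.all team {team}",
        )
    except client.exceptions.ResourceExistsException:
        # The group was already created by an earlier session of this team.
        pass
    # Error handling for users that are already members is left out here.
    client.create_group_membership(
        AwsAccountId=account_id,
        Namespace=namespace,
        GroupName=group,
        MemberName=username,
    )
    return group
```

Passing the client in as an argument keeps the sketch testable without AWS credentials. Restricting what each group can actually see is a separate step that this sketch does not cover.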

Drafted solution

PART 1: groups and users (~16 days)

PART 2: data access (~4 days)

------------------------------------------------

Resources needed:

------------------------------------------------

PART 3 (not included as part of this feature request): data access with data sources

After part 2, data sources and data sets created in Quicksight can still be shared by their creator with any user or group in Quicksight. This means that access to the data in those Quicksight resources is not managed through Lake Formation or through the data.all sharing process.

We can leave the responsibility of sharing the datasets and data sources to the creators, who will always be part of the data.all dataset owner group or requester groups. If we want Quicksight users to be data consumers only, we need to work with custom permissions and data-source sharing, which is out of scope for this issue. I will open another GitHub issue for discussion.

enr0c commented 7 months ago

I see high value in implementing these proposals, especially for an enterprise environment. The current 'one-size-fits-all' approach of managing access to data via Quicksight management controls is not workable at a larger scale.

dlpzx commented 6 months ago

Hi @enr0c, thanks for the response! We will try to prioritize this feature. Can you describe in more depth how you currently use Quicksight?

enr0c commented 6 months ago

We are not using data.all currently, but we are assessing whether we will use it in the future. One challenge we currently see is the usage of Quicksight: users who belong to an environment have access to all dashboards, regardless of whether the underlying data is accessible to them. Furthermore, one needs to manually select Athena sources for users per environment.

We foresee many thousands of QS users across 20-40 AWS accounts. Identity will be provided by an external IdP, leveraging SAML or OIDC. Data access control shall be implemented with Lake Formation.

Thank you for asking :)

enr0c commented 6 months ago

One further clarification. The ticket says: "That means that when users start a Quicksight session they should start a session with a team and in this session they see only the data owned or shared with that data.all team."

This is already an improvement, but it is not sufficient for our case.

Our requirement

We foresee a multi-account setup. In this ecosystem, multiple roles (delivered by an external IdP via SAML) can be assigned to a user. Permissions are "additive": the union of the permissions of all assigned roles shall be applied. I can be in a team, but that should not determine whether I have access to data. The data permissions should not be tied to any team I belong to…

For important components (e.g. Quicksight or Sagemaker Studio), this currently does not work in data.all.

Example with Sagemaker Studio, can be applied 1:1 to Quicksight

Then there are two tables (in Glue, for example) sitting in two different accounts:

Current behavior

Expected behavior for the two users:

| User | Role | ML Studio Access | Read goalkeepers table (Quicksight, sagemaker, athena, ...) | Read fouls table (Quicksight, sagemaker, athena, ...) |
| --- | --- | --- | --- | --- |
| User A | $\color{blue}{\textsf{playersRW}}$, $\color{green}{\textsf{sanctionsRW}}$ | yes | yes | yes |
| User B | $\color{blue}{\textsf{playersRW}}$ | yes | yes | no |
| User C | $\color{green}{\textsf{sanctionsRW}}$ | no | no | yes |
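
The "additive" rule behind the table above can be sketched in a few lines of Python. This is only an illustration of the expected behavior, not something data.all provides. The `ROLE_GRANTS` mapping is an assumption inferred from the rows of the table (in particular, that ML Studio access comes with `playersRW`), and the permission labels are made up for this example.

```python
# Assumed mapping from IdP role to grants, inferred from the expected-behavior table.
# The permission labels are hypothetical.
ROLE_GRANTS = {
    "playersRW": {"ml_studio", "read:goalkeepers"},
    "sanctionsRW": {"read:fouls"},
}


def effective_permissions(roles):
    """Return the union of the grants of every role assigned to a user."""
    granted = set()
    for role in roles:
        # Unknown roles contribute nothing instead of raising an error.
        granted |= ROLE_GRANTS.get(role, set())
    return granted


users = {
    "User A": ["playersRW", "sanctionsRW"],
    "User B": ["playersRW"],
    "User C": ["sanctionsRW"],
}

for user, roles in users.items():
    print(user, sorted(effective_permissions(roles)))
```

The point of the sketch is that access follows from the set of roles alone. No team membership appears anywhere in the calculation.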