data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0

Revoking shared access from a specific table does not block data access in QuickSight #358

Closed diandramelo closed 1 year ago

diandramelo commented 1 year ago

Hi there,

I was recently testing author rights on QuickSight, and it is great that data.all is able to coordinate Lake Formation data access for QuickSight datasets created through Athena!

The only thing I noticed is that when a user creates a QuickSight dataset from a table in the data.all catalog and we later revoke access to that specific table, the user is still able to see the data in QuickSight (even in direct query mode) and can also build analyses from this data (which in theory is no longer accessible).

This is what I tested with a user:

  1. Create a dataset with 2 tables: A and B;
  2. Grant data access to the entire dataset for this user (tables A and B included);
  3. From the user account (QuickSight author), create a QS dataset in direct query mode with table A;
  4. From administrator account, revoke access rights from table A only (leaving access to table B in the same dataset);
  5. From the user account, access the QS dataset of table A (with access revoked): the data was refreshed and the user could create analyses / dashboards from it;
  6. From the user account, try to create a new dataset from the same table using the Athena source: the table was no longer listed among the Glue Data Catalog choices.

Could you tell me if it is possible to also block access to existing QS datasets once the table rights have been revoked?

Thanks in advance!

dlpzx commented 1 year ago

Hi @diandramelo, thank you for reporting! I will have a look and try to reproduce because this is definitely something that is not intended. Bests,

dlpzx commented 1 year ago

I have checked the code, and in the revoke access task the data.all share manager revokes the QuickSight group's access in Lake Formation. That is why in your point 6 you cannot see the table. I am going to run some tests and contact QuickSight experts to understand how QuickSight handles permissions for already existing QS datasets.
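For readers unfamiliar with this kind of revoke, here is a minimal sketch of what revoking `SELECT` on one table from a QuickSight group could look like through the Lake Formation API. It is not data.all's share manager code. The helper names, the account ID, and the group name are hypothetical, and it assumes the table lives in the catalog of the same account. `lf_client` is expected to be a `boto3.client("lakeformation")`.

```python
def quicksight_group_arn(region, account_id, group, namespace="default"):
    """Build the ARN of a QuickSight group (hypothetical helper)."""
    return f"arn:aws:quicksight:{region}:{account_id}:group/{namespace}/{group}"


def build_revoke_request(account_id, database, table, principal_arn,
                         permissions=("SELECT",)):
    """Assemble keyword arguments for Lake Formation `revoke_permissions`.

    Assumption: the table is in the catalog of `account_id`.
    """
    return {
        "CatalogId": account_id,
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "Table": {
                "CatalogId": account_id,
                "DatabaseName": database,
                "Name": table,
            }
        },
        "Permissions": list(permissions),
    }


def revoke_table_access(lf_client, **kwargs):
    """Send the revoke call and return the request that was sent."""
    request = build_revoke_request(**kwargs)
    lf_client.revoke_permissions(**request)
    return request
```

Keeping the request builder separate from the call makes it possible to inspect or log exactly which principal and table are affected before anything is sent to AWS.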

dlpzx commented 1 year ago

Hi @diandramelo, I did the following:

  1. share a dataset with 2 tables "raw" and "raw1" from envA to envB
  2. user from envB creates a QS dataset with table "raw" using direct query mode and creates an analysis with some visuals. I can see the data.
  3. user from envA revokes access in data.all to table "raw" for the user and team in envB
  4. user from envB cannot access the revoked table and gets the following (first pic from the dataset and second one from the analysis)
image image

For me, the underlying Lake Formation permissions prevent access to data from revoked tables in QuickSight. Maybe you can attach some screenshots or check whether I am missing something in my test, because right now I am not able to reproduce your issue.
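One way to compare the two test setups is to list which principals still hold Lake Formation permissions on the revoked table. The sketch below is an illustration only: the function name is hypothetical, it assumes the table is in the catalog of the given account, and `lf_client` is expected to be a `boto3.client("lakeformation")`. It uses the `list_permissions` call and follows `NextToken` pagination.

```python
def principals_with_table_access(lf_client, account_id, database, table):
    """Return {principal ARN: set of permissions} for one table.

    Assumption: the table is in the catalog of `account_id`.
    """
    kwargs = {
        "CatalogId": account_id,
        "ResourceType": "TABLE",
        "Resource": {
            "Table": {
                "CatalogId": account_id,
                "DatabaseName": database,
                "Name": table,
            }
        },
    }
    found = {}
    while True:
        page = lf_client.list_permissions(**kwargs)
        for entry in page.get("PrincipalResourcePermissions", []):
            arn = entry["Principal"]["DataLakePrincipalIdentifier"]
            found.setdefault(arn, set()).update(entry.get("Permissions", []))
        token = page.get("NextToken")
        if not token:
            return found
        # Ask for the next page of results.
        kwargs["NextToken"] = token
```

If a QuickSight group ARN still appears in the result after the revoke, the group kept its access through some other grant, which would explain why an existing QS dataset keeps working.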

diandramelo commented 1 year ago

Hello @dlpzx,

Thank you for your help!

It's weird, because in my case the same test still gives me access to the data in the revoked table... But in my case I stay within the same environment; we only did the sharing procedure at the Team level.

Ok, I'll share with you some screenshots (I hope it is clear enough):

  1. In the same dataset we have two tables: dossier and _dossiercomposition, created by the administrator account image

  2. A user from the marketing group requested access to the entire dataset, and was able to create a QuickSight dataset in direct query mode from the table dossier: image

  3. The administrator user removed the marketing team access to the table dossier, and in the marketing user account we see that the user has access only to the table _dossiercomposition: image

  4. Even after this step, when the marketing user opens their dossier QuickSight dataset, they can still see the data in direct query mode as in step 2 (I wouldn't be surprised if they could still see it from SPICE, since the data would be stored there, but in that case the user should not be able to refresh the dataset because they would have no authorisation to query this table).

P.S.: @dlpzx I would like to apologize if this type of question is not allowed by the community, but since I could not find your personal contact on your GitHub page, I was wondering if you would be available to help the BI team in my company clarify some data.all functions and see whether we would be able to work as we expect in terms of data sharing. We are trying to explore the data.all code with our developer, but we still have some issues to resolve before diving into data.all-based data management. I am available through email / LinkedIn (in my GitHub profile) in case you are interested!

dlpzx commented 1 year ago

Hi @diandramelo, thanks for the detailed answer :) I think I know your issue; it is related to #86. Basically, for each AWS account data.all creates one single QuickSight group, so all data in the environment can be accessed by all teams in that environment. This is mostly due to the original implementation of data.all, which was AWS-account based. Mapping QuickSight groups to data.all teams has long been on our roadmap. If clients and users show interest we will prioritize this task in our plans, so please feel free to describe your situation and share your thoughts on how it should work. We will take your opinion into account.
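The effect of a single account-wide QuickSight group can be illustrated with a small toy model. This is a simplification for discussion, not data.all code: the `Environment` class and its methods are hypothetical, and it assumes the shared group can read every table that any team in the environment is entitled to.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Environment:
    """Toy model of one data.all environment (one AWS account)."""
    # team name -> tables that team may read (owned or shared)
    team_tables: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, team: str, table: str) -> None:
        self.team_tables.setdefault(team, set()).add(table)

    def revoke(self, team: str, table: str) -> None:
        self.team_tables.get(team, set()).discard(table)

    def visible_with_single_group(self) -> Set[str]:
        """One QuickSight group per account: union over all teams."""
        return set().union(*self.team_tables.values())

    def visible_with_team_group(self, team: str) -> Set[str]:
        """One QuickSight group per team: only that team's tables."""
        return set(self.team_tables.get(team, set()))


env = Environment()
for table in ("dossier", "_dossiercomposition"):
    env.grant("admin", table)      # owning team keeps both tables
    env.grant("marketing", table)  # marketing gets the whole dataset
env.revoke("marketing", "dossier")

print(sorted(env.visible_with_single_group()))
# ['_dossiercomposition', 'dossier']
print(sorted(env.visible_with_team_group("marketing")))
# ['_dossiercomposition']
```

Under this model the revoke only changes what marketing sees if QuickSight groups are mapped per team. With one group per account, the owning team's entitlement keeps `dossier` readable for everyone in the group.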

dlpzx commented 1 year ago

Closing due to inactivity. Re-open if needed.