Closed chamiles closed 3 months ago
Hi @chamiles, for column-level sharing we opened an issue a while ago, but there was not much interest (#84). Instead we opted to implement tag-based access control using Lake Formation, which also allows column-level granularity. That feature is currently in progress and is being developed in issue #186.
For row/cell-level filtering we can work together on implementing this sort of data filter in shared tables.
Bumping this up. This issue will be used to track the enhancement and extension of the current data.all sharing functionality to include column-level and row-level access control in Lake Formation. This has been one of the top requests from customers based on our conversations. It will be included as a feature enhancement for completion in v2.7.
@anmolsgandhi I think it is important we get the UI right to make it clear that this can only be used for table sharing, and that if you grant S3 access you're effectively breaking this security measure because the user will have access to the entire bucket and dataset behind the table. Let's discuss this.
Hi @zsaltys - I am starting to work on design for this and can start sharing some mock ups of the UI here as I continue to progress this week
I agree some type of warning or call out from FE would be helpful to ensure user understands what they are doing when sharing data objects - but I think we can add warnings more so from the bucket sharing side because sharing an entire bucket is akin to sharing ALL folders and tables in data.all
I will post additional design description on this issue today
User on the Dataset Owner Team navigates to a particular S3 Dataset Table
The table has a Data Filters tab, which only the Dataset Owners can view
In the Data Filters tab a User can:
  Create a new Data Filter
    Select the Filter Type: Column or Row
    Column Filters --> User will be able to select a subset of columns to include in the filter (all columns not selected are excluded)
    Row Filters --> User will be able to create a row expression to filter by row values
      Row expressions take the form columnName + operator (e.g. =, <, etc.) + value
      Available operators, as three sets:
        =, !=, >, >=, <, <=, IS NULL, NOT NULL, IN, NOT IN
        =, !=, LIKE, NOT LIKE, IS NULL, NOT NULL, IN, NOT IN
        =, !=, IS NULL, NOT NULL
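To make the row expression format above a bit more concrete, here is a minimal sketch of how columnName + operator + value could be assembled into a filter expression and wrapped into the TableData payload that boto3's Lake Formation `create_data_cells_filter` expects. The function names, the "numeric" / "string" / "boolean" labels on the three operator sets, and the mapping of NOT NULL to `IS NOT NULL` are assumptions for illustration, not the data.all implementation.

```python
from typing import Any, Dict, Optional, Sequence

# Operator sets copied from the lists above. The type labels are hypothetical:
# the design lists three sets without naming the column types they apply to.
OPERATORS = {
    "numeric": {"=", "!=", ">", ">=", "<", "<=", "IS NULL", "NOT NULL", "IN", "NOT IN"},
    "string": {"=", "!=", "LIKE", "NOT LIKE", "IS NULL", "NOT NULL", "IN", "NOT IN"},
    "boolean": {"=", "!=", "IS NULL", "NOT NULL"},
}


def _literal(value: Any) -> str:
    """Render a Python value as a SQL-style literal."""
    if isinstance(value, bool):  # check bool before int, since bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def row_expression(column: str, operator: str, value: Any = None, kind: str = "string") -> str:
    """Build 'columnName operator value' for one row filter condition."""
    if operator not in OPERATORS[kind]:
        raise ValueError(f"operator {operator!r} not allowed for {kind} columns")
    if operator == "IS NULL":
        return f"{column} IS NULL"
    if operator == "NOT NULL":  # assumption: the UI label maps to IS NOT NULL
        return f"{column} IS NOT NULL"
    if operator in ("IN", "NOT IN"):
        items = ", ".join(_literal(v) for v in value)
        return f"{column} {operator} ({items})"
    return f"{column} {operator} {_literal(value)}"


def data_cells_filter(
    catalog_id: str,
    database: str,
    table: str,
    name: str,
    columns: Optional[Sequence[str]] = None,
    row_expr: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the TableData dict for lakeformation.create_data_cells_filter."""
    table_data: Dict[str, Any] = {
        "TableCatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "Name": name,
    }
    if columns:
        table_data["ColumnNames"] = list(columns)  # column filter: included columns only
    else:
        table_data["ColumnWildcard"] = {}  # row filter: all columns
    if row_expr:
        table_data["RowFilter"] = {"FilterExpression": row_expr}
    else:
        table_data["RowFilter"] = {"AllRowsWildcard": {}}  # column filter: all rows
    return table_data
```

The resulting dict would be passed as `boto3.client("lakeformation").create_data_cells_filter(TableData=...)`. Keeping the payload builder separate from the API call makes it easy to unit test without AWS credentials.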
https://github.com/user-attachments/assets/3dae0bd2-916f-4f26-bce1-67af5ad4d1da
https://github.com/user-attachments/assets/b6088731-527e-4c8e-a403-faf2d5dd3273
https://github.com/user-attachments/assets/9414c99f-43c7-49b2-8dba-266d62a197e1
Frontend
Backend
Data Filter CRUD (In Progress)
[x] Create Data Filter API
[x] Delete Data Filter API + Throw Exception if Existing Shares
[x] List Data Filters API
[x] Attach Permissions on New Table Creation
[x] Delete All Filters on Table Delete
[x] Throw Exception on Delete Filter if Existing Shares
[x] Remove Permission on Table Deletes
DB (In Progress)
[x] Create Data Filter Table
[x] Add Data Filter Column to Share Items
Dataset Shares
[x] Add Data Filters to Share Item Record (Tables only)
[x] S3 Dataset Table Share Processor Updates (View Comment Below for more Detail!)
Additional Work
Currently when we share a table cross account we follow the following steps:
0) Check if source account details are properly initialized and initialize the Glue and LF clients
1) Grant ALL permissions to pivotRole for source database in source account
2) Create the shared database in target account if it doesn't exist
3) Grant permissions to pivotRole and principals to "shared" database
4) For each shared table:
   a) Update its status to SHARE_IN_PROGRESS with Action Start
   b) Check if the table exists in the Glue catalog; if not, raise an error and flag the share item status as failed
   c) If it is a cross-account share:
      c.1) Revoke iamallowedgroups permissions from the table
      c.2) Grant target account permissions to the original table -> create RAM invitation
      c.3) Accept the pending RAM invitation
   d) Create a resource link for the table in the target account
   e) If it is a cross-account share: grant permission to principals on the RAM-shared table in the target account
   f) Grant permission to principals on the resource link table
   g) Update the share item status to SHARE_SUCCESSFUL with Action Success
Most importantly - we reuse the shared DB and the resource link table in the target account and then add additional grants for new principals who get approved access to shared data
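As a reading aid, the step ordering above can be sketched as a small planner that only lists the steps in order, without calling Glue, Lake Formation or RAM. The function name, the step labels and the `table_exists` callback are hypothetical and do not mirror the actual share processor code.

```python
from typing import Callable, List, Sequence


def plan_table_share(
    tables: Sequence[str],
    cross_account: bool,
    table_exists: Callable[[str], bool],
) -> List[str]:
    """Return the ordered steps (0 to 4g above) for sharing the given tables."""
    steps = [
        "init_clients",                                  # step 0
        "grant_pivot_role_all_on_source_db",             # step 1
        "ensure_shared_db_in_target",                    # step 2
        "grant_pivot_role_and_principals_on_shared_db",  # step 3
    ]
    for table in tables:                                 # step 4
        steps.append(f"{table}: SHARE_IN_PROGRESS")      # 4a
        if not table_exists(table):                      # 4b
            steps.append(f"{table}: status=failed")
            continue
        if cross_account:                                # 4c
            steps.append(f"{table}: revoke_iamallowedgroups")
            steps.append(f"{table}: grant_target_account_on_source_table")
            steps.append(f"{table}: accept_ram_invitation")
        steps.append(f"{table}: create_resource_link")   # 4d
        if cross_account:                                # 4e
            steps.append(f"{table}: grant_principals_on_ram_shared_table")
        steps.append(f"{table}: grant_principals_on_resource_link")  # 4f
        steps.append(f"{table}: SHARE_SUCCESSFUL")       # 4g
    return steps
```

For example, `plan_table_share(["tablex"], True, lambda t: True)` lists the RAM steps, while the same call with `cross_account=False` skips 4c and 4e.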
When it comes to data filters - herein lies an issue because:
(Option 1 - NOT VIABLE) Sharing the table w/ assigned filters to the external account
(Option 2 - NOT VIABLE) Attempt to create Filters in Target account after resource link is created
(Option 3) Sharing the table w/ assigned filters directly to the Foreign IAM Principal
{table_name}_{filterUris}
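One possible reading of the {table_name}_{filterUris} pattern is that every distinct combination of filters gets its own name in the target account. A minimal sketch under that assumption follows; the helper name, the "_" separator between URIs and the sorting are all hypothetical choices, not something decided in this thread.

```python
from typing import Sequence


def resource_link_name(table_name: str, filter_uris: Sequence[str]) -> str:
    """Derive a name following the {table_name}_{filterUris} pattern."""
    if not filter_uris:
        return table_name  # no filters: keep the plain table name
    # Sort so the same set of filters always yields the same name,
    # regardless of the order in which the filters were selected.
    suffix = "_".join(sorted(filter_uris))
    return f"{table_name}_{suffix}"
```

With this scheme, two shares of the same table with different filter sets end up with different names, while repeating a share with the same filters reuses the existing name.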
If following along with Option 3 above - adding additional details here
Cross-account grants made using the named resource method are compatible across different versions. Even if the grantor account is using an older version (version 1 or 2) and the recipient account is using a newer version (version 3 or higher), the cross-account access functionality operates seamlessly without any compatibility issues or errors.
To share resources directly with IAM principals in another account, only the grantor needs to use version 3.
https://docs.aws.amazon.com/lake-formation/latest/dg/optimize-ram.html
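Since only the grantor needs version 3, a pre-check on the source account could look roughly like the sketch below. It inspects the `DataLakeSettings` dict returned by boto3's `get_data_lake_settings` and reads the `CROSS_ACCOUNT_VERSION` key from `Parameters`. The function name is hypothetical, and treating a missing key as version 1 is an assumption.

```python
from typing import Any, Dict


def supports_direct_principal_sharing(data_lake_settings: Dict[str, Any]) -> bool:
    """True if the grantor account's cross-account version is 3 or higher."""
    params = data_lake_settings.get("Parameters") or {}
    # Assumption: when the key is absent, treat the account as version 1.
    version = int(params.get("CROSS_ACCOUNT_VERSION", "1"))
    return version >= 3
```

In practice the input would come from `boto3.client("lakeformation").get_data_lake_settings()["DataLakeSettings"]` in the source account, and the share could fail early with a clear message when the check returns False.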
In the proposed option - we do the same DB steps as before which is
But instead of
We do
Originally if TableX was shared to same cross account to GroupA and GroupB we would have
Now if TableX was shared to same cross account to GroupA (w/ Filter1) and GroupB (w/ Filter2)
Could not grant principal QS_GROUP_ARN permissions ['DESCRIBE', 'SELECT'] and permissions with grant options None to {'TableWithColumns': {'DatabaseName': 'DB_NAME', 'Name': 'TABLE_NAME', 'ColumnWildcard': {}, 'CatalogId': 'SOURCE_ACCOUNT'}} due to: An error occurred (InvalidInputException) when calling the GrantPermissions operation: Cross account requests are only allowed for AWS Accounts, Organizations, IAM Principals and All IAMPrincipals
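The error above lists the principal types that Lake Formation accepts for cross-account grants. A rough pre-check based on that list could reject principals such as QuickSight group ARNs before calling GrantPermissions. The function name and the regular expressions below are assumptions written for illustration and may not cover every valid identifier.

```python
import re

# Patterns for the principal types named in the error message.
_PATTERNS = [
    re.compile(r"^\d{12}$"),                                  # AWS account ID
    re.compile(r"^\d{12}:IAMPrincipals$"),                    # all IAM principals in an account
    re.compile(r"^arn:aws[\w-]*:iam::\d{12}:(role|user)/.+$"),  # IAM role or user
    re.compile(r"^arn:aws[\w-]*:organizations::\d{12}:(organization|ou)/.+$"),  # org or OU
]


def is_cross_account_grantable(principal: str) -> bool:
    """Return True if the principal looks like a type allowed cross-account."""
    return any(pattern.match(principal) for pattern in _PATTERNS)
```

A QuickSight group ARN such as `arn:aws:quicksight:us-east-1:111122223333:group/default/analysts` fails this check, which matches the InvalidInputException seen above.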
Hi @noah-paige, I love the UI views! Here are some remarks on the design and the table findings:
Findings from testing:
Assigning 2 Column Filters
Assigning 1 Row and 1 Column Filter
Assigning 2 Row Filters
The customer has enabled granular (row, column, cell) sharing using Lake Formation and would like to see that capability in the data.all sharing request and in the approval area, so that data owners can share the same datasets with restricted access to columns without having to create a duplicate dataset and duplicate data.
The customer's current solution is to use data filters and type in a manual expression. Opening a text area to add an expression may be the simplest option, but the best user experience would show column names as visual checkboxes, with the ability to enter expressions for rows and cells.
https://docs.aws.amazon.com/lake-formation/latest/dg/data-filters-about.html