data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0

Migrating from manual pivotRole to cdkRole makes table shares unrevokable #1053

Closed zsaltys closed 6 months ago

zsaltys commented 7 months ago

Describe the bug

If you created a share using a manually created pivotRole, the resource link for that table is owned by that role. That pivotRole is the only role that can drop the resource link without additional permissions. Any other role that wants to drop it needs to be granted explicit DROP permissions.

If you then switch data.all from the manually created pivot role to the pivot role created with CDK (as we did when migrating from 2.2 to 2.3) and attempt to revoke a table share that was created with the old role, the revoke fails with an error like this:

Failed to revoke S3 permissions to table gidmetadata from source account 123//us-east-1 with target account 123/us-east-1 due to: An error occurred (AccessDeniedException) when calling the DeleteTable operation: Insufficient Lake Formation permission(s): Required Drop on foo_table

This happens because the new pivotRole-cdk does not have drop permissions on the table because it is not the owner of that table.

My proposal would be to either run an upgrade script on all environments that grants the new role DROP permission on all existing resource links, or add a check for whether the new role has DROP permission and, if not, have it grant the permission to itself before dropping. In general I think it's a weird glitch that LF asks Lake Formation admins to have DROP before dropping. I think that should be reported as a bug.

How to Reproduce

  1. create a table share using manually created pivot role
  2. switch data.all to use cdk pivot role
  3. attempt to revoke the table share

Expected behavior

Revoking should work after switching from manually created pivotRole to automatically created one.

Your project

No response

Screenshots

No response

OS

Linux

Python version

3.9

AWS data.all version

2.3

Additional context

No response

dlpzx commented 7 months ago

Hi @zsaltys, thanks for opening an issue. This is actually an error that should be solved by the automated tasks that update the dataset stack and sync tables. Let us have a look and get back to you.

dlpzx commented 7 months ago

Hi @zsaltys, we confirm that this is a bug and we are working on a fix in https://github.com/data-dot-all/dataall/pull/1055. Instead of approaching it as a migration issue, we will add the grant as a robustness measure, so that even manual changes or interference from customers do not result in failed table revokes.

The PR adds the grant of DROP permissions as a preliminary check to the deletion of the resource link tables.
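
For readers following along, here is a minimal sketch of the "grant DROP, then delete" idea. It is not the code from the PR. The function name `drop_resource_link` and its parameters are made up for illustration. The clients are passed in and are assumed to be boto3 `lakeformation` and `glue` clients, and the caller is assumed to be a Lake Formation admin that is allowed to grant permissions on the resource link.

```python
def drop_resource_link(lf_client, glue_client, principal_arn, catalog_id, database, table):
    """Grant DROP on a resource link table to principal_arn, then delete it.

    Hypothetical helper: lf_client and glue_client are assumed to be boto3
    "lakeformation" and "glue" clients created by the caller.
    """
    # Step 1: make sure the role that performs the delete holds DROP on the
    # resource link, even if another role (e.g. the old pivotRole) owns it.
    lf_client.grant_permissions(
        CatalogId=catalog_id,
        Principal={"DataLakePrincipalIdentifier": principal_arn},
        Resource={
            "Table": {
                "CatalogId": catalog_id,
                "DatabaseName": database,
                "Name": table,
            }
        },
        Permissions=["DROP"],
    )
    # Step 2: delete the resource link table from the Glue catalog.
    glue_client.delete_table(CatalogId=catalog_id, DatabaseName=database, Name=table)
```

With boto3 the clients would be created with `boto3.client("lakeformation")` and `boto3.client("glue")` under the new pivot role's credentials, and `principal_arn` would be the ARN of that same role. Granting before every delete means the revoke does not depend on which role originally created the resource link.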

noah-paige commented 6 months ago

Implemented sharing guardrails as part of PR #1055. Closing this issue as complete; please re-open if there are any additional comments or concerns.