data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0
221 stars 78 forks source link

Restrict Glue and KMS IAM permissions for pivot role #1189

Open mourya-33 opened 2 months ago

mourya-33 commented 2 months ago

Describe the bug

The auto created pivot role has the following unrestricted IAM permissions for Glue that are flagged by checkov scans. The permissions need to be restricted to the required resources only instead of '*'.

{ "Action": [ "glue:BatchCreatePartition", "glue:BatchDeletePartition", "glue:BatchDeleteTable", "glue:CreateDatabase", "glue:CreatePartition", "glue:CreateTable", "glue:DeleteDatabase", "glue:DeletePartition", "glue:DeleteTable", "glue:BatchGet", "glue:Get", "glue:List", "glue:SearchTables", "glue:UpdateDatabase", "glue:UpdatePartition", "glue:UpdateTable", "glue:TagResource", "glue:DeleteResourcePolicy", "glue:PutResourcePolicy" ], "Effect": "Allow", "Resource": "", "Sid": "GlueCatalog" },

How to Reproduce

scan the auto created pivot role cloudformation template file or the custom pivot role cloudformation template file with checkov. The checkov scan will fail with the following message.

Check: CKV_AWS_111: "Ensure IAM policies does not allow write access without constraints" FAILED for resource: AWS::IAM::ManagedPolicy.PivotRolePolicy{Hash}

Expected behavior

The Glue IAM permissions on pivot role must be restricted to the required resources only. Once a glue resource (table/dataset, db etc) is added/removed in dataall, the pivot role should be updated to add/remove the glue resource as necessary.

Your project

No response

Screenshots

No response

OS

Mac

Python version

3.10

AWS data.all version

2.3

Additional context

The issue relates to https://github.com/data-dot-all/dataall/issues/875.

zsaltys commented 2 months ago

@dlpzx @noah-paige these should managed the same way like bucket permissions .. data.all should only have access to databases that have been imported and nothing else.

dlpzx commented 2 months ago

Hi @zsaltys @mourya-33 this case is slightly different than S3 bucket IAM policies. In data.all we assume that the Glue databases/tables are governed by LakeFormation, in such case even with, for example "glue:DeleteDatabase" IAM permissions, the pivot role is not able to delete the database if it is lacking those LakeFormation permissions.

All data.all created or imported datasets S3 locations are registered in LakeFormation, so IAM permissions would not be restricting anything on those S3 locations Glue resources. We are trying to be frugal with the IAM permissions of our IAM roles to stay away from service quotas without compromising least-privilege.

The only case where this feature adds value is if there are S3 locations with Glue resources that are NOT governed by LakeFormation in the AWS account and you want to restrict access. If that is the case, we can implement the above or explore newer features of Lake Formation such as hybrid access.

If your case is the last one, let's see how we can implement it in the most efficient way