anushka-singh opened 1 year ago
Thanks for opening the issue for visibility @anushka-singh. We will work together on introducing this feature because, as you noticed, it includes some scenarios beyond the ones considered in the iam_policy_utils.
@dlpzx @SofiaSazonova @anmolsgandhi we ran into this issue with IAM policies when a role accumulated 38+ shares. It would be nice to prioritize this one.
I started an investigation into it and here are my findings so far...
When using data.all-generated IAM roles we already use all the policies (max 10 per role) for services, so we cannot add any more. We can assume that not all deployments will have all the features enabled, so those 10 might be fewer and leave some space, but I don't consider this a good solution.
When using consumption roles we don't have those service policies, so in theory there is space for adding more managed policies; keep in mind, though, that those roles are not data.all-managed and users might have already attached their own policies.
Overall, to tackle this issue we need to come up with a smarter (which probably means dynamic) policy manager that maximises capacity within the following limitations.
Hi @petrkalos ,
We are seeing the IAM policy limit getting exceeded on managed roles. I think an initial version would be helpful where, if a policy gets full, another policy is created and attached. data.all would somehow (maybe) have to remember the policy names it created and modify the policy which still has space left.
FYI, if needed the limit can be raised to 20 managed policies per role: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html#reference_iam-quotas-entity-length:~:text=Managed%20policies%20per%20role . Please correct me if I am wrong.
[WIP]
data.all creates managed roles for data.all constructs like Environment and Dataset. These managed roles contain managed and inline policies.
For environments and teams where roles are created by data.all and customer-managed service policies are attached, these service policies are split into chunks (e.g. 10 statements per managed policy). Usually these service policies are split into managed policies as shown below,
Now if there is some modification in the policy with index 1, that leaves the managed policy with some empty space which could be filled with statements. data.all would have to rearrange all those policies so that there is a tight fit (i.e. minimal empty space across all the policies).
Thus the problem is: given statements [s1, s2, s3, ...] with sizes (weights) [size1, size2, size3, ...], place the statements into policies (service-policy-0, service-policy-1, service-policy-2, ...) such that the empty space left is minimised.
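This rearrangement is essentially a bin-packing problem. A minimal sketch of a first-fit-decreasing heuristic (the function name and character-based sizing are illustrative, not data.all code):

```python
# First-fit decreasing: a common bin-packing heuristic for placing
# statements into policies while minimising wasted space.
# Sizes are in characters; the 6144-character IAM managed-policy
# document limit is the bin capacity.
POLICY_CHAR_LIMIT = 6144

def pack_statements(sizes: list[int], capacity: int = POLICY_CHAR_LIMIT) -> list[list[int]]:
    """Return bins (policies), each a list of statement sizes."""
    bins: list[list[int]] = []
    remaining: list[int] = []  # free space left in each bin
    for size in sorted(sizes, reverse=True):
        # place into the first bin with enough free space
        for i, free in enumerate(remaining):
            if size <= free:
                bins[i].append(size)
                remaining[i] -= size
                break
        else:
            # no bin fits: open a new one
            bins.append([size])
            remaining.append(capacity - size)
    return bins
```

First-fit decreasing is not optimal, but it is simple and keeps the number of policies close to the minimum.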
This problem exists because the service policies are diverse and dynamically added, which prevents us from efficiently splitting them into chunks the way functions like split_policy_with_resources_in_statements, etc. do. But this problem is not present for the managed policies used in dataset sharing. Here optimizations can be used to split the statements into chunks and create indexed managed policies.
The problem of managed policies getting full and resulting in a "limit exceeded" error most likely happens for roles involved in sharing. So this update can first be targeted at all the consumption roles and environment team roles (which are involved in sharing). Currently there exists just one managed policy created by data.all, with the naming convention <dataall-env>-<URI>-share-policy-<name of the role>.
For a bucket share, the following resources are added to the requestor's IAM role:

SID for S3 - `BucketStatementS3`

```python
s3_target_resources = [f'arn:aws:s3:::{self.bucket_name}', f'arn:aws:s3:::{self.bucket_name}/*']
```

SID for KMS - `BucketStatementKMS`

```python
kms_target_resources = [f'arn:aws:kms:{self.bucket_region}:{self.source_account_id}:key/{kms_key_id}']
```
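Putting the two snippets above together, a hedged sketch of how the bucket-share statements might be assembled (the Action lists and the standalone function shape are assumptions for illustration; data.all builds these inside its share managers):

```python
# Hedged sketch, not data.all's actual code: the two bucket-share
# statements with placeholder actions.
def bucket_share_statements(bucket_name, bucket_region, source_account_id, kms_key_id):
    s3_target_resources = [f'arn:aws:s3:::{bucket_name}', f'arn:aws:s3:::{bucket_name}/*']
    kms_target_resources = [f'arn:aws:kms:{bucket_region}:{source_account_id}:key/{kms_key_id}']
    return [
        {'Sid': 'BucketStatementS3', 'Effect': 'Allow',
         'Action': ['s3:GetObject', 's3:ListBucket'], 'Resource': s3_target_resources},
        {'Sid': 'BucketStatementKMS', 'Effect': 'Allow',
         'Action': ['kms:Decrypt'], 'Resource': kms_target_resources},
    ]
```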
For an access point (folder) share, the following resources are added to the requestor's IAM role:

SID - `AccessPointsStatementS3`

```python
s3_target_resources = [
    f'arn:aws:s3:::{self.bucket_name}',
    f'arn:aws:s3:::{self.bucket_name}/*',
    f'arn:aws:s3:{self.dataset_region}:{self.dataset_account_id}:accesspoint/{self.access_point_name}',
    f'arn:aws:s3:{self.dataset_region}:{self.dataset_account_id}:accesspoint/{self.access_point_name}/*',
]
```

SID - `AccessPointsStatementKMS`

```python
kms_target_resources = [f'arn:aws:kms:{self.dataset_region}:{self.dataset_account_id}:key/{kms_key_id}']
```
Here the requestor's IAM role is not updated.
When a share is getting processed (either a bucket or a folder share):
1. Check if the share has an old-format policy.
2. Check if the share already has a "share-policy"; if so, create a new one with an index, copy all statements from the old policy to the new one, and delete the old policy.
3. Read the statements from all the policies and split the statements similar to the function split_policy_with_resources_in_statements.
a. This function can be extended/modified so that, depending on whether it is a bucket share or an access point share, it picks up all the statements belonging to an SID pattern (AccessPointsStatement or BucketStatement).
b. Then the statements will again be split with split_policy_with_resources_in_statements.
c. Fetch all policy statements from all the managed policies.
d. Combine the statements from (b) and (c).
e. Then split the statements into chunks with split_policy_statements_in_chunks.
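A simplified sketch of the kind of size-based chunking that split_policy_statements_in_chunks performs (the greedy logic and limit handling here are illustrative, not the actual data.all implementation):

```python
import json

POLICY_CHAR_LIMIT = 6144  # IAM limit for a managed policy document

def split_statements_in_chunks(statements: list[dict],
                               limit: int = POLICY_CHAR_LIMIT) -> list[list[dict]]:
    """Greedily group statements so each chunk's serialized size stays
    under the managed-policy character limit."""
    chunks: list[list[dict]] = []
    current: list[dict] = []
    current_size = 0
    for stmt in statements:
        size = len(json.dumps(stmt, separators=(',', ':')))
        # start a new chunk when adding this statement would overflow
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(stmt)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk then maps 1:1 onto an indexed managed policy.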
4. Then add the statement chunks into the managed policies, like here:

```python
for index, chunk in enumerate(statements_chunks):
    policies.append(
        iam.ManagedPolicy(
            self.stack,
            f'PivotRolePolicy-{index + 1}',
            managed_policy_name=f'{self.env_resource_prefix}-pivot-role-cdk-policy-{self.region}-{index + 1}',
            statements=chunk,
        )
    )
```
When a share is revoked:
1. Check if the share has an old-format policy.
2. Check if the share already has a "share-policy"; if so, create a new one with an index, copy all statements from the old policy to the new one, and delete the old policy.
3. Fetch all the policy statements and remove the principals which have been revoked from the S3 and KMS SIDs. Get all the statements and then again call the split function split_policy_with_resources_in_statements.
4. Add the statement chunks into the managed policies as in Step 4 above.
5. Delete all the managed policies which are not used (empty).
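The cleanup step could be sketched as follows. The helper name is hypothetical and the IAM client is injected so it can be stubbed in tests; in production it would be boto3.client('iam') (get_policy, get_policy_version, detach_role_policy and delete_policy are real IAM APIs):

```python
def delete_empty_share_policies(iam_client, role_name: str,
                                policy_arns: list[str]) -> list[str]:
    """Detach and delete indexed share policies whose default version has
    no statements left. Returns the ARNs that were deleted."""
    deleted = []
    for arn in policy_arns:
        policy = iam_client.get_policy(PolicyArn=arn)['Policy']
        version = iam_client.get_policy_version(
            PolicyArn=arn, VersionId=policy['DefaultVersionId']
        )['PolicyVersion']
        # boto3 decodes the policy document into a dict
        if not version['Document'].get('Statement'):
            iam_client.detach_role_policy(RoleName=role_name, PolicyArn=arn)
            iam_client.delete_policy(PolicyArn=arn)
            deleted.append(arn)
    return deleted
```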
Since we already have a managed policy, this will be converted to an index-based managed policy with the name format <dataall-env>-<URI>-share-policy-<name of the role>-<**index** (e.g. 0, 1, 2, ...)>.
While attaching, all the policies matching the above format will be picked up and attached to the role.
If the role is not managed by data.all, then all managed policies created as part of the share will be exposed via the copy-to-clipboard button; when copied, they will be presented as an array of policies.
It seems the policies can be filtered based on PathPrefix: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam/client/list_policies.html
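A hedged sketch of such filtering using the list_policies paginator (the helper and prefix are assumptions; note that PathPrefix filters on the IAM path server-side, so matching on the policy-name convention still happens client-side):

```python
def list_share_policies(iam_client, name_prefix: str,
                        path_prefix: str = '/') -> list[dict]:
    """List customer-managed policies matching the indexed share-policy
    naming convention. In production iam_client is boto3.client('iam');
    Scope='Local' restricts results to customer-managed policies."""
    matched = []
    paginator = iam_client.get_paginator('list_policies')
    for page in paginator.paginate(Scope='Local', PathPrefix=path_prefix):
        matched.extend(p for p in page['Policies']
                       if p['PolicyName'].startswith(name_prefix))
    return matched
```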
This will involve migrating the existing policy into two managed policies like "<dataall-env>-<URI>-access-share-policy-<name of the role>-<index>" and "<dataall-env>-<URI>-bucket-share-policy-<name of the role>-<index>".
Pros:
Cons:
@dlpzx, any insights on this one?
Previous GH issues related to policy - https://github.com/data-dot-all/dataall/issues/922
Thanks for the detailed analysis @TejasRGitHub, I agree with the proposed solution. Let's tackle ConsumptionRoles (and groups, since it's very similar code) and think later about the rest. Also, the indexing solution is already utilised and proven to work fine.
Hi @TejasRGitHub, very clear design. Answering Q4: luckily for us we do not rely much on Redshift IAM permissions in the data.all integration. The only role that gets additional Redshift IAM permissions is the PivotRole; the rest will get access to Redshift directly via Redshift connections and Redshift permissions in the cluster (outside of IAM, managed by database admins).
Is your idea related to a problem? Please describe. The limit for managed IAM policies is 6144 characters. Based on a rough approximation of the number of characters in policy statements, each team can request access to around 35 buckets.
Describe the solution you'd like In v2.1.0 we added some utilities in dataall/base/utils/iam_policy_utils.py. We can reuse them to limit the resources per policy and to split policies.
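A back-of-the-envelope check of that approximation (the statement shape, actions, and 32-character bucket name below are assumptions for illustration):

```python
# Rough sanity check of the ~35-buckets figure: divide the 6144-character
# managed-policy limit by the size of one compact per-bucket statement.
import json

MANAGED_POLICY_CHAR_LIMIT = 6144

def bucket_statement(bucket_name: str) -> dict:
    # Illustrative statement shape; actions are placeholders.
    return {
        'Sid': 'BucketStatementS3',
        'Effect': 'Allow',
        'Action': ['s3:List*', 's3:GetObject'],
        'Resource': [f'arn:aws:s3:::{bucket_name}', f'arn:aws:s3:::{bucket_name}/*'],
    }

per_bucket = len(json.dumps(bucket_statement('dataall-example-bucket-eu-west-1'),
                            separators=(',', ':')))
print(MANAGED_POLICY_CHAR_LIMIT // per_bucket)  # lands in the few-dozen range
```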