data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0
228 stars 82 forks source link

Pivot Role nested stack failing due to already present Lake formation service linked role #993

Closed TejasRGitHub closed 7 months ago

TejasRGitHub commented 7 months ago

Describe the bug

When migrating or bootstrapping environment with auto created pivot role ( dataallPivotRole-cdk ), the Pivot Role nested stack fails due to an already present lake formation service-linked role ( AWSServiceRoleForLakeFormationDataAccess ). This role is created when a S3 bucket is registered in Lake formation ( https://docs.aws.amazon.com/lake-formation/latest/dg/service-linked-roles.html ).

When boostrapping an AWS account into data.all it is certainly possible that this role is being used already.

How to Reproduce

Register a S3 bucket and mention this role for registering the S3 bucket into Lake formation Create a data.all environment with that AWS account with auto create pivot role setting ON. The environment creation should fail due to dataallPivotRole stack not able to create the AWSServiceRoleForLakeFormationDataAccess role as it is already present.

Expected behavior

The environment bootstrap should complete properly with the dataallPivotRole-cdk properly created.

Your project

No response

Screenshots

No response

OS

Mac

Python version

3.9

AWS data.all version

2.2

Additional context

No response

noah-paige commented 7 months ago

I don't think we need to create this role as part of pivot role auto create because we use dataset role to register LF locations now. We could potentially remove this piece of code but first need to investigate if any unforeseen errors could come up with with pre-existing environments using this role in data.all or with setting admins in lakeformation using this role.

If we need to still create role and just check if it exists before can use .from_role_arn() (or .from_role_name()) similar to how we do with importing pivot role in env stack:

self.pivot_role = iam.Role.from_role_arn(
     self,
     f'PivotRole{self._environment.environmentUri}',
     pivot_role_stack.pivot_role.role_arn,
)

FOLLOW UP - from discussion with @dlpzx When we create the data.all Environment, we register the pivot role as a data lake administrator, without the existence of the LF service role this step would fail. At the time data.all was first developed we had to create this LF service role. Alternatively, customers could manually open Lake Formation in the AWS Console and the service role was created automatically. I think we should:

TejasRGitHub commented 7 months ago

@noah-paige , I wasn't able to find any problems with dataset import, dataset sharing without having the LF servicelinked role.

As pointed out by @dlpzx , removing the role is tricky and can cause problems. As this role is AWS managed service linked role and any aws account would have this created and they could be using this role, for existing data.all environments, ( which use the auto created pivot role ) and have this role in the cdk generated CF templates , if this role is removed, then the role will be deleted from the AWS accounts. This could cause issue with environment which use this role to do any operations on S3 buckets through lake formation.

In order to successfully remove this role, without letting this role get deleted, one way would be to apply the removal policy with RemovalPolicy.RETAIN.

Ways of doing this -

This service role deletion could be done over versions of data.all. For example, in v2.3, we add the code to apply the removal policy. Then in v2.4 , we could change the code to not include this role in pivot role stack. This will remove the role from CF of the pivot role in the update but won't delete the actual role.

Other options - Somehow access the CF template which is deployed for the pivot role and check if the role is present in the Pivot Role CF template. If the role is present, then add the RemovalPolicy.RETAIN policy. In the next update ( whenever the update-stacker runs ) , check if the CF template has the role and also has the removal policy, if yes then remove the role.

class PivotRole(NestedStack):
.. 
.. 
     template = CFTemplate.getTemplate(id: "")
     if template contains LakeFormationSLR and not RemovalPolicy statment:
        self.lf_service_role = iam.CfnServiceLinkedRole(
                self, 'LakeFormationSLR', aws_service_name='lakeformation.amazonaws.com'
        )
        self.lf_service_role.apply_removal_policy(RemovalPolicy.RETAIN)

I don't know if the above is possible, but would be a potential option if that works.

TejasRGitHub commented 7 months ago

PR for fix - https://github.com/data-dot-all/dataall/pull/999

noah-paige commented 7 months ago

Would be good to add a warning that the LF Service Role will be released on the upcomming release

Can re-use this issue to track maintenance UI view https://github.com/data-dot-all/dataall/issues/998 - out of scope for this PR but a good to have for upcoming release to include Notice