Community::SSO::AssignmentGroup

OlafConijn commented 3 years ago

The AWS::SSO::Assignment resource requires users to declare a resource for each Principal-Account-PermissionSet combination. This is not very practical as it might lead to a large number of CloudFormation resources.

Proposal to create a resource to do all assignments per principl in a single resource:

Type: AWS::SSO::PrincipalAssignments
Properties: 
  InstanceArn: String 
  PermissionSets: List<String> # can be either a list ARNs or logicalId's
  PrincipalName: String # name of the principal instead of the Id
  PrincipalType: String # GROUP | USER
  TargetAccountIds: List<String>
  TargetOuIds: List<String>

Based on the InstanceArn, PrincipalType and PrincipalName the PrincipalArn can be created:

InstanceArn is used to look up the IdentityStoreId (aws sso-admin list-instances)
based on PrincipalType either list-groups or list-users is invoked to look for a principal of name PrincipalName (aws identitystore list-users --identity-store-id xxxxxxxxx

PermissionSets can be created by !Ref to AWS::SSO::PermissionSet or !GetAtt resource.PermissionSetArn (but why would you want to do that?)

For now i think the TargetAccountIds / TargetOuIds seems like a fair startingpoint. Later maybe create an OrganizationBinding type?

benkehoe commented 3 years ago

I'd suggest attempting to collaborate with the AWS SSO folks to try to align this so it can be contributed to the official repository, rather than intending to have it be a separate community resource indefinitely.

I've written code for this purpose (under the Apache 2.0 license), which may be of use if you like. I wrote it as a client-side generator for the official AWS::SSO::Assignment resources, rather than as a resource provider, for the above reason. https://github.com/benkehoe/aws-sso-cfn-helper

My ideal interface for this doesn't start with the opinion that there's only one principal, and looks more like AWS::SSO::Assignment's fields.

Type: AWS::SSO::AssignmentSet
Properties:
  InstanceArn: <arn>
  Principals:
  - PrincipalType: GROUP/USER
    PrincipalIds: <list of strings>
  PermissionSetArns: <list of strings>
  Targets:
  - TargetType: AWS_ACCOUNT/AWS_OU
    TargetIds: <list of strings>

I actually wouldn't go with PrincipalName here, unless you are persisting the PrincipalId alongside it. The thing is, the name->principal id mapping can change, and that introduces a security issue as someone can manipulate principals to get a policy attached to something you're not expecting. This is why IAM translates principal ARNs into unique IDs for those principals, so that if the principal is deleted and another one of the same name is created, they are distinct entities.

You could have PrincipalName as a create only field, I guess.

OlafConijn commented 3 years ago

I've written code for this purpose (under the Apache 2.0 license), which may be of use if you like

Yeah! i did have a look and it helped me tremendously in understanding how the identitystore/sso-admin api's work. thanks for sharing this.

My ideal interface for this doesn't start with the opinion that there's only one principal

I thought about this too. The issue i believe you will run into is that it becomes hard to understand/track which CFN resource 'owns' what SSO resource. You might get into the situation that multiple 'AssignmentSets' modify the same resources in AWS.

Arguably this is also the case if you have multiple types AWS:: and Community:: managing the same resource, but thinking each specific resource type it makes sense to keep them resource oriented. Practically: the primaryIdenitifier could be based on principalId and CFN already manages most of this for you.

About the PrincipalName, i ran into another cornercase: principals synched through a 3rd party Idp cant be listed through the aws identitystore apis. There then are ways to figure out the PrincipalId regardless and i hope this will also work (- yet to test it though).

Your cornercase indeed also is valid. AWS SSO is broken, will indeed try to reach out to the team over at AWS and share the solution we built here if we do. Thanks!

benkehoe commented 3 years ago

The issue i believe you will run into is that it becomes hard to understand/track which CFN resource 'owns' what SSO resource.

I think that's true in any case where the CloudFormation resource corresponds to multiple API-level resources. I may have the same principal in multiple of these resources, maybe even in different stacks, managed by different folks. My thinking is, to follow the CloudFormation resource provider contract, it will refuse to create any assignment set where any of the assignments already exist.

OlafConijn commented 3 years ago

Must say i sometimes find it difficult to put real value on the contract. Every once in while @eduardomourar sets me straight and talkes some sense into me. Some of the value lies in expectations users have, some in how CFN can be used to do most of the hard work and some might lie in the future roadmap of the CFN team... What you describe would indeed be part of my expectations too, but In your resource design (any cfn resource could be a group of any association) I think it wont be possible to e.g. implement a Read handler (used for something like drift detection). more info here: https://docs.aws.amazon.com/cloudformation-cli/latest/userguide/resource-type-test-contract.html.

May i ask a different question though... You describe your ideal interface is something all combinations are possible, but i wonder how would you use this practically? I assume that you won't create a single cfn resource in which all principals get all roles in all accounts. If someone would try that out of the blue my advice would be to create a single group (but that is me reinforcing my own thoughts on my own design).

If you have different cfn resources, would you organize these around either account, permission set or principal? And what if you would have to pick one.... If you wouldn't? would not picking one at the expense of drift detection be a fair tradeoff?

I think it is fine for APIs to be a little bit opinionated because this typically also helps developers structuring their solution. I think that if in some cases the design indeed leads to code duplication that just also really is part of the contract we all have with CFN 😂 maybe solve this by parameters or CDK.

I did already adopt a couple of other things from your design though! thanks! Also really appreciate the discussion i think these are interesting and sometimes difficult tradeoffs to make.

benkehoe commented 3 years ago

For me, there are often groups of permissions that are given. A couple of teams get RO/RW access on a set of accounts. Sure, I could choose one of those lists to split and do one resource per, but logically they are all related.

As for a read handler and drift detection, again I don't think it's solved by having one of the three items with N=1. For principals with a lot of access, I wouldn't expect to put all of their permissions in one resource. I'd have separate resources for the various different logical groups of privilege they have (admin access, access like standard users, etc.). So I don't think a resource that assumes ownership of all the principal's assignments makes sense either.

For drift detection, I would think it would only deal with subsets of the assignments it defines. So it would detect if assignments defined in the resource are missing, but not detect if other assignments are defined. I think this would be true regardless if it was one principal or multiple.

I am coming around a bit to the idea, though, in particular for what the physical resource id should be (for an assignment, which doesn't have its own id, it's just the concatenation of the constituent ids). I was thinking the hash of all the constituent ids, but if one of the types has N=1, it could just be that.

A radical alternative: what if this was a CloudFormation Macro instead, decomposing into actual Assignment resources?

OlafConijn commented 3 years ago

yes you are right: Started to use the type myself and indeed realized that taking ownership of all PrincipalId's is too rigid. doesn't work. Fixed that by adding a UUID to the primaryIdentifier.

The question really is about:

Conforming to the contract tests: e.g: A create handler MUST return FAILED with an AlreadyExists error code if the resource already existed prior to the create request..
Being able to implement the Read handler (I assume at some point used for drift detection). Note that the Read handler only gets the primaryIdentifier and additionalIdentifiers but not the desiredResourceModel.

Both are possible if you fixate 1 (or 2) of 3 dimensions (1= too rigid we already found out.). A workaround might indeed lie with concatenating the other id's but sure feels like a workaround. Hashing them is non-reversible and i dont understand how that would help?

I also struggle withe the value of adhering to the cfn contract. but as some of it might lie ahead of us in the roadmap of the cfn team... maybe @rjlohan can chime in?

Having a CloudFormation Macro seems a bit like client side generation except for using lambda as a runtime :). This type of things should be solvable in CFN.. its the bricks and mortal of pretty much any tool/service we use (cdk, sls, etc.)

P.S. radical different question: if i do create a design of SecurityHub resources.... would you enjoy arguing about that one with me too? thanks!

benbridts commented 3 years ago

I would look at the behavior of the following resources for parallels (I think R53 is probably the closest):

Route53
SecurityGroup (rules)
IAM policies

Some things that might help:

List handlers are not a requirement
Read handlers are, so there should be a way to go from identifier to properties (I don't think making everything a writeOnlyProperty will help)

benkehoe commented 3 years ago

I agree that a resource provider is preferable to a macro, but a macro solves all the handler+identifier problems in one fell swoop.

This read handler business is tricky. Even with your concatenated UUID, you don't know which perm sets and accounts to pick up for a given resource when reading.

It's very ugly, but we've solved tricky problems in other resources by requiring a separate account-level stack installation. (In particular, for S3 bucket notifications, an SQS FIFO queue to serialize access to the bucket). The reverse lookup for the hash could be stored there.

What's the max length of a primary id? Could a limit on the size of these lists mean we could cram all the IDs into the primary id?

As to SecurityHub resources: I always enjoy arguing about CloudFormation resources 😄

benbridts commented 3 years ago

What's the max length of a primary id? Could a limit on the size of these lists mean we could cram all the IDs into the primary id?

It looks like it's 1024 characters. If you have a combined primaryIdentifier, the | to join them is also counted. So in practice:

 1025 - len(/primaryIdentifier)

OlafConijn commented 3 years ago

@benkehoe, to round up the options:

serializing data in the primaryIdentifier (workaround)
storing state somewhere else (give the resource a name > make it resource oriented towards the name)
not implementing the read handler
fixating 2 out of 3 dimensions

I really hate to have to provision more infra, but i like 2 best as a generic solution: Require people to create an S3 and Ref to it. Would also be something to, eventually, create a formal API on from within the CFN roadmap/team and store the state in the 'aws-infra account`.

it would end up looking like the following:

All resources expected to have the name RPStateBucket
CRUDL implemented based on Name
Generic pattern
Maybe lobby over at the CFN team to put this on the backlog? it that something i could help out with? share deign/experiences? create a 1pager? write a PR-QA document 😂

  AdminToProductionPrincipalAssignments:
    Type: Community::SSO::PrincipalAssignments
    Properties:
      Name: admin-to-production
      RPStateBucket: !Ref myBucketCallledBen
      InstanceArn: !Ref myInstanceArn
      PermissionSets:
      - !Ref AdminPS
      - !Ref ReadOnlyPS
      Principals:
      - PrincipalId: 123123123123123
        PrincipalType: GROUP
      Targets:
      - TargetType: AWS_OU
        TargetIds:
        - !Ref ProductionOU

Not the first time we run into Create semantic issues. e.g. with S3AccountPublicBlock (which can be created from multiple regions) should formally throw if the resource (because there is a resource you can create/delete/describe) already exists to prevent CFN from managing the resource from different regions. Downside is that if you created the S3AccountPublicBlock from within the account migrating to CFN would fail as it would require you to manually delete all these resources.

The solution @eduardomourar and I thought up in this context is another (IMHO) nice solution:

add a StrictMode attribute, default to false. if StrictMode is false then Create while the resource exists does not throw. Otherwise it will throw (adhere to the contract for RPs).

I ended up liking the design with the S3 bucket... well, it has a positive side as it does seem like a solution to more than 1 problem. thanks ben!

@eduardomourar, what do you think?

OlafConijn commented 3 years ago

To put everything together

Community::SSO::PrincipalAssignments gets a name attribute (required)
Community::SSO::PrincipalAssignments gets a StrictModel attribute (optional, default is false)
Community::SSO::PrincipalAssignments gets a RPStateBucket attribute (required in StrictMode)

For most people it will just work without too much hassle (but maybe things like drift detection wont). For those that want to benefit from adherence to the RP contract, you'll have to jump through an additional hoop.

a good compromise is something everybody is equally unhappy with 😂

eduardomourar commented 3 years ago

I don't like very much the S3 bucket option, even though it is just a workaround. I still have to investigate further, but I believe you can achieve a similar result by creating a separate resource type just for the principal grouping (with type and list of ids). There you can restrict the list based on the 1024 characters limit.

OlafConijn commented 3 years ago

Yep, also like that idea!

Seems like moving the same problem somewhere else, but if you can solve this in a somewhat scalable way.... impressive :)

eduardomourar commented 3 years ago

I have faced a similar primary identifier conundrum while implementing the Approval Rule Template resource. I have a few insights:

During a GetAtt, the Read handler is called only with the primary identifier and nothing more, not even additional identifiers.
In the case of this particular resource, because there is no direct API to retrieve info based on unique identifier, I had to loop every record and filter the matching identifier.
Another option for other scenarios is to have a primary identifier composed of multiple fields. The downside is that usually an Update handler will not be possible because you cannot replace the physical identifier.

benkehoe commented 3 years ago

Having a separate resource for the principal group doesn't seem to help? You still need to stuff that info in the assignment group primary id, or the read handler will still not work, because it doesn't get the field that would contain the principals.

I'm supportive of the S3 bucket idea. Why do we need a "strict model" attribute, when the presence of the bucket is indicative of this mode? What if it was a single field like BucketToEnableReadHandler or similar?

Now, if I understand correctly, the Name attribute is required because if you hash the contents for the primary id, it will change when the resource changes and that basically disables the update handler, so any update would cause all the assignments to get deleted and re-created?

OlafConijn commented 3 years ago

I ran into more cases where strict adherence to the RP contract was something you might want to opt-in for. In this case adherence to the contract (ReadHandler) requires users to create an extra bucket (therefore you might want to opt-in).

The Bucket here is a fix for a shortcoming for something maybe the CFN team might fix - would be great i think! The question could become: if you don't need the bucket (and bucket attribute) anymore - would you still like to have a StrictMode attribute? Probably not. If we optimize on the answer of this question that would lead to creating a single field BucketToEnableReadHandler.

My other question be whether this is the only scenario we might need a bucket for? cant think of anything atm but my guess is that having a way to store state is useful.

Ending up with a StrictMode and RPStateBucket attribute that don't effectively do anything (or need to be deprecated) to me seems a lesser evil than having different BucketToEnableReadHandler, BucketToConformToCreateSemantics, BucketToSupportXYZ attributes across different types in this github project.

Thus i kinda ended up with 2 different but generic patterns: StrictMode and RPStateBucket. makes sense?

Name attribute is there to become part of the physicalId. Maybe based on a hash of something if not provided by user (like Buckets, Topics, etc)

OlafConijn commented 3 years ago

Name for this resource should be different by now... something more like: Community::SSO::AssignmentGroup ?

Just notices something different too: you added a AWS_OU TargetType to your resource. What would you expect to happen if i added an account to the target OU? resource wont magically re-run. This is why org-formation is such a great tool 😂 .

Slipped in another resource in another issue. Not as complicated. SecurityHub resources can be found here: https://github.com/org-formation/aws-resource-providers/issues/27

benkehoe commented 3 years ago

Why wouldn't the physicalId just be the name? I don't think it needs to include anything else if the name is there.

I still don't understand why we would have a field named StrictMode that requires RPStateBucket but does not indicate why this would be so. Or what "strict mode" means. What about:

SomeNameMoreIndicativeOfFunctionality:
  Enabled: true
  StateBucket: !Ref MyBucket

I added OU because you had TargetOuIds in your original schema. Because of the way resources work, I would expect that it would only get updated if the resource inputs get updated. Until AWS SSO has OUs as targets, at which point it would just work. Note that a Macro would also solve this problem, because it would result in a different template after being applied.

I like ::AssignmentGroup. I do think it's weird that you're claiming the top-level Community:: identity. Shouldn't it be OrgFormation::?

OlafConijn commented 3 years ago

Why wouldn't the physicalId just be the name?

I thought of something ARN like as this indicates whether the resource is regional (contains region), accountId, type and name.

I do think it's weird that you're claiming the top-level ...

The idea was to be inclusive towards others that wanted to help out. I am very much looking forward to together work on making a solid and hopefully useful library of RPs.

In OrgFormation the resources started with my initials. This became a bit weird as there was an increasing number of others helping out and contributing. This project should not be about me at all (not sure whether that was a plural you or singular you).

I also think community approval is important. if prefixing the types with OrgFormation:: helps with that then don't have a problem doing so. The reason we decided not to do this is because the types are more general then OrgFormation:: they can be used without having to depend on any CLI tool, which is great.

this being perceives as weird makes me feel bad - definitely not the intention!

benkehoe commented 3 years ago

You could create a fake ARN syntax, but I'd pattern it after AWS SSO ARNs, which don't have account or region, so maybe arn:${Partition}:sso:::assignmentGroup/${InstanceId}/${AssignmentGroupName}? This would also nudge AWS SSO that they should create assignment groups as a proper thing 😁

I didn't mean to impugn your motives! I'm sorry to have come across like that. I just that think that the top-level token should be for deconfliction. What if someone else wants to create/fork their own ::SSO::AssignmentGroup? They would feel obligated to name it something other than Community::, which would make this project the de facto owner of the Community:: namespace. Or if everyone creating community resources uses Community::, and someone else decides to make a Community::X::Y in some other repo you don't know about, and you accidentally end up creating another one with the same resource type name? If each originator of a resource type has their own unique first token, we won't run into that problem. I think it's clear from the project that you intend this set of resource providers to be open and inclusive for people to contribute.

benkehoe commented 3 years ago

I ended up going the Macro route: https://github.com/benkehoe/aws-sso-util/blob/master/docs/cloudformation.md It had some benefits:

Every assignment is accounted for by CloudFormation, so even in a UPDATE_ROLLBACK_FAILED situation I know which ones are still present
I can add syntactic sugar to the AWS::SSO::PermissionSet resource as well, in particular automatically adding the InstanceArn, allowing the InlinePolicy to be specified in JSON/YAML instead of as a string, and specifying a default session duration.
The relevant code can be run locally on a template and generate the list of intended assignments as a CSV, which is useful for audit purposes.

OlafConijn commented 3 years ago

interesting! will definitely have a look. thanks for posting

eduardomourar commented 3 years ago

I know this is probably not relevant anymore to this discussion, but I found this blog post where the author used the Metadata section of a resource to store static data. It might be a nice trick for similar problems such as the SSO list.

org-formation / aws-resource-providers

Community::SSO::AssignmentGroup #26