catnekaise / cognito-idpool-auth

Authenticate with Amazon Cognito Identity from GitHub Actions.
https://catnekaise.github.io
MIT License
2 stars 5 forks source link

Support creating a `repository_name` claim when chaining #4

Closed iainlane closed 3 months ago

iainlane commented 4 months ago

GitHub doesn't provide a claim in its token, or an environment variable/context value containing just the repository's name. It does give the owner in repository_owner though.

Having this available as a tag will be very useful when constructing policies. For example, you could create a policy that allows repos to push to s3 buckets named like

mycompany-${aws:RequestTag/repository_owner}-${aws:RequestTag/repository_name}

WDYT?

As an alternative, I could imagine an input to allow any custom tags to be sent, but I'm not sure we want to go that far.

(thanks for this action by the way, it's super helpful!)

djonser commented 4 months ago

Thanks for your contribution and glad the action has helped you :)

Looking over documentation I have written regarding "passing claims" I believe I could have done a better job and simplifying this and will get around to it as soon as I wrap something else up.

The only claims that are relevant for being "passed" are those claims listed at https://token.actions.githubusercontent.com/.well-known/openid-configuration. Depending on the GitHub Actions execution context, these are the claims that may appear in the access token that is sent to Cognito Identity and its these claims that can be mapped by Cognito Identity to session/principal tags.

Passing claims means that during role chaining we tag the new session with new session/principal tags that exactly matches the existing tags. In order for the claims to remain valid and trusted, the trust policy on the role that is being chained have to look similar to below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:role/MyInitialCognitoRole"
      },
      "Action": [
        "sts:AssumeRole",
        "sts:TagSession"
      ],
      "Condition": {
        "StringEquals": {
          // (Optional) Ensure that the role was assumed via Cognito Identity
          "aws:FederatedProvider": "cognito-identity.amazonaws.com",
          // Ensure that we set the tag "repository" in the NEW session with the exact same value of "repository" tag in the CURRENT session
          // Without this condition, any value could be passed by the assumer
          "aws:RequestTag/repository": "${aws:principalTag/repository}",
        },
        "ForAllValues:StringEquals": {
          // Ensure that AssumeRole cannot set any other tags than those listed here
          "aws:TagKeys": [
            "repository",
          ]
        },
        "StringLike": {
          // Ensure that the value of tag "repository" in the current session is like:
          "aws:PrincipalTag/repository": [
            "catnekaise/example.*"
          ]
        }
      }
    }
  ]
}

Since the value of repository_name is calculated (and I really wish that this claim existed also just like repository_owner) it means that it can be manipulated and can no longer be considered a claim that can be trusted.

Resource Tag solution

Obviously s3 buckets does not support / characters in them so using the repository claim would not work. What you could do is place less value on the name of the bucket and instead use resource tags on the bucket.

{
  Actions: [
    "s3:GetObject",
    "s3:PutObject"
  ],
  "Resource": "arn:aws:s3:::mycompany-*/*",
  "StringEquals": {
    "aws:ResourceTag/githubRepository": "${aws:principalTag/repository}"
  }
}

Prefix Solution

If separate buckets per repository is not a hard requirement, a single bucket and policies based on prefix can also work.

{
  "Effect": "Allow",
  "Action": [
    "s3:GetObject",
    "s3:PutObject"
  ],
  // the prefix path will become /repositories/organization-name/repo-name/
  "Resource": "arn:aws:s3:::my-bucket/repositories/${aws:principalTag/repository}/*",
  "Condition": {
    "StringLike": {
      "aws:PrincipalTag/job_workflow_ref": "catnekaise/shared-workflows/.github/workflows/*@refs/heads/main"
    }
  }
}

If using repository_name

I'm not sure if I've had to match principal tags using request tags so I cannot say for 100% this would work (will test tomorrow).

If intent is to calculate repository_name, the way to have it remain secure in role-chaining would look like:

{
  "StringEquals": {
    // Ensure that we set the tag "repository" in the NEW session with the exact same value of "repository" tag in the CURRENT session
    "aws:RequestTag/repository": "${aws:principalTag/repository}",
    // Ensure that we set the tag "repository_owner" in the NEW session with the exact same value of "repository_owner" tag in the CURRENT session
    "aws:RequestTag/repository_owner": "${aws:principalTag/repository_owner}",

    // Here we compare that the CURRENT session tag "repository" would equal repository_owner/calculated_repository_name
    "aws:PrincipalTag/repository": "${aws:RequestTag/repository_owner}/${aws:RequestTag/repository_name}"
  },
  "ForAllValues:StringEquals": {
    // Ensure that AssumeRole cannot set any other tags than those listed here
    "aws:TagKeys": [
      "repository",
      "repository_owner",
      "repository_name"
    ]
  },
  "StringLike": {
    // Illustrate that repository_name tag cannot be trusted
    "aws:RequestTag/repository_name": "*"
  }
}

Summary

While the last example might work, repository_name (which I also wish was a real claim) cannot be made into a "passed claim" as it cannot be trusted on its own and would have to have custom policies that validate it. However, I've thought about adding some kind of support for like "additional session tags during role chain" and basically concatenate them with the passed claims. Doing it this way to make it very clear what is a passed claim and what is just a tag with any value.

- id: repository_name
  run: |
    # Calculate repository_name and output
- uses: catnekaise/cognito-idpool-auth@v1
  with:
    chain-role-additional-tags: "repository_name=${{ steps.repository_name.outputs.repository_name }}"
    ## Or
    chain-role-additional-tags-2: "Key=repository_name,Value=${{ steps.repository_name.outputs.repository_name }}"

Please keep the PR open for discussion.

Additional Resources

https://catnekaise.github.io/github-actions-abac-aws/detailed-explanation/ the following might helpful if you have not read it yet.

iainlane commented 4 months ago

Heya, thanks a lot for the detailed reply. For me, the main part is about how to trust the passed claims - do you agree?

My take is that all of the claims need to be verified (actually your guides helped me understand this). Even the ones we are simply passing straight through from Cognito to the chained role.

Otherwise, what could happen is a malicious actor manually runs the same steps that we're running in this action, but substitutes their values for the passed claims. If the sts:AssumeRole policy in the jump account isn't verifying these claims then we won't be able to stop that.

IMO this means that providing a way to pass any tags you want is opening up users to shooting themselves in the foot.

But at the same time, it's fine to pass derived tags as long as you can verify them from the original claims that came from GitHub. In this case it's like you said:

"Condition": {
    "Null": {
        "aws:TagKeys": "false"
    },
    "ForAllValues:StringEquals": {
        "aws:TagKeys": [
            "...",
            "repository_name"
        ]
    },
    "StringEquals": {
        "... checks for other tags passed ...",
        "aws:PrincipalTag/Repository": "${aws:RequestTag/repository_name}/${aws:RequestTag/repository_name}",
    }
}

So to sum up, all claims - even unchanged ones - need to be validated in IAM policies, and for repository_name in particular, since there's a way to verify it from the principal claims, it's acceptable for us to add it in.

The main part that's missing for me is documenting this explicitly. I'll go and add another commit to make this very clear now, and we can see what you think after that. (edit: done) Thanks again :+1:

djonser commented 4 months ago

Having had some more time to think it over I like your suggestion more as it simplifies usage and something like chain-role-additional-tags can be revisited in the future for truly additional tags.

This is also the single value that can be calculated among all claims and safely "passed", assuming the trust policy is correctly configured. This action has no control/insight to trust policies as that responsibility is with the user and this action is here to simplify usage for the user.

I don't believe GitHub will add repository_name as a claim since that makes it easier to make mistakes due to missing validation of the owner. That said, if they do I don't see any problems that cannot be solved because of it.

The Pull Request / Implementation

What I had like is help the user so it becomes harder to gloss over the need to validate repository_name.

Add a warning

Add a warning if passing repository_name while not also passing repository and repository_owner. This does not guarantee that the trust policy is correctly configured but should help.

  # implement a check
  # Add some language
  echo "::warning::The tag `repository_name`.... Read more https://github.com/catnekaise/cognito-idpool-auth#passing-repository-name"
  value="${GITHUB_REPOSITORY##*/}"

Documentation

{
  "StringEquals": {
     // Validate repository_name
    "aws:PrincipalTag/repository": "${aws:RequestTag/repository_owner}/${aws:RequestTag/repository_name}",
     // Add if wanting to continue using the claim
    "aws:RequestTag/repository": "${aws:PrincipalTag/repository}"
  }
}

Let me know what you think and if you need anything else.

edit: updated condition to check RequestTag instead of PrincipalTag for repository_owner so that tagging repository_owner is not optional saving a row in the policy. Make it clear that also tagging repository is only if wanting to continue using that claim as is.

iainlane commented 4 months ago

Ok, thanks again for the feedback. I think I've addressed those requests. Here's a test run which issued a warning. It's kind of interesting since you don't need to pass those claims to be able to validate them, but you do need to map them in the ID pool. I don't know if that can be checked directly though. I played around a bit to try to find a way to access the role session tags for the jump role but couldn't do it. So this might be the best thing to do at least for now.

Just so you know, I won't be able to make any more updates for a few days. Feel free to push to this branch and merge it if you can, or check it out and make more changes on top.

Cheers!

djonser commented 4 months ago

There's no way that I know of to get the current session tags in real-time. You can see whatever tags you set during AssumeRoleWithWebIdentity or AssumeRole in CloudTrail.

A little bit unsure what you meant by "It's kind of interesting since you don't need to pass those claims to be able to validate them, but you do need to map them in the ID pool".

Given a claim mapped by Cognito Identity that sets the repository tag to catnekaise/example-repo, then a condition on the trust policy in the jump account as shown below would require setting the tag repository_owner (${aws:RequestTag/repository_owner}) to value catnekaise and then setting the tag repository_name (${aws:RequestTag/repository_name}) to value example-repo.

Not setting either of these tags or using a different value would fail the validation.

{
  "aws:PrincipalTag/repository": "${aws:RequestTag/repository_owner}/${aws:RequestTag/repository_name}"
}

If changing ${aws:RequestTag/repository_owner} to instead be ${aws:PrincipalTag/repository_owner}, then there's no requirement to set the tag (passing the claim) for repository_owner as we would have used the current correct value of the principalTag.

In the end this action can only help guide the user with documentation and a warning like now so that hopefully fewer configuration mistakes is made. It's entirely possible to configure identity federation in a way that grants unintended access.

Pull Request

Great work by the looks of it. I will perform thorough testing myself as soon as I get a chance and there's no rush getting this merged in case there's a small change required, but I will go ahead and merge if everything checks out. Have a great weekend!.

djonser commented 4 months ago

I've had a chance to test this out.

The following values set in chain-pass-claims must continue to work and the last 4 shall trigger the warning.

repository_name,repository,repository_owner
repository, repository_owner=owner, sha=commit
repository,repository_owner
repository_name,repository
repository_name,repository_owner
repository_name=repoName,repository=repo
repository_name,repository_owner=owner

Issue

The following change causes a problem as the individual claims cannot be correctly known here if the user have mapped claims to different tag names.

declare -A claims
while IFS=', ' read -r claim; do
  claims[$claim]=1
done <<< "$PASS_CLAIMS"

See the table here. In short, if setting value of chain-pass-claims to repository=foo, then tags set in AssumeRole would become Key=foo,Value=catnekaise/example-repo. Maybe not the most noticeable feature in the documentation but a feature that cannot break.

I see two options.

Option 1

Reset the changes above the original loop except addition of repository_name and run another loop just prior to the existing loop and populate the associative array.

declare -A claimsSet

for c in "${claims[@]}"; do
  IFS='=' read -ra parts <<< "$c"
  claim=${parts[0],,}
  claimsSet[$claim]=1
done

tags=()

for c in "${claims[@]}"; do
  # loop
done

Option 2

Reset the changes above the original loop except addition of repository_name, declare claimsSet and populate it in the existing loop and move to check the condition after the loop.

declare -A claimsSet

for c in "${claims[@]}"; do

  IFS='=' read -ra parts <<< "$c"
  claim=${parts[0],,}
  tagName=${parts[1]}
  claimsSet[$claim]=1

  # Rest of loop....

done

if [[ -v claimsSet[repository_name] ]] && ! [[ -v claimsSet[repository] && -v claimsSet[repository_owner] ]]; then
  echo "::warning::repository_name is being passed, but repository and repository_owner are not. You need to pass both in order to validate repository name in your IAM policy. Read more: https://github.com/catnekaise/cognito-idpool-auth#passing-repository-name"
fi

echo "value=${tags[@]}" >> "$GITHUB_OUTPUT"

Summary

The docs look good and I will get around to cleaning up the overall docs later after the PR is merged.

I'm OK with either option and open to other suggestions. I also have a working context of these changes now and should be able to quickly respond and merge the PR once it works.

iainlane commented 3 months ago

Alright, back now. Cheers for spotting that. I missed that feature.

Care to take a look at f91d9c03582e865c94c5c884c8014eb9a55e9c74? That should address the concern as far as I can tell. We save the original pass_claims value in the values of the associative array, and have the keys be the plain (non mapped) claims. Then we can continue iterating over the values as before.

The only thing that wouldn't work is if the user has mapped the same claim twice, as in:

repository_name=foo, repository_name=bar

we would just use bar. I'm not sure if that is something we need to support.

Here are a couple of runs on this commit. One with chain-pass-claims: repository_name. showing it warns. And one with chain-pass-claims: repository_name, repository=foo, repository_owner. Don't worry that it fails, that's just because the role has no permissions. The important thing is that we tried to assume it - i.e. we passed the validation.

Looking forward to seeing if your testcase works too.

Passing claims and when it's strictly needed

A little bit unsure what you meant by "It's kind of interesting since you don't need to pass those claims to be able to validate them, but you do need to map them in the ID pool".

Just something I thought about while writing this stuff. Strictly, you do not have to have repository and repository_owner in chain-pass-claims to be able to validate it properly. It is enough to have it mapped in the Cognito ID provider and not passed, like this:

image

This means that you get the claims available for use in aws:PrincipalTag (tags on the role session that Cognito creates for us). So you can write a policy like:

{
  "StringEquals": {
    "aws:PrincipalTag/repository": "${aws:PrincipalTag/repository_owner}/${aws:RequestTag/repository_name}",
}

in that role's sts:AssumeRole policy (controlling when the Cognito-assumed role can further assume the next one), to verify that repository_name is fine, and just chain repository_name onwards.

That's only if you don't want to use repository or repository_owner in the final role's policies. For example, maybe you already checked repository_owner is your org in the trust policy for the jump account, so there's no need to pass it on and check that again.

But as far as I can tell we can't check that in this action, so what we're doing here is probably safest, even though we might make people pass "too much". It's pushing them towards doing the right thing. All of this stuff is very complicated :grimacing:

djonser commented 3 months ago

Great. Works for me. The warnings fire when they should fire and looking in both GitHub Actions logs and CloudTrail everything is in order when I test.

I would say the general implementation is approved but you are right and maybe a slight change to the warning could be warranted. Read below.

Passing claims when strictly needed

It's true that both do not need to be passed. One of my earlier replies for a policy suggestion had the example below. By referencing aws:RequestTag/repository_owner it is forced to be passed. .

{
  "StringEquals": {
     // Validate repository_name
    "aws:PrincipalTag/repository": "${aws:RequestTag/repository_owner}/${aws:RequestTag/repository_name}",
     // Add if wanting to continue using the claim
    "aws:RequestTag/repository": "${aws:PrincipalTag/repository}"
  }
}

In any situation I think that either repository_owner (StringEquals) or repository (StringLike) should also be passed and the warning is about informing the user to not assume which GitHub organization the repository exists in after role-chaining.

A company running GitHub Enterprise may have multiple legit GitHub organizations and one or more sandbox organizations in where they allow users to test stuff out so there's the consideration of org-a/backend-service-1 and org-a-sandbox/backend-service-1.

I believe a final and small change to the warning logic that only triggers when neither repository_owner or repository is passed and a text similar to:

"::warning::repository_name is being passed, but neither repository or repository_owner is. You should pass one of them if wanting to match the GitHub organization in IAM your IAM policies. Read more: https://github.com/catnekaise/cognito-idpool-auth#passing-repository-name".

Do not worry about any documentation changes, I will get to it and expand on the subject of repository_name. Also need to implement support for it in my little cdk library.

Great effort and communication over all. So much for something so "little" but it's as you stated, pushing people to consider these things as errors in this can have great consequences. My expectation has always been that those that use this action either forks/copies the yaml file into their own repository, and in doing so, they could remove that single like of warning with ease if they find it annoying.

Summary

I can approve the PR and merge it as soon as I see it and can do it during $DAYJOB as its just a 5minute thing.

iainlane commented 3 months ago

Alright then, thanks once more. I've pushed 41ced8eea6d48833dd146c51f52fa784c9be863b which I think/hope implements the latest & last suggestion :pray: :wink: