yves-vogl / aws-eks-helm-deploy

Bitbucket Pipe for deploying Helm Charts to AWS Elastic Kubernetes Service

Add OIDC environment please #3

Closed AndriySidliarskiy closed 10 months ago

AndriySidliarskiy commented 2 years ago

Please create a template for using OIDC with the pipe.

yves-vogl commented 2 years ago

Can you give me some more information about the feature you'd like?

darraghenright commented 2 years ago

Hi @yves-vogl — this is a feature that I need as well. It's a bit of a blocker for me so I can take a crack at this and submit a PR if that helps?

yves-vogl commented 2 years ago

Yes, of course. PR is welcome!

darraghenright commented 2 years ago

Cool, already looking into it!

I assume what the OP wants is to be able to define the oidc: true flag and then specify a value for the OIDC role ARN. If both the flag and the value are present, they would be used to authenticate instead of the access and secret keys.

There are other pipelines out there that do the same thing so there should be plenty of examples to provide guidance.

I'll get started on this and update ASAP.

darraghenright commented 2 years ago

Hi @yves-vogl

I didn't find much helpful information in some of the Atlassian pipe projects (and one even nuked my ~/.aws directory when I ran the tests!).

So I looked at your code instead and the good news is that you've centralised auth under EKSClientFactory so I think we can make modifications there for our needs.

The most important point is that when a step includes oidc: true, Bitbucket makes a token available as an envvar named BITBUCKET_STEP_OIDC_TOKEN. So we'll need to use this token when it is available, via sts.assume_role_with_web_identity.

When it is available, we do not need AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY but we will require an OIDC Role ARN. Since you already have a ROLE_ARN field I figure we can just reuse this?
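To make the token exchange concrete, here is a minimal sketch, not the code from the draft PR. It assumes an STS client object exposing boto3's `assume_role_with_web_identity` call; the helper name `credentials_from_oidc` and the default session name are made up for illustration.

```python
import os


def credentials_from_oidc(sts_client, role_arn, session_name="bitbucket-pipe"):
    """Exchange the Bitbucket step's OIDC token for temporary AWS credentials.

    Returns None when BITBUCKET_STEP_OIDC_TOKEN is not set, i.e. the step
    was not run with `oidc: true`. The function name and the default
    session name are hypothetical, not taken from the pipe.
    """
    token = os.environ.get("BITBUCKET_STEP_OIDC_TOKEN")
    if not token:
        return None

    response = sts_client.assume_role_with_web_identity(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        WebIdentityToken=token,
    )
    creds = response["Credentials"]
    # Keys are named so the dict can be splatted into a boto3 Session.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

With boto3 installed, the client could be created with `boto3.client("sts", region_name=...)`, and the returned dict passed as keyword arguments to `boto3.session.Session(**creds)`. A `None` return would mean falling back to the access and secret keys.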

I first modified the schema to change what is and what is not required, depending on whether BITBUCKET_STEP_OIDC_TOKEN is present. I then modified EKSClientFactory to use this value when present.
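The conditional schema could look roughly like the sketch below. It assumes a Cerberus-style schema dict (a mapping of field names to rules with a `required` flag). The function name `build_schema` and the reduced field set are illustrative assumptions, and the pipe's real schema has more fields.

```python
import os


def build_schema(environ=None):
    """Return a Cerberus-style schema whose required fields depend on OIDC.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to test. Field rules beyond `required` are assumptions.
    """
    environ = os.environ if environ is None else environ
    use_oidc = bool(environ.get("BITBUCKET_STEP_OIDC_TOKEN"))

    return {
        # Static keys are only mandatory when no OIDC token was injected.
        "AWS_ACCESS_KEY_ID": {"type": "string", "required": not use_oidc},
        "AWS_SECRET_ACCESS_KEY": {"type": "string", "required": not use_oidc},
        # With OIDC there is nothing to authenticate with unless a role is given.
        "ROLE_ARN": {"type": "string", "required": use_oidc},
    }
```

Deciding this once, when the schema is built, keeps the validation rules in one place instead of scattering OIDC checks through the pipe.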

I created a draft PR because I am guessing there's more work to be done. I'm new to Bitbucket pipelines, so there might be some boilerplate changes required, the documentation will need updating, and you may also want more tests. I had a hard time running pipe/test.py and test/acceptance/test_pipe.py, and I'm not sure if this is my env or if the tests are currently failing.

Anyway, I figured it's a good time to get some feedback and get a conversation going on what else is required. Thanks!

yves-vogl commented 2 years ago

Hi,

thank you so much for your work! I will have a look at this soon as I'm currently on vacation.

This was my first pipe, too. And I'm not the best python dude around but I guess we can get this together as a team 🤝

darraghenright commented 2 years ago

Sounds good! Talk then. Enjoy your holiday — hope the weather is better there than here!

darraghenright commented 2 years ago

Just to add: I realised the tests were failing because of MarkupSafe errors. MarkupSafe is a transitive dependency of Jinja2 version 2.x. There are two possible solutions here as I see it:

  1. Upgrade Jinja2 version from 2.x to 3.x
  2. Add MarkupSafe==2.0.1 to requirements.txt

I don't know if there are BC implications to the first option, so I am going to run with the second option in my PR for now.

darraghenright commented 2 years ago

I had some time to try out my changes in a simple pipeline today, and it became clear that I missed something and more work is required.

The sts_client created in this code in HelmPipe.run() also requires auth:

# Role Session Name is hardcoded to EKSGetTokenAuth
# I do not patch this method for compatibility reasons
sts_client_factory = STSClientFactory(session)
sts_client = sts_client_factory.get_sts_client(
    region_name=region_name,
    role_arn=role_arn
)

cluster = eks_client.describe_cluster(name=cluster_name)
token = TokenGenerator(sts_client).get_token(cluster_name)

self._create_kubeconfig(cluster, token)

STSClientFactory is imported from awscli.customizations.eks.get_token — I had a look and there's no way to use an OIDC token with this class.

I figured that this was used for a reason because of the comment. So I thought one possible thing we could do was to subclass this, override get_sts_client and add extra functionality to accept and use the OIDC token — for a proof of concept at least.
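One way to sketch that subclassing idea is as a mixin, so it can be combined with the real STSClientFactory. This is an assumption-laden illustration, not the uncommitted code mentioned here. It assumes the session passed to the factory is a botocore session (with `create_client` and `set_credentials`), and that the base `get_sts_client` accepts `role_arn=None`. The name `WebIdentityMixin` is made up.

```python
import os


class WebIdentityMixin:
    """Adds an OIDC path in front of a factory's get_sts_client().

    Intended to be mixed in before a base class that is constructed with a
    session and exposes get_sts_client(region_name=..., role_arn=...).
    """

    def __init__(self, session, *args, **kwargs):
        super().__init__(session, *args, **kwargs)
        # Keep our own reference rather than relying on base-class internals.
        self._oidc_session = session

    def get_sts_client(self, region_name=None, role_arn=None):
        token = os.environ.get("BITBUCKET_STEP_OIDC_TOKEN")
        if not token or role_arn is None:
            # No OIDC token: defer entirely to the original behaviour.
            return super().get_sts_client(region_name=region_name, role_arn=role_arn)

        # Exchange the OIDC token for temporary credentials.
        bootstrap = self._oidc_session.create_client("sts", region_name=region_name)
        creds = bootstrap.assume_role_with_web_identity(
            RoleArn=role_arn,
            RoleSessionName="EKSGetTokenAuth",
            WebIdentityToken=token,
        )["Credentials"]

        # Side effect: the shared session now carries the role's credentials.
        self._oidc_session.set_credentials(
            creds["AccessKeyId"], creds["SecretAccessKey"], creds["SessionToken"]
        )
        # The role is already assumed, so the base class must not assume it again.
        return super().get_sts_client(region_name=region_name, role_arn=None)
```

It would be combined as `class OIDCSTSClientFactory(WebIdentityMixin, STSClientFactory): pass`. Because the session is mutated, any client created from it afterwards would also use the assumed role, which may or may not be what the pipe wants.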

I did try this with an image built from my fork (code is not committed so not in draft PR) and it appeared to work. The code created a .kube/config file as expected. However at some point after that I received the following error, which suggests that Helm is having a problem connecting to the K8S cluster.

✖ Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials

The IAM role I am using has full access to EKS for testing purposes so I am a bit confused as to what's happening there.

Going to leave findings there and maybe we can chat in due course.

yves-vogl commented 2 years ago

I updated all of the components to the latest version and pulled in your changes here. I will now review them and read your comments.

Thank you so much - also for your patience. I'm allowed to spend some time on open source at work but I guess it will take some time to get this done.

yves-vogl commented 2 years ago

Alright, I had a first look. It's been a while since I stuck my head into this, so I hope I correctly remember the decisions I made 2 years ago ;-) And please feel free to correct me if I'm talking crazy stuff, because it's also been a while since I worked on this topic (I've been doing mostly Azure CAF lately).

Enough apologies, let's get to the topic.

First of all, we should clarify where the OIDC magic should happen, because at the moment we need to authenticate twice. I'll try to explain:

Here I create a session which is used twice, for different purposes:

First in eks_client_factory which is only used to get information from an AWS resource level perspective, e.g. to get its endpoint, x509 and stuff like this. I'm not sure if this is a (direct) communication with Kubernetes somehow but I guess not. It seems to me like a conversation with the AWS API.

Next it's used in sts_client_factory, whose sole purpose is to get the id_token which is needed for authenticating to Kubernetes. That's at least some interaction with AWS IAM.

So when introducing OIDC now, we need to decide if we want to be able to have different identities: one identity to talk to AWS to get cluster information, and one identity to authenticate to Kubernetes itself.

At the moment I guess that providing the cluster information by configuration is easier than using two identities. A single identity with OIDC can only be used if it's not just known to K8s but also integrated into the AWS layer (think of AWS SSO).

The typical use case I'd imagine is having a 3rd party IdP or something like an AWS Cognito user pool for use with K8s.

Can you elaborate a little bit on your use case? I'll then try to rebuild the environment and understand what's going on exactly before adapting this to a solution here.

My best guess would be the following scenario:

Someone is building EKS and integrating it with e.g. AWS Cognito. You are someone who is represented in the Cognito user pool and now wants to access EKS. If that's the case, I'd provide a way to pass in all the K8s-related information (optionally specifying an identity which could retrieve it) and then give the token to the pipe so that it can connect the dots.
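A rough sketch of that "pass everything in" scenario: if the endpoint, CA bundle and token all arrive via configuration, the pipe would only need to assemble a kubeconfig. The function names and the `pipe` user name below are hypothetical. The structure follows the standard kubeconfig layout and is serialised as JSON, which is also valid YAML.

```python
import json


def build_kubeconfig(cluster_name, endpoint, ca_data, token):
    """Assemble a kubeconfig dict from values supplied by configuration.

    `ca_data` is expected to be the base64-encoded cluster CA certificate;
    `token` is the bearer token used to authenticate to Kubernetes.
    """
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [
            {
                "name": cluster_name,
                "cluster": {
                    "server": endpoint,
                    "certificate-authority-data": ca_data,
                },
            }
        ],
        "users": [{"name": "pipe", "user": {"token": token}}],
        "contexts": [
            {
                "name": cluster_name,
                "context": {"cluster": cluster_name, "user": "pipe"},
            }
        ],
        "current-context": cluster_name,
    }


def dump_kubeconfig(config):
    """Serialise the kubeconfig as indented JSON text."""
    return json.dumps(config, indent=2)
```

In this variant no AWS call is needed to describe the cluster, so only the Kubernetes-facing identity matters. The trade-off is that the caller has to keep the endpoint and CA data up to date themselves.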