loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0

Schedule Pod with specific Serviceaccount #370

Closed MisterMX closed 2 years ago

MisterMX commented 2 years ago

Is your feature request related to a problem?

To use IRSA (IAM roles for serviceaccounts) on AWS EKS it is required to run a pod with a service account that has the eks.amazonaws.com/role-arn annotation set so EKS will mount the AWS_WEB_IDENTITY_TOKEN_FILE into the pod.

However, VCluster always uses the same service account it has been started with (i.e. default).

Which solution do you suggest?

Provide an option that allows a user in the virtual cluster to override the service account name on the physical pod. The easiest way would be to add an annotation, e.g. vcluster.loft.sh/override-service-account-name, whose value will be used to set spec.serviceAccountName.

Which alternative solutions exist?

No response

Additional context

No response

FabianKramm commented 2 years ago

@MisterMX thanks for creating this issue! We have a syncer flag called --service-account that specifies which service account the started pods should run with. You can use it by creating a values yaml:

syncer:
  extraArgs: ["--service-account=my-service-account"]

And then create the vcluster with vcluster create ... -f values.yaml

MisterMX commented 2 years ago

@FabianKramm thanks for the hint but this will apply to all pods on the cluster, won't it? We want only to modify certain pods. For example, in our use case, we only want to allow a specific deployment to have access to AWS via IRSA - for security reasons.

FabianKramm commented 2 years ago

@MisterMX yes, that's true. I see your use case and I believe this is a valid point that we have also heard several times before. I guess a special annotation like the one in your PR would solve this, but I believe there are also people who want to annotate the service accounts within the vcluster so that the pods get those AWS permissions automatically, which would also feel more natural I guess.

We could do this by introducing a service account sync that creates a service account in the host cluster for each service account in the vcluster. The host service account would then get assigned to the same pods as in the vcluster, and hence host controllers like the AWS IAM roles controller would inject the tokens correctly. The only problem I see with this is that you could essentially assign any AWS role from within the vcluster to any pod. However, we could also say that it is the responsibility of the host cluster to verify via admission control that no service accounts created in the vcluster get certain AWS roles assigned.

What do you think of this?

MisterMX commented 2 years ago

@FabianKramm do the generated physical SAs get a predictable name? This is relevant because it has to match the principal specified in the IAM policy to get IRSA to work.
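
For context on why the name matters: an IRSA trust policy typically matches the `sub` claim of the pod's token, which for a Kubernetes service account has the form `system:serviceaccount:<namespace>:<name>`. A minimal sketch of that string follows. The function name `irsaSubject`, the namespace `host-namespace`, and the service account name `my-physical-sa` are hypothetical placeholders, not anything vcluster defines.

```go
package main

import "fmt"

// irsaSubject builds the "sub" claim carried by a Kubernetes service account
// token. The IAM trust policy condition has to match this exact string.
// Both arguments refer to the host (physical) cluster, not the vcluster.
func irsaSubject(hostNamespace, physicalServiceAccount string) string {
	return fmt.Sprintf("system:serviceaccount:%s:%s", hostNamespace, physicalServiceAccount)
}

func main() {
	// Hypothetical names, for illustration only.
	fmt.Println(irsaSubject("host-namespace", "my-physical-sa"))
	// prints "system:serviceaccount:host-namespace:my-physical-sa"
}
```

If the physical name were not predictable, this string could not be written into the trust policy ahead of time.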

FabianKramm commented 2 years ago

@MisterMX yes, they do. Longer names are a little bit tougher to compute, but the service account name is always predictable from the virtual cluster name, the virtual service account name and the virtual service account namespace through the following logic:

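import (
    "crypto/sha256"
    "encoding/hex"
    "strings"
)

// Suffix is the name of the virtual cluster.
var Suffix = "VIRTUAL_CLUSTER_NAME"

// PhysicalName translates a virtual name and namespace into the name
// used in the host cluster.
func PhysicalName(name, namespace string) string {
    if name == "" {
        return ""
    }
    return SafeConcatName(name, "x", namespace, "x", Suffix)
}

// SafeConcatName joins the parts with "-". A result longer than 63 characters
// is cut to its first 52 characters, followed by "-" and the first 10 hex
// characters of the SHA-256 digest of the full string.
func SafeConcatName(name ...string) string {
    fullPath := strings.Join(name, "-")
    if len(fullPath) > 63 {
        digest := sha256.Sum256([]byte(fullPath))
        return strings.Replace(fullPath[0:52]+"-"+hex.EncodeToString(digest[0:])[0:10], ".-", "-", -1)
    }
    return fullPath
}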

In short, this logic generates a physical name from the virtual service account name, its namespace and the virtual cluster name, in the following format:

<SERVICE_ACCOUNT>-x-<SERVICE_ACCOUNT_NAMESPACE>-x-<VIRTUAL_CLUSTER_NAME>

If that string is longer than 63 characters, it is hashed with SHA-256 and the following format is used instead:

<First 52 characters of the string above>-<First 10 hex characters of the SHA-256 hash of the string above>

MisterMX commented 2 years ago

@FabianKramm cool, thanks for the clarification!