kubernetes-sigs / aws-load-balancer-controller

A Kubernetes controller for Elastic Load Balancers
https://kubernetes-sigs.github.io/aws-load-balancer-controller/
Apache License 2.0

[non EKS cluster] Failed build model due to NoCredentialProviders: no valid providers in chain #3216

Closed ksingh7 closed 1 year ago

ksingh7 commented 1 year ago

Describe the bug
I am trying to make the AWS LBC work on a non-EKS cluster (specifically a k3s cluster).

Steps to reproduce


# Deploying the AWS Load Balancer Controller

- Ensure subnets are tagged appropriately for auto-discovery to work (drop this section if it works without it).
Follow https://repost.aws/knowledge-center/eks-vpc-subnet-discovery

For public subnets used by external load balancers, tag each subnet with:
Key: kubernetes.io/role/elb
Value: 1
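The tagging step above can be sketched as a small shell helper. The subnet IDs in the commented invocation are placeholders, not values from this thread:

```shell
# Sketch: tag public subnets so the controller can auto-discover them
# for internet-facing load balancers.
tag_elb_subnets() {
  for subnet in "$@"; do
    aws ec2 create-tags \
      --resources "$subnet" \
      --tags Key=kubernetes.io/role/elb,Value=1
  done
}
# Subnet IDs below are placeholders -- substitute your own:
# tag_elb_subnets subnet-0123abcd subnet-4567efgh
```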

- For IP targets, pods must have IPs from the VPC subnets. You can configure the amazon-vpc-cni-k8s plugin for this purpose.

brew install helm
helm repo add eks https://aws.github.io/eks-charts
cd 
helm install aws-vpc-cni --namespace kube-system eks/aws-vpc-cni --values aws-vpc-cni/values.yaml
kubectl get pods --namespace kube-system -l "app.kubernetes.io/name=aws-node,app.kubernetes.io/instance=aws-vpc-cni"

- Using the Amazon EC2 Instance Metadata Service version 2 (IMDSv2)

aws ec2 modify-instance-metadata-options --http-put-response-hop-limit 2 --http-tokens required --region ap-south-1 --instance-id i-03dbc3a9aa29cd084

aws ec2 modify-instance-metadata-options --http-put-response-hop-limit 2 --http-tokens required --region ap-south-1 --instance-id i-0a1977cf777f9174d
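After running the two `modify-instance-metadata-options` commands, it can help to confirm the settings actually took effect. A minimal sketch (the instance ID argument is whichever node you want to check):

```shell
# Sketch: print the current IMDS settings for an instance.
# Expect HttpTokens=required and HttpPutResponseHopLimit=2 after the commands above.
show_imds_options() {
  aws ec2 describe-instances \
    --region ap-south-1 \
    --instance-ids "$1" \
    --query 'Reservations[].Instances[].MetadataOptions'
}
# show_imds_options i-03dbc3a9aa29cd084
```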

- Create IAM policy for the AWS Load Balancer Controller, allowing it to make calls to AWS APIs on your behalf.
curl -o iam-policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.5.1/docs/install/iam_policy.json

aws iam create-policy  --policy-name AWSLoadBalancerControllerIAMPolicy  --policy-document file://iam-policy.json

- Create a role from AWS Console

aws iam attach-role-policy \
  --policy-arn arn:aws:iam::<ID>:policy/AWSLoadBalancerControllerIAMPolicy \
  --role-name AWSLoadBalancerControllerRole
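A CLI sketch of the "create a role from the AWS Console" step. Since (per the discussion below) a non-EKS cluster relies on the worker nodes' credentials rather than IRSA, the trust policy here trusts `ec2.amazonaws.com` so the role can back an instance profile; the role and instance-profile names follow the ones used above:

```shell
# Trust policy allowing EC2 instances to assume the role.
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
# aws iam create-role --role-name AWSLoadBalancerControllerRole \
#   --assume-role-policy-document file://trust-policy.json
# aws iam create-instance-profile --instance-profile-name AWSLoadBalancerControllerRole
# aws iam add-role-to-instance-profile \
#   --instance-profile-name AWSLoadBalancerControllerRole \
#   --role-name AWSLoadBalancerControllerRole
```

The instance profile then gets attached to the worker EC2 instances, which is what the controller falls back to when no IRSA-style credentials are available.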

cat >aws-load-balancer-controller-service-account.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/name: aws-load-balancer-controller
    app.kubernetes.io/instance: aws-load-balancer-controller
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: v2.5.2
  name: aws-load-balancer-controller
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ID>:role/AWSLoadBalancerControllerRole
    meta.helm.sh/release-name: aws-load-balancer-controller
    meta.helm.sh/release-namespace: kube-system
EOF

kubectl create -f  aws-load-balancer-controller-service-account.yaml 

Additional Context:

Open question to the developers: does the AWS LBC specifically require an EKS cluster, i.e. are non-EKS clusters unsupported? In that case, how are AWS users who run their own Kubernetes on EC2 instances supposed to leverage AWS services such as ALB, NLB, EBS and others directly from those clusters?

ksingh7 commented 1 year ago

Just for the record, I have followed all the instructions on this page: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.5/deploy/installation/

ksingh7 commented 1 year ago
{"level":"error","ts":"2023-05-28T09:07:09Z","msg":"Reconciler error","controller":"ingress","object":{"name":"fruits-ingress","namespace":"fruits-ns"},"namespace":"fruits-ns","name":"fruits-ingress","reconcileID":"8d1f4947-e320-48a1-8aff-81f3ea356f77","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}
{"level":"error","ts":"2023-05-28T09:07:14Z","msg":"Reconciler error","controller":"ingress","object":{"name":"fruits-ingress","namespace":"fruits-ns"},"namespace":"fruits-ns","name":"fruits-ingress","reconcileID":"27695847-84ec-4ff5-83a7-f779f45d1130","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}
{"level":"error","ts":"2023-05-28T09:07:24Z","msg":"Reconciler error","controller":"ingress","object":{"name":"fruits-ingress","namespace":"fruits-ns"},"namespace":"fruits-ns","name":"fruits-ingress","reconcileID":"56ad68f9-a1f8-4d1f-9670-1bbe7ac75945","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}

{"level":"error","ts":"2023-05-28T09:07:44Z","msg":"Reconciler error","controller":"ingress","object":{"name":"fruits-ingress","namespace":"fruits-ns"},"namespace":"fruits-ns","name":"fruits-ingress","reconcileID":"66f23e2b-1f62-4a9f-953c-5a2a1bb2b0ee","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}
{"level":"error","ts":"2023-05-28T09:08:25Z","msg":"Reconciler error","controller":"ingress","object":{"name":"fruits-ingress","namespace":"fruits-ns"},"namespace":"fruits-ns","name":"fruits-ingress","reconcileID":"19b73364-433b-4e33-b3d6-004c759146d9","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}
ksingh7 commented 1 year ago

As I don't have an EKS cluster, I have not done the OIDC step defined in this guide: https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html

  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ID>:role/AWSLoadBalancerControllerRole
# Default values for aws-load-balancer-controller.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

replicaCount: 2

image:
  repository: public.ecr.aws/eks/aws-load-balancer-controller
  tag: v2.5.2
  pullPolicy: IfNotPresent

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

serviceAccount:
  # Specifies whether a service account should be created
  create: false
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: aws-load-balancer-controller
  # Automount API credentials for a Service Account.
  automountServiceAccountToken: true
  # List of image pull secrets to add to the Service Account.
  imagePullSecrets:
    # - name: docker

rbac:
  # Specifies whether rbac resources should be created
  create: true

podSecurityContext:
  fsGroup: 65534

securityContext:
  # capabilities:
  #   drop:
  #   - ALL
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  allowPrivilegeEscalation: false

# Time period for the controller pod to do a graceful shutdown
terminationGracePeriodSeconds: 10

resources: {}
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #   cpu: 100m
  #   memory: 128Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi

# priorityClassName specifies the PriorityClass to indicate the importance of controller pods
# ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
priorityClassName: system-cluster-critical

nodeSelector: {}

tolerations: []

# affinity specifies a custom affinity for the controller pods
affinity: {}

# configureDefaultAffinity specifies whether to configure a default affinity for the controller pods to prevent
# co-location on the same node. This will get ignored if you specify a custom affinity configuration.
configureDefaultAffinity: true

# topologySpreadConstraints is a stable feature of k8s v1.19 which provides the ability to
# control how Pods are spread across your cluster among failure-domains such as regions, zones,
# nodes, and other user-defined topology domains.
#
# more details here: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
topologySpreadConstraints: {}

updateStrategy: {}
  # type: RollingUpdate
  # rollingUpdate:
  #   maxSurge: 1
  #   maxUnavailable: 1

# serviceAnnotations contains annotations to be added to the provisioned webhook service resource
serviceAnnotations: {}

# deploymentAnnotations contains annotations for the controller deployment
deploymentAnnotations: {}

podAnnotations: {}

podLabels: {}

# additionalLabels -- Labels to add to each object of the chart.
additionalLabels: {}

# Enable cert-manager
enableCertManager: false

# The name of the Kubernetes cluster. A non-empty value is required
clusterName:

# cluster contains configurations specific to the kubernetes cluster
cluster:
    # Cluster DNS domain (required for requesting TLS certificates)
    dnsDomain: cluster.local

# The ingress class this controller will satisfy. If not specified, controller will match all
# ingresses without ingress class annotation and ingresses of type alb
ingressClass: alb

# ingressClassParams specify the IngressClassParams that enforce settings for a set of Ingresses when used with the ingress controller.
ingressClassParams:
  create: true
  # The name of ingressClassParams resource will be referred in ingressClass
  name:
  spec: {}
    # Due to dependency issue, the validation webhook ignores this particular ingressClassParams resource.
    # We recommend creating ingressClassParams resources separately after installing this chart and the
    # controller is functional.
    #
    # You can set the specifications in the `helm install` command through `--set` or `--set-string`
    # If you do want to specify in the values.yaml, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'spec:'
    #
    # namespaceSelector:
    #   matchLabels:
    # group:
    # scheme:
    # ipAddressType:
    # tags:
    # loadBalancerAttributes:
    # - key:
    #   value:

# To use an IngressClass resource instead of the annotation, you first need to install the IngressClass resource pointing to the controller.
# If specified as true, the IngressClass resource will be created.
createIngressClassResource: true

# The AWS region for the kubernetes cluster. Set to use KIAM or kube2iam for example.
region: ap-south-1

# The VPC ID for the Kubernetes cluster. Set this manually when your pods are unable to use the metadata service to determine this automatically
vpcId: vpc-06cb5dfe6e8ba86ad

# Custom AWS API Endpoints (serviceID1=URL1,serviceID2=URL2)
awsApiEndpoints:

# awsApiThrottle specifies custom AWS API throttle settings (serviceID1:operationRegex1=rate:burst,serviceID2:operationRegex2=rate:burst)
# example: --set awsApiThrottle="{Elastic Load Balancing v2:RegisterTargets|DeregisterTargets=4:20,Elastic Load Balancing v2:.*=10:40}"
awsApiThrottle:

# Maximum retries for AWS APIs (default 10)
awsMaxRetries:

# Default target type. Used as the default value of the "alb.ingress.kubernetes.io/target-type" and
# "service.beta.kubernetes.io/aws-load-balancer-nlb-target-type" annotations.
# Possible values are "ip" and "instance"
# The value "ip" should be used for ENI-based CNIs, such as the Amazon VPC CNI,
# Calico with encapsulation disabled, or Cilium with masquerading disabled.
# The value "instance" should be used for overlay-based CNIs, such as Calico in VXLAN or IPIP mode or
# Cilium with masquerading enabled.
defaultTargetType: instance

# If enabled, targetHealth readiness gate will get injected to the pod spec for the matching endpoint pods (default true)
enablePodReadinessGateInject:

# Enable Shield addon for ALB (default true)
enableShield:

# Enable WAF addon for ALB (default true)
enableWaf:

# Enable WAF V2 addon for ALB (default true)
enableWafv2:

# Maximum number of concurrently running reconcile loops for ingress (default 3)
ingressMaxConcurrentReconciles:

# Set the controller log level - info(default), debug (default "info")
logLevel:

# The address the metric endpoint binds to. (default ":8080")
metricsBindAddr: ""

# The TCP port the Webhook server binds to. (default 9443)
webhookBindPort:

# webhookTLS specifies TLS cert/key for the webhook
webhookTLS:
  caCert:
  cert:
  key:

# array of namespace selectors for the webhook
webhookNamespaceSelectors:
# - key: elbv2.k8s.aws/pod-readiness-gate-inject
#   operator: In
#   values:
#   - enabled

# keepTLSSecret specifies whether to reuse existing TLS secret for chart upgrade
keepTLSSecret: true

# Maximum number of concurrently running reconcile loops for service (default 3)
serviceMaxConcurrentReconciles:

# Maximum number of concurrently running reconcile loops for targetGroupBinding
targetgroupbindingMaxConcurrentReconciles:

# Maximum duration of exponential backoff for targetGroupBinding reconcile failures
targetgroupbindingMaxExponentialBackoffDelay:

# Period at which the controller forces the repopulation of its local object stores. (default 1h0m0s)
syncPeriod:

# Namespace the controller watches for updates to Kubernetes objects, If empty, all namespaces are watched.
watchNamespace:

# disableIngressClassAnnotation disables the usage of kubernetes.io/ingress.class annotation, false by default
disableIngressClassAnnotation:

# disableIngressGroupNameAnnotation disables the usage of alb.ingress.kubernetes.io/group.name annotation, false by default
disableIngressGroupNameAnnotation:

# defaultSSLPolicy specifies the default SSL policy to use for TLS/HTTPS listeners
defaultSSLPolicy:

# Liveness probe configuration for the controller
livenessProbe:
  failureThreshold: 2
  httpGet:
    path: /healthz
    port: 61779
    scheme: HTTP
  initialDelaySeconds: 30
  timeoutSeconds: 10

# Environment variables to set for aws-load-balancer-controller pod.
# We strongly discourage programming access credentials in the controller environment. You should setup IRSA or
# comparable solutions like kube2iam, kiam etc instead.
env:
  # ENV_1: ""
  # ENV_2: ""

# Specifies if aws-load-balancer-controller should be started in hostNetwork mode.
#
# This is required if using a custom CNI where the managed control plane nodes are unable to initiate
# network connections to the pods, for example using Calico CNI plugin on EKS. This is not required or
# recommended if using the Amazon VPC CNI plugin.
hostNetwork: false

# Specifies the dnsPolicy that should be used for pods in the deployment
#
# This may need to be used to be changed given certain conditions. For instance, if one uses the cilium CNI
# with certain settings, one may need to set `hostNetwork: true` and webhooks won't work unless `dnsPolicy`
# is set to `ClusterFirstWithHostNet`. See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
dnsPolicy:

# extraVolumeMounts are the additional volume mounts. This enables setting up IRSA on non-EKS Kubernetes cluster
extraVolumeMounts:
  # - name: aws-iam-token
  #   mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
  #   readOnly: true

# extraVolumes for the extraVolumeMounts. Useful to mount a projected service account token for example.
extraVolumes:
  # - name: aws-iam-token
  #   projected:
  #     defaultMode: 420
  #     sources:
  #     - serviceAccountToken:
  #         audience: sts.amazonaws.com
  #         expirationSeconds: 86400
  #         path: token

# defaultTags are the tags to apply to all AWS resources managed by this controller
defaultTags: {}
  # default_tag1: value1
  # default_tag2: value2

# podDisruptionBudget specifies the disruption budget for the controller pods.
# Disruption budget will be configured only when the replicaCount is greater than 1
podDisruptionBudget: {}
#  maxUnavailable: 1

# externalManagedTags is the list of tag keys on AWS resources that will be managed externally
externalManagedTags: []

# enableEndpointSlices enables k8s EndpointSlices for IP targets instead of Endpoints (default false)
enableEndpointSlices:

# enableBackendSecurityGroup enables shared security group for backend traffic (default true)
enableBackendSecurityGroup:

# backendSecurityGroup specifies backend security group id (default controller auto create backend security group)
backendSecurityGroup:

# disableRestrictedSecurityGroupRules specifies whether to disable creating port-range restricted security group rules for traffic
disableRestrictedSecurityGroupRules:

# controllerConfig specifies controller configuration
controllerConfig:
  # featureGates set of key: value pairs that describe AWS load balance controller features
  featureGates: {}
  # ListenerRulesTagging: true
  # WeightedTargetGroups: true
  # ServiceTypeLoadBalancerOnly: false
  # EndpointsFailOpen: true
  # EnableServiceController: true
  # EnableIPTargetType: true
  # SubnetsClusterTagCheck: true
  # NLBHealthCheckAdvancedConfig: true

# objectSelector for webhook
objectSelector:
  matchExpressions:
  # - key: <key>
  #   operator: <operator>
  #   values:
  #   - <value>
  matchLabels:
  #   key: value

serviceMonitor:
  # Specifies whether a service monitor should be created
  enabled: false
  # Labels to add to the service account
  additionalLabels: {}
  # Prometheus scrape interval
  interval: 1m
  # Namespace to create the service monitor in
  namespace:

# clusterSecretsPermissions lets you configure RBAC permissions for secret resources
# Access to secrets resource is required only if you use the OIDC feature, and instead of
# enabling access to all secrets, we recommend configuring namespaced role/rolebinding.
# This option is for backwards compatibility only, and will potentially be deprecated in future.
clusterSecretsPermissions:
  # allowAllSecrets allows the controller to access all secrets in the cluster.
  # This is to get backwards compatible behavior, but *NOT* recommended for security reasons
  allowAllSecrets: false

# ingressClassConfig contains configurations specific to the ingress class
ingressClassConfig:
  default: false

# enableServiceMutatorWebhook allows you enable the webhook which makes this controller the default for all new services of type LoadBalancer
enableServiceMutatorWebhook: true

M00nF1sh commented 1 year ago

@ksingh7 This controller works for non-EKS clusters as well; you need to grant the permissions to the worker nodes instead. There isn't a built-in solution to let the controller assume a role on non-EKS clusters.

I saw that you have modified the IMDSv2 hop limit to 2, which should work. Would you mind launching a debug pod (e.g. amazonlinux:2), installing the AWS CLI on it, and trying whether the pod can access AWS credentials?

ksingh7 commented 1 year ago

Thanks for your response @M00nF1sh, appreciated. It's also such a relief to hear that the LBC works for non-EKS clusters as well; now I need to get it working somehow.

For the record, I have 2 EC2 instances (1 x master and 1 x agent (worker)); for both of these EC2 instances I have created and attached instance profiles (see the screenshots below).

[screenshots of the attached instance profiles]

I can certainly try launching a debug pod with the AWS CLI. However, what should I verify, and how, to confirm the EC2 instance has the right permissions? Just any AWS CLI command that is covered by the IAM role?

Can you elaborate on your last line, "try whether the pod can access aws credentials"? How to check that is what I am looking for.

M00nF1sh commented 1 year ago

e.g. you can launch a pod with the amazonlinux:2 image on the same node (where the LBC runs), exec into that pod, install the AWS CLI inside it using yum, and run aws sts get-caller-identity. If it fails, it means the AWS credentials for the node are not properly set up.
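The debug procedure above can be sketched as follows. The pod name `imds-debug` and the node-name argument are placeholders for illustration:

```shell
# Sketch: run an amazonlinux:2 pod pinned to a given node, then test
# whether AWS credentials are reachable from inside it.
debug_node_credentials() {
  # Pin the pod to the node where the controller runs ($1 = node name).
  kubectl run imds-debug --image=amazonlinux:2 --restart=Never \
    --overrides="{\"apiVersion\":\"v1\",\"spec\":{\"nodeName\":\"$1\"}}" \
    --command -- sleep 3600
  kubectl wait --for=condition=Ready pod/imds-debug --timeout=120s
  kubectl exec imds-debug -- bash -c \
    'yum install -y awscli >/dev/null && aws sts get-caller-identity'
}
# debug_node_credentials <your-worker-node-name>
# A failure here means the node's credentials are not reachable from pods.
```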

In most cases this is due to the hop limit not being set to 2, but you already explicitly set it to 2. Another possible cause is iptables rules that block the EC2 metadata endpoint for pods (either added by you explicitly, or by some third-party tool that did it automatically).
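To check metadata reachability directly, one option is to query IMDSv2 from inside a pod. A minimal sketch (run within the pod's shell; `169.254.169.254` is the standard IMDS endpoint):

```shell
# Sketch: fetch an IMDSv2 token, then list the role backing the node's
# instance profile. If iptables blocks 169.254.169.254 for pods, or the
# hop limit is 1, the curl calls will hang or fail.
check_imds() {
  TOKEN=$(curl -sS -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
  curl -sS -H "X-aws-ec2-metadata-token: $TOKEN" \
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
}
# check_imds   # expect the name of the role attached to the node
```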

oliviassss commented 1 year ago

@ksingh7, hi, just following up: did the above solution solve your issue? If so, can we close this issue? Thanks.

oliviassss commented 1 year ago

@ksingh7, I'm going to close the issue for now; please feel free to reach out or reopen if you have any questions, thanks.