giantswarm / roadmap

Giant Swarm Product Roadmap
https://github.com/orgs/giantswarm/projects/273
Apache License 2.0

Test CAPA cluster creation in a different account #1769

Closed alex-dabija closed 1 year ago

alex-dabija commented 1 year ago

Task

Test CAPA cluster creation in a different account.

TODOs

calvix commented 1 year ago

I created a new AWSClusterRoleIdentity named default2 in grizzly, which points to account 180547736195 (currently known as the gauss workload cluster account).

The roles are created as well, so testing should be possible.

calvix commented 1 year ago

I created a new AWSClusterRoleIdentity named default2 in golem, which points to account 180547736195 (currently known as the gauss workload cluster account).

zewolfe commented 1 year ago

I’ve created a cross-account public cluster named bear. These are the things that I’ve checked on AWS (using the specified gauss account) and on the cluster itself.

Checklist on AWS

Checklist on Cluster

alex-dabija commented 1 year ago

I tried to create a private cluster on golem but the aws-network-topology-operator is saying that AWS can't find an existing subnet when the transit gateway is attached:

1.6710229132236247e+09  INFO    Reconciling     {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest36","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest36", "reconcileID": "6686176c-118c-48b7-9ae2-19166a0f4e64"}
1.6710229133360448e+09  INFO    transitgateway-registrar        Got TransitGateway      {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest36","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest36", "reconcileID": "6686176c-118c-48b7-9ae2-19166a0f4e64", "transitGatewayID": "tgw-034e681b2d0288423"}
1.6710229134554205e+09  ERROR   transitgateway-registrar        Failed to create transit gateway attachments    {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest36","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest36", "reconcileID": "6686176c-118c-48b7-9ae2-19166a0f4e64", "transitGatewayID": "tgw-034e681b2d0288423", "vpcID": "vpc-0e9d440dc6831956a", "error": "operation error EC2: CreateTransitGatewayVpcAttachment, https response error StatusCode: 400, RequestID: 93f6dbfb-b2de-454c-9913-7c93aabeb13f, api error InvalidSubnetID.NotFound: The subnet ID 'subnet-0545a88f6cbcbef1d' does not exist"}
1.6710229134735894e+09  INFO    Done reconciling        {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest36","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest36", "reconcileID": "6686176c-118c-48b7-9ae2-19166a0f4e64"}
1.6710229134736302e+09  ERROR   Reconciler error        {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest36","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest36", "reconcileID": "6686176c-118c-48b7-9ae2-19166a0f4e64", "error": "operation error EC2: CreateTransitGatewayVpcAttachment, https response error StatusCode: 400, RequestID: 93f6dbfb-b2de-454c-9913-7c93aabeb13f, api error InvalidSubnetID.NotFound: The subnet ID 'subnet-0545a88f6cbcbef1d' does not exist"}
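
When reading log lines like the ones above, it can help to turn the leading epoch timestamp into a date and pull the `error` field out of the trailing JSON. The following is a minimal sketch, assuming the layout seen here (epoch seconds, level, free text, then one JSON object). The function name `parse_log_line` and the shortened `sample` line are illustrative, not part of the operator.

```python
import json
from datetime import datetime, timezone


def parse_log_line(line):
    """Split one log line into time, level, message text and JSON fields."""
    ts_text, rest = line.split(None, 1)   # leading epoch seconds, e.g. 1.671e+09
    level, rest = rest.split(None, 1)     # INFO or ERROR
    brace = rest.index("{")               # structured fields start at the first "{"
    return {
        "time": datetime.fromtimestamp(float(ts_text), tz=timezone.utc),
        "level": level,
        # Logger name (if any) and message are kept together as one string.
        "message": " ".join(rest[:brace].split()),
        "fields": json.loads(rest[brace:]),
    }


# Shortened version of the last line in the log above (fewer JSON fields).
sample = "1.6710229134736302e+09  ERROR   Reconciler error        " + json.dumps(
    {
        "name": "alextest36",
        "error": "api error InvalidSubnetID.NotFound: "
                 "The subnet ID 'subnet-0545a88f6cbcbef1d' does not exist",
    }
)

entry = parse_log_line(sample)
print(entry["time"].isoformat(), entry["level"], entry["fields"]["error"])
```

For lines that carry a logger name such as `transitgateway-registrar`, this sketch leaves the name at the start of `message` rather than splitting it out.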

The cluster was created with the following config:

---
apiVersion: v1
data:
  values: |
    aws:
      region: eu-west-2
      awsClusterRole: default2
    bastion:
      enabled: false
    proxy:
      enabled: true
      http_proxy: "http://internal-a1c90e5331e124481a14fb7ad80ae8eb-1778512673.eu-west-2.elb.amazonaws.com:4000"
      https_proxy: "http://internal-a1c90e5331e124481a14fb7ad80ae8eb-1778512673.eu-west-2.elb.amazonaws.com:4000"
      no_proxy: "test-domain.com"
    clusterName: alextest36
    controlPlane:
      replicas: 1
    machinePools:
    - instanceType: m5.xlarge
      maxSize: 10
      minSize: 3
      name: machine-pool0
      rootVolumeSizeGB: 300
      availabilityZones:
      - eu-west-2a
      - eu-west-2b
      - eu-west-2c
    network:
      vpcCIDR: 10.20.0.0/16
      topologyMode: GiantSwarmManaged
      availabilityZoneUsageLimit: 3
      vpcMode: private
      apiMode: private
      dnsMode: private
      subnets:
      - cidrBlock: 10.20.0.0/18
      - cidrBlock: 10.20.64.0/18
      - cidrBlock: 10.20.128.0/18
    organization: giantswarm
kind: ConfigMap
metadata:
  creationTimestamp: null
  labels:
    giantswarm.io/cluster: alextest36
  name: alextest36-userconfig
  namespace: org-giantswarm
---
apiVersion: application.giantswarm.io/v1alpha1
kind: App
metadata:
  labels:
    app-operator.giantswarm.io/version: 0.0.0
  name: alextest36
  namespace: org-giantswarm
spec:
  catalog: cluster
  config:
    configMap:
      name: ""
      namespace: ""
    secret:
      name: ""
      namespace: ""
  kubeConfig:
    context:
      name: ""
    inCluster: true
    secret:
      name: ""
      namespace: ""
  name: cluster-aws
  namespace: org-giantswarm
  userConfig:
    configMap:
      name: alextest36-userconfig
      namespace: org-giantswarm
  version: 0.20.2
---
apiVersion: v1
data:
  values: |
    clusterName: alextest36
    organization: giantswarm
kind: ConfigMap
metadata:
  creationTimestamp: null
  labels:
    giantswarm.io/cluster: alextest36
  name: alextest36-default-apps-userconfig
  namespace: org-giantswarm
---
apiVersion: application.giantswarm.io/v1alpha1
kind: App
metadata:
  labels:
    app-operator.giantswarm.io/version: 0.0.0
    giantswarm.io/cluster: alextest36
    giantswarm.io/managed-by: cluster
  name: alextest36-default-apps
  namespace: org-giantswarm
spec:
  catalog: cluster
  config:
    configMap:
      name: alextest36-cluster-values
      namespace: org-giantswarm
    secret:
      name: ""
      namespace: ""
  kubeConfig:
    context:
      name: ""
    inCluster: true
    secret:
      name: ""
      namespace: ""
  name: default-apps-aws
  namespace: org-giantswarm
  userConfig:
    configMap:
      name: alextest36-default-apps-userconfig
      namespace: org-giantswarm
  version: 0.12.3
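
One quick way to rule out a CIDR mistake in the `network` values above is to check that every subnet lies inside `vpcCIDR` and that none overlap. This is only a sanity-check sketch using Python's standard `ipaddress` module. The helper name `check_layout` is made up for illustration, and the check says nothing about which AWS account the subnets live in.

```python
import ipaddress

# Values copied from the userconfig ConfigMap above.
vpc = ipaddress.ip_network("10.20.0.0/16")
subnets = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.20.0.0/18", "10.20.64.0/18", "10.20.128.0/18")
]


def check_layout(vpc, subnets):
    """Return a list of human-readable problems; empty means the layout is consistent."""
    problems = []
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")
    return problems


# /18 blocks of the VPC range that the config leaves unassigned.
unused = [block for block in vpc.subnets(new_prefix=18) if block not in subnets]

print(check_layout(vpc, subnets))
print([str(block) for block in unused])
```

The three /18 blocks fit the /16 without overlap, so the configuration itself is not an obvious source of the `InvalidSubnetID.NotFound` error.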

alex-dabija commented 1 year ago

The subnet subnet-0545a88f6cbcbef1d exists (screenshot not preserved).

alex-dabija commented 1 year ago

All the operations in the operator are done from the perspective of the management cluster. There are only 2 places where the GetAWSRoleIdentity function is called:

The transit gateway attachment needs to be executed from the workload cluster's perspective.
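
To illustrate what "from the workload cluster's perspective" could mean in practice, here is a rough sketch, not the operator's code: assume a role in the workload cluster account first, then create the attachment with the resulting credentials, so that the subnet lookup happens in the account that owns the subnet. It uses boto3 (imported lazily, since it is a third-party package). The role name and session name are placeholders, and the helper names are hypothetical.

```python
def role_arn(account_id, role_name):
    """Build the ARN of an IAM role in the given account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def ec2_client_for_account(account_id, role_name, region):
    """Return an EC2 client that acts inside `account_id` via STS AssumeRole."""
    import boto3  # third-party; imported here so the helpers above work without it

    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn(account_id, role_name),
        RoleSessionName="tgw-attachment",  # placeholder session name
    )["Credentials"]
    return boto3.client(
        "ec2",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )


def attach_vpc(ec2, transit_gateway_id, vpc_id, subnet_ids):
    """Create the transit gateway VPC attachment with the supplied EC2 client."""
    return ec2.create_transit_gateway_vpc_attachment(
        TransitGatewayId=transit_gateway_id,
        VpcId=vpc_id,
        SubnetIds=list(subnet_ids),
    )


# Usage (not executed here; "example-capa-role" is a placeholder role name):
# ec2 = ec2_client_for_account("180547736195", "example-capa-role", "eu-west-2")
# attach_vpc(ec2, "tgw-034e681b2d0288423", "vpc-0e9d440dc6831956a",
#            ["subnet-0545a88f6cbcbef1d"])
```

With a client bound to the management cluster account, the same call fails because the subnet ID is not visible there, which matches the `InvalidSubnetID.NotFound` error in the logs.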

alex-dabija commented 1 year ago

The transit gateway and the prefix list need to be shared with the workload cluster account before the cluster is created. I manually attached the transit gateway and updated the route tables. The cluster was able to start:

 #: kubectl get pods -A
NAMESPACE     NAME                                                                  READY   STATUS      RESTARTS        AGE
giantswarm    chart-operator-79d7b7567b-4rxd2                                       1/1     Running     0               8m38s
kube-system   aws-pod-identity-webhook-app-66fd9f9b55-6zwk8                         1/1     Running     0               9m45s
kube-system   aws-pod-identity-webhook-app-66fd9f9b55-jf4wv                         1/1     Running     0               9m45s
kube-system   aws-pod-identity-webhook-restarter-27850445-hvkd5                     0/1     Completed   0               99s
kube-system   capi-node-labeler-2sgf9                                               1/1     Running     0               16m
kube-system   capi-node-labeler-452jv                                               1/1     Running     0               16m
kube-system   capi-node-labeler-82f2n                                               1/1     Running     0               16m
kube-system   capi-node-labeler-fcnvb                                               1/1     Running     0               16m
kube-system   cert-exporter-daemonset-7cwnr                                         1/1     Running     0               12m
kube-system   cert-exporter-daemonset-9mvt9                                         1/1     Running     0               12m
kube-system   cert-exporter-daemonset-tgpq2                                         1/1     Running     0               12m
kube-system   cert-exporter-daemonset-whnnb                                         1/1     Running     0               12m
kube-system   cert-exporter-deployment-85c658c656-kvppj                             1/1     Running     0               12m
kube-system   cert-manager-cainjector-6dc9c79bfd-swh79                              1/1     Running     0               10m
kube-system   cert-manager-controller-7b7c4c77c4-5hqcp                              1/1     Running     0               10m
kube-system   cert-manager-webhook-6bf4c564bb-m62b4                                 1/1     Running     0               10m
kube-system   cert-manager-webhook-6bf4c564bb-p6lwr                                 1/1     Running     0               10m
kube-system   cilium-ctwdg                                                          1/1     Running     0               14m
kube-system   cilium-jfqsb                                                          1/1     Running     0               14m
kube-system   cilium-kl7l4                                                          1/1     Running     0               14m
kube-system   cilium-operator-58bcdb44cb-vg2mf                                      1/1     Running     0               14m
kube-system   cilium-operator-58bcdb44cb-wgtlq                                      1/1     Running     0               14m
kube-system   cilium-wkkfr                                                          1/1     Running     0               14m
kube-system   coredns-controlplane-564bffc48d-w4hwz                                 1/1     Running     0               12m
kube-system   coredns-workers-8666c764cd-fvfvb                                      1/1     Running     0               12m
kube-system   coredns-workers-8666c764cd-p8r6k                                      1/1     Running     0               12m
kube-system   ebs-csi-controller-67c4cf5496-q5txj                                   5/5     Running     0               11m
kube-system   ebs-csi-node-9l6vr                                                    3/3     Running     0               11m
kube-system   ebs-csi-node-hc5sz                                                    3/3     Running     0               11m
kube-system   ebs-csi-node-qpjtl                                                    3/3     Running     0               11m
kube-system   etcd-ip-10-20-101-216.eu-west-2.compute.internal                      1/1     Running     0               17m
kube-system   external-dns-76b8fb8586-4dj2m                                         2/2     Running     2 (7m19s ago)   12m
kube-system   hubble-relay-79bfdd4c6c-brbgx                                         1/1     Running     0               14m
kube-system   kiam-agent-f6tsl                                                      1/1     Running     2 (7m33s ago)   7m36s
kube-system   kiam-agent-tkm59                                                      1/1     Running     2 (7m33s ago)   7m36s
kube-system   kiam-agent-zbzkd                                                      1/1     Running     2 (7m33s ago)   7m36s
kube-system   kiam-namespace-annotation-kube-system-5l9b6                           0/1     Completed   0               9m53s
kube-system   kiam-server-fpvsp                                                     1/1     Running     0               7m37s
kube-system   kube-apiserver-ip-10-20-101-216.eu-west-2.compute.internal            1/1     Running     2 (17m ago)     17m
kube-system   kube-controller-manager-ip-10-20-101-216.eu-west-2.compute.internal   1/1     Running     1 (17m ago)     17m
kube-system   kube-scheduler-ip-10-20-101-216.eu-west-2.compute.internal            1/1     Running     1 (17m ago)     17m
kube-system   kube-state-metrics-6b89676dbf-bqsrq                                   1/1     Running     0               10m
kube-system   metrics-server-7cdb8d8cd8-mjfbs                                       1/1     Running     0               12m
kube-system   metrics-server-7cdb8d8cd8-xcfm2                                       1/1     Running     0               12m
kube-system   net-exporter-2vcfh                                                    1/1     Running     0               12m
kube-system   net-exporter-6h9nm                                                    1/1     Running     0               12m
kube-system   net-exporter-cj4wn                                                    1/1     Running     0               12m
kube-system   net-exporter-j2sq5                                                    1/1     Running     0               12m
kube-system   node-exporter-v1-3-1-4n7vd                                            1/1     Running     0               14m
kube-system   node-exporter-v1-3-1-lh9p9                                            1/1     Running     0               14m
kube-system   node-exporter-v1-3-1-tbg6s                                            1/1     Running     0               14m
kube-system   node-exporter-v1-3-1-wkvgh                                            1/1     Running     0               14m
kube-system   prometheus-operator-app-operator-f6b45b859-qfqvz                      1/1     Running     0               11m
kube-system   prometheus-prometheus-agent-0                                         2/2     Running     0               11m
kube-system   vertical-pod-autoscaler-admission-controller-556558c8d9-ffc5d         1/1     Running     0               11m
kube-system   vertical-pod-autoscaler-admission-controller-556558c8d9-z6spl         1/1     Running     0               11m
kube-system   vertical-pod-autoscaler-recommender-67b5d54d5f-lgvb6                  1/1     Running     0               11m
kube-system   vertical-pod-autoscaler-updater-7bdddf5b-jgk8t                        1/1     Running     0               11m
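
The sharing step mentioned before the pod listing could be done with AWS Resource Access Manager (RAM). The sketch below is an assumption about how that might look with boto3's `create_resource_share`, not a description of what was actually run. The owner account ID `111111111111`, the prefix list ID and the share name are placeholders, because the thread does not state them.

```python
def ec2_arn(region, owner_account, resource_type, resource_id):
    """Build an EC2 resource ARN, e.g. for a transit gateway or a prefix list."""
    return f"arn:aws:ec2:{region}:{owner_account}:{resource_type}/{resource_id}"


def share_request(name, resource_arns, workload_account):
    """Assemble the keyword arguments for RAM's CreateResourceShare call."""
    return {
        "name": name,
        "resourceArns": list(resource_arns),
        "principals": [workload_account],
    }


def share_with_account(request, region):
    """Send the request to RAM from the account that owns the resources."""
    import boto3  # third-party; imported lazily

    return boto3.client("ram", region_name=region).create_resource_share(**request)


request = share_request(
    "example-tgw-share",  # placeholder share name
    [
        # 111111111111 stands in for the (unstated) management cluster account.
        ec2_arn("eu-west-2", "111111111111", "transit-gateway", "tgw-034e681b2d0288423"),
        ec2_arn("eu-west-2", "111111111111", "prefix-list", "pl-EXAMPLE"),
    ],
    "180547736195",
)
print(request)
```

If the two accounts are not in the same AWS Organization, the workload cluster account typically also has to accept the share invitation before the resources become usable.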

alex-dabija commented 1 year ago

Blocked until https://github.com/giantswarm/roadmap/issues/1801 is fixed.

calvix commented 1 year ago

Instructions on how to create the role are here: https://github.com/giantswarm/giantswarm-aws-account-prerequisites/tree/master/capa-controller-role (use INSTALLATION_NAME goat).

We should create the role in the AWS account 180547736195 and then just create another AWSClusterRoleIdentity, using the default CR in goat as a reference.
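
As a rough idea of what such a second identity might contain, the sketch below builds the manifest as a Python dict and prints it as JSON, which `kubectl apply -f` also accepts. Everything specific here is an assumption: the API version (`v1beta1` was common for CAPA at the time), the role name inside the ARN, and the `allowedNamespaces` list. The real values should be copied from the existing default CR.

```python
import json


def role_identity(name, role_arn, namespaces,
                  api_version="infrastructure.cluster.x-k8s.io/v1beta1"):
    """Build an AWSClusterRoleIdentity manifest that assumes `role_arn`."""
    return {
        "apiVersion": api_version,  # assumption; check the installed CAPA version
        "kind": "AWSClusterRoleIdentity",
        "metadata": {"name": name},
        "spec": {
            # Namespaces whose clusters may use this identity.
            "allowedNamespaces": {"list": list(namespaces)},
            "roleARN": role_arn,
            # The controller's own identity is used to assume the role above.
            "sourceIdentityRef": {
                "kind": "AWSClusterControllerIdentity",
                "name": "default",
            },
        },
    }


manifest = role_identity(
    "default2",
    # Placeholder role name; the real one comes from the prerequisites repo.
    "arn:aws:iam::180547736195:role/example-capa-controller-role",
    ["org-giantswarm"],
)
print(json.dumps(manifest, indent=2))
```

The `awsClusterRole: default2` value in the cluster config shown earlier in this thread would then refer to this identity by name.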

mnitchev commented 1 year ago

Blocking again on https://github.com/giantswarm/roadmap/issues/1801 as the transit gateway needs to also be shared with the account automatically

fiunchinho commented 1 year ago

can this one be moved out of the blocked column into the sprint backlog now that https://github.com/giantswarm/roadmap/issues/1801 is closed?

alex-dabija commented 1 year ago

thunder / dev00 was recreated in a separate account.