oracle / cluster-api-provider-oci

Kubernetes Cluster API Provider for Oracle Cloud Infrastructure
https://oracle.github.io/cluster-api-provider-oci/
Apache License 2.0

Existing cluster cannot be adopted #305

Closed alam0rt closed 1 year ago

alam0rt commented 1 year ago

What happened:

After recreating my local kind cluster and installing CAPI into it along with the cluster manifests (the OKE cluster already exists; I am only recreating the kind cluster), the CAPOCI provider was unable to use the existing OKE cluster. Instead, it tried to create a new one, as the CreateCluster operation in the error below shows:

namespace="default" name="banshee" reconcileID=e7c4e3ec-6ca7-4200-a0bb-e7ac4e538ef5 OCIManagedCluster="default/banshee"
E0723 06:46:40.994077       1 controller.go:329] "Reconciler error" err=<
        failed to reconcile OCI Managed Control Plane default/banshee: Error returned by ContainerEngine Service. Http Status Code: 400. Error Code: LimitExceeded. Opc request id: dba88a18f8184599c46a93f633c442a5/CA01F8F1ED78ED2575932E866CA322E2/883BB12D2BDAA9905997455E4CDE0DD3. Message: The cluster limit for this tenancy has been exceeded.
        Operation Name: CreateCluster
        Timestamp: 2023-07-23 06:46:40 +0000 GMT
        Client Version: Oracle-GoSDK/65.40.1
        Request Endpoint: POST https://containerengine.ap-melbourne-1.oci.oraclecloud.com/20180222/clusters
        Troubleshooting Tips: See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_400__400_limitexceeded for more information about resolving this error.
        Also see https://docs.oracle.com/iaas/api/#/en/containerengine/20180222/Cluster/CreateCluster for details on this operation's requirements.
        To get more info on the failing request, you can set OCI_GO_SDK_DEBUG env var to info or higher level to log the request/response details.
        If you are unable to resolve this ContainerEngine issue, please contact Oracle support and provide them this full error message.
 > controller="ocimanagedcontrolplane" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OCIManagedControlPlane" OCIManagedControlPlane="default/banshee" namespace="default" name="banshee" reconcileID=e7c4e3ec-6ca7-4200-a0bb-e7ac4e538ef5

What you expected to happen: I expected that CAPOCI would see that an existing OKE cluster was already present and would adopt it.

How to reproduce it (as minimally and precisely as possible):

kind create cluster --config kind.yaml || true
kind get kubeconfig --name capi > /tmp/kind.yaml
export KUBECONFIG=/tmp/kind.yaml

clusterctl init --infrastructure oci --wait-providers --addon helm --config ./clusterctl.yaml
clusterctl generate cluster banshee --from template.yaml --config ./clusterctl.yaml > cluster.yaml
kubectl apply -f cluster.yaml --wait

clusterctl get kubeconfig banshee > ./banshee.kubeconfig
kubectl cluster-info --kubeconfig ./banshee.kubeconfig

Anything else we need to know?:

using the https://github.com/oracle/cluster-api-provider-oci/releases/download/v0.12.0/cluster-template-managed.yaml template

Environment:

shyamradhakrishnan commented 1 year ago

@alam0rt CAPOCI cannot adopt an existing cluster; the functionality does not exist. May I know the use case? In general, Cluster API, whatever the provider, has no facility to adopt a cluster. You can use an existing cluster as a management cluster, but you cannot adopt an existing cluster.

alam0rt commented 1 year ago

Hey @shyamradhakrishnan, in order to upgrade/manage the management cluster, I was hoping that I could provision a new kind cluster + clusterctl init + apply the existing management manifests into it and have it reconcile the existing OKE cluster.

shyamradhakrishnan commented 1 year ago

@alam0rt ideally you should use the Cluster API pivot procedure (https://cluster-api.sigs.k8s.io/clusterctl/commands/move.html) to make the management cluster (which I assume is an OKE cluster) self-sustaining. Unfortunately, since you have already deleted the kind cluster, there is nothing that can be done at this point. As I said, reconciling existing objects is not something Cluster API has built in; there are some conversations happening in the community, but nothing concrete. You can create an OCIManagedCluster/OCIManagedControlPlane object, rather than an empty one as you are doing right now, and fill in all the necessary OCIDs etc. after comparing it with an existing custom resource, but it would be a time-consuming process. The simplest option for now is to delete and recreate the setup, if that is at all possible.
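
For future readers, here is a rough sketch of what the pivot could look like while the bootstrap kind cluster still exists. It is not an official CAPOCI procedure. The function names (run, pivot_to_workload_cluster) and the DRY_RUN switch are made up for illustration, and the kubeconfig paths are taken from the reproduction steps in this issue. The clusterctl init flags should match whatever was used for the original init (for example --addon helm and --config ./clusterctl.yaml). By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical sketch of a pivot from a bootstrap cluster to the workload cluster.
#   $1: kubeconfig of the current management (kind) cluster
#   $2: kubeconfig of the target cluster that should manage itself
#   $3: namespace holding the Cluster API objects (assumed "default")
pivot_to_workload_cluster() {
  mgmt_kubeconfig="$1"
  target_kubeconfig="$2"
  namespace="${3:-default}"

  # The target cluster needs the providers installed before objects can be moved.
  run clusterctl init --kubeconfig "$target_kubeconfig" \
    --infrastructure oci --wait-providers

  # Move the Cluster API objects from the kind cluster to the target cluster.
  run clusterctl move --kubeconfig "$mgmt_kubeconfig" \
    --to-kubeconfig "$target_kubeconfig" --namespace "$namespace"
}
```

With the files from the reproduction steps, a dry run would be pivot_to_workload_cluster /tmp/kind.yaml ./banshee.kubeconfig, and setting DRY_RUN=0 would actually execute the commands. After a successful move the kind cluster can be thrown away, because the Cluster API objects then live in the OKE cluster itself.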

alam0rt commented 1 year ago

Ah, thanks for the pointer. That's no issue at all, easy to recreate :)