openshift / hypershift

Hyperscale OpenShift - clusters with hosted control planes
https://hypershift-docs.netlify.app
Apache License 2.0

Hypershift does not support non-default CNI on the Hosted Cluster #3927

Closed: sujeet-kr closed this issue 1 month ago

sujeet-kr commented 6 months ago

I tried provisioning a Hosted Cluster with Hypershift on AWS, setting --network-type to Other:

hypershift create cluster aws \
  --name $CLUSTER_NAME \
  --node-pool-replicas=3 \
  --base-domain $BASE_DOMAIN \
  --pull-secret $PULL_SECRET \
  --aws-creds $AWS_CREDS \
  --region $REGION \
  --network-type Other \
  --generate-ssh

The command did not error out and the nodes for the hosted cluster were created, but kubectl get --namespace clusters hostedclusters shows the following error for the hosted cluster: ValidConfiguration condition is false: service type OVNSbDb not found.

I also tried creating the Hosted Cluster from a YAML spec with HostedCluster.spec.networking.networkType set to Other, and got the same error as with the CLI.
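For reference, the relevant stanza of the spec looked roughly like this (trimmed for illustration; the full rendered manifest is pasted further down in this thread):

spec:
  networking:
    clusterNetwork:
    - cidr: 10.128.0.0/14
    machineNetwork:
    - cidr: 10.0.0.0/16
    networkType: Other
    serviceNetwork:
    - cidr: 172.30.0.0/16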

Hosting cluster OCP version: 4.15.4
Hosted cluster version tried: 4.15.9

NAME                  VERSION   KUBECONFIG                             PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
sk-hcp-hosted55                                                    Partial     False       False         ValidConfiguration condition is false: service type OVNSbDb not found
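
In case it helps, the full condition message can be pulled with something like the following (a sketch using the names from the output above; kubectl describe hostedcluster works as well):

kubectl get hostedcluster sk-hcp-hosted55 \
  --namespace clusters \
  -o jsonpath='{.status.conditions[?(@.type=="ValidConfiguration")].message}'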
csrwng commented 6 months ago

@sujeet-kr you may want to update your hypershift CLI. It looks like you are using a very old version. Try using one built from main.
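
For reference, building the CLI from main looks roughly like this (a sketch of the steps in the getting-started guide; the exact make target and install path may differ in your environment):

git clone https://github.com/openshift/hypershift.git
cd hypershift
make build
sudo install -m 0755 bin/hypershift /usr/local/bin/hypershift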

sujeet-kr commented 6 months ago

Thank you for your response, @csrwng. The issue above occurred with a CLI built from main. I followed the steps from https://hypershift-docs.netlify.app/getting-started/

csrwng commented 6 months ago

The service type OVNSbDb not found error should not occur with the latest CLI. Can you paste the YAML of the HostedCluster created by the CLI?
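For example, something like the following should dump it (the cluster name is a placeholder); alternatively, re-running hypershift create cluster aws with the --render flag prints the manifests instead of applying them:

kubectl get hostedcluster <cluster-name> --namespace clusters -o yaml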

sujeet-kr commented 6 months ago

Here is the YAML:

apiVersion: v1
kind: Namespace
metadata:
  creationTimestamp: null
  name: clusters
spec: {}
status: {}
---
apiVersion: v1
data:
  .dockerconfigjson: redacted
kind: Secret
metadata:
  creationTimestamp: null
  labels:
    hypershift.openshift.io/safe-to-delete-with-cluster: "true"
  name: sujeet-hcp-redacted-pull-secret
  namespace: clusters
---
apiVersion: v1
data:
  key: redacted
kind: Secret
metadata:
  creationTimestamp: null
  labels:
    hypershift.openshift.io/safe-to-delete-with-cluster: "true"
  name: sujeet-hcp-redacted-etcd-encryption-key
  namespace: clusters
type: Opaque
---
apiVersion: v1
data:
  id_rsa: redacted
  id_rsa.pub: redacted
kind: Secret
metadata:
  creationTimestamp: null
  labels:
    hypershift.openshift.io/safe-to-delete-with-cluster: "true"
  name: sujeet-hcp-redacted-ssh-key
  namespace: clusters
---
apiVersion: hypershift.openshift.io/v1beta1
kind: HostedCluster
metadata:
  creationTimestamp: null
  name: sujeet-hcp-redacted
  namespace: clusters
spec:
  autoscaling: {}
  configuration: {}
  controllerAvailabilityPolicy: SingleReplica
  dns:
    baseDomain: redacted
    privateZoneID: redacted
    publicZoneID: redacted
  etcd:
    managed:
      storage:
        persistentVolume:
          size: 8Gi
          storageClassName: gp3-csi
        type: PersistentVolume
    managementType: Managed
  fips: false
  infraID: sujeet-hcp-redacted-4l4wm
  issuerURL: redacted
  networking:
    clusterNetwork:
    - cidr: 10.128.0.0/14
    machineNetwork:
    - cidr: 10.0.0.0/16
    networkType: Other
    serviceNetwork:
    - cidr: 172.30.0.0/16
  olmCatalogPlacement: management
  platform:
    aws:
      cloudProviderConfig:
        subnet:
          id: redacted
        vpc: redacted
        zone: us-west-2a
      endpointAccess: Public
      multiArch: false
      region: us-west-2
      rolesRef:
        controlPlaneOperatorARN: redacted
        imageRegistryARN: redacted
        ingressARN: redacted
        kubeCloudControllerARN: redacted
        networkARN: redacted
        nodePoolManagementARN: redacted
        storageARN: redacted
    type: AWS
  pullSecret:
    name: sujeet-hcp-redacted-pull-secret
  release:
    image: ""
  secretEncryption:
    aescbc:
      activeKey:
        name: sujeet-hcp-redacted-etcd-encryption-key
    type: aescbc
  services:
  - service: APIServer
    servicePublishingStrategy:
      type: LoadBalancer
  - service: OAuthServer
    servicePublishingStrategy:
      type: Route
  - service: Konnectivity
    servicePublishingStrategy:
      type: Route
  - service: Ignition
    servicePublishingStrategy:
      type: Route
  sshKey:
    name: sujeet-hcp-redacted-ssh-key
status:
  controlPlaneEndpoint:
    host: ""
    port: 0
---
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
  creationTimestamp: null
  name: sujeet-hcp-redacted-us-west-2a
  namespace: clusters
spec:
  arch: amd64
  clusterName: sujeet-hcp-redacted
  management:
    autoRepair: false
    upgradeType: Replace
  nodeDrainTimeout: 0s
  platform:
    aws:
      instanceProfile: sujeet-hcp-redacted-4l4wm-worker
      instanceType: t3.xlarge
      rootVolume:
        size: 120
        type: gp3
      subnet:
        id: redacted
    type: AWS
  release:
    image: ""
  replicas: 3
status:
  replicas: 0
---
sujeet-kr commented 6 months ago

Not sure if this will be useful, but here is the version of Hypershift I built and used. hypershift version reports: openshift/hypershift: e446c102eaae97f592e2fb309d325375d46b766a. Latest supported OCP: 4.16.0

csrwng commented 6 months ago

@sujeet-kr can you include the version of the hypershift operator on the cluster? (should be first line in log)
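
For example, assuming the default install layout (an operator deployment named operator in the hypershift namespace), the first log line can be pulled with:

kubectl logs --namespace hypershift deployment/operator | head -n 1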

sujeet-kr commented 6 months ago

I had to recreate the management cluster. Here is the log line with the version:

{"level":"info","ts":"2024-04-29T15:16:04Z","logger":"setup","msg":"Starting hypershift-operator-manager","version":"openshift/hypershift: f2c300f678e2a6b4bca7eaf35dcde4b204f7217e. Latest supported OCP: 4.16.0"}

{"level":"info","ts":"2024-04-29T15:16:09Z","logger":"setup","msg":"using hosted control plane operator image","operator-image":"quay.io/hypershift/hypershift-operator@sha256:19d6d0494056092fb22c9442252681a3dfcb7681c7f7fafa6c95e99fa3346b64"}

openshift-bot commented 3 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 2 months ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot commented 1 month ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci[bot] commented 1 month ago

@openshift-bot: Closing this issue.

In response to [this](https://github.com/openshift/hypershift/issues/3927#issuecomment-2380298687):

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue by commenting `/reopen`.
> Mark the issue as fresh by commenting `/remove-lifecycle rotten`.
> Exclude this issue from closing again by commenting `/lifecycle frozen`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.