kubernetes / autoscaler

Autoscaling components for Kubernetes
Apache License 2.0

`magnum`: No suitable endpoint could be found in the service catalog. #6987

Closed: yellowhat closed this issue 4 months ago

yellowhat commented 5 months ago

Which component are you using?: magnum (cluster-autoscaler)

What version of the component are you using?:

Component version: 1.28.0

What k8s version are you using (kubectl version)?:

```console
$ kubectl version
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.2+k3s1
```

What environment is this in?: ovh (openstack)

What did you expect to happen?:

cluster-autoscaler is able to connect and authenticate to OpenStack (nova)

What happened instead?:

Hi, I am using the following cloud-config file:

```ini
[Global]
auth-url=https://auth.cloud.ovh.net/v3/
username=user-xxx
password=yyy
tenant-id=zzz
domain-name=default
```

These are the same credentials and options I use to authenticate via the OpenStack Terraform provider.
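For what it's worth, a malformed config file can be ruled out locally. A quick stdlib sketch (the list of required fields is taken from the config above, not from the provider's code, so treat it as an assumption):

```python
from configparser import ConfigParser

# Illustrative cloud-config with the same [Global] fields as above
# (placeholder values, not real credentials).
CLOUD_CONFIG = """
[Global]
auth-url=https://auth.cloud.ovh.net/v3/
username=user-xxx
password=yyy
tenant-id=zzz
domain-name=default
"""

def check_cloud_config(text: str) -> list[str]:
    """Return the [Global] keys that are missing or empty."""
    parser = ConfigParser()
    parser.read_string(text)
    # Assumed field list, mirroring the config shown in this issue.
    required = ["auth-url", "username", "password", "tenant-id", "domain-name"]
    section = parser["Global"] if parser.has_section("Global") else {}
    return [key for key in required if not section.get(key)]

print(check_cloud_config(CLOUD_CONFIG))  # [] means every field is present
```

An empty list only confirms the file parses and has values; it says nothing about whether the credentials themselves are valid.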

```yaml
...
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.1
          imagePullPolicy: Always
          command:
            - ./cluster-autoscaler
            - --alsologtostderr
            - --cluster-name=cluster.local
            - --cloud-config=/config/cloud-config
            - --cloud-provider=magnum
            - --nodes=1:10:DefaultNodeGroup
            - --v=2
          volumeMounts:
            - name: cloud-config
              mountPath: /config
              readOnly: true
...
```

but I get the following error on boot:

```
[leaderelection.go:250] attempting to acquire leader lease kube-system/cluster-autoscaler...
[leaderelection.go:260] successfully acquired lease kube-system/cluster-autoscaler
[request.go:697] Waited for 1.198450606s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/apis/apps/v1/statefulsets?limit=500&resourceVersion=0
[request.go:697] Waited for 2.397401073s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/api/v1/nodes?limit=500&resourceVersion=0
[cloud_provider_builder.go:29] Building magnum cloud provider.
[magnum_cloud_provider.go:341] Failed to create magnum manager: could not create container-infra client: No suitable endpoint could be found in the service catalog.
```

If I instead inject the OS_* environment variables:

```yaml
...
          env:
            - name: OS_AUTH_URL
              valueFrom:
                secretKeyRef:
                  name: foo
                  key: auth-url
            - name: OS_USERNAME
              valueFrom:
                secretKeyRef:
                  name: foo
                  key: username
            - name: OS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: foo
                  key: password
            - name: OS_DOMAIN_NAME
              valueFrom:
                secretKeyRef:
                  name: foo
                  key: domain-name
            - name: OS_TENANT_ID
              valueFrom:
                secretKeyRef:
                  name: foo
                  key: tenant-id
            - name: OS_REGION_NAME
              valueFrom:
                secretKeyRef:
                  name: foo
                  key: region
...
```

I get:

```
[leaderelection.go:250] attempting to acquire leader lease kube-system/cluster-autoscaler...
[leaderelection.go:260] successfully acquired lease kube-system/cluster-autoscaler
[request.go:697] Waited for 1.198664763s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/api/v1/pods?fieldSelector=status.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0
[request.go:697] Waited for 2.198517653s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/apis/apps/v1/daemonsets?limit=500&resourceVersion=0
[cloud_provider_builder.go:29] Building magnum cloud provider.
[magnum_cloud_provider.go:341] Failed to create magnum manager: could not create provider client: could not authenticate client: You must provide a password to authenticate
```
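The second error suggests the password never reaches the authentication code. Before digging into the provider, it may be worth confirming the variables are actually set in the running container. A small stdlib sketch (which OS_* names the magnum provider actually honours is an assumption here; the list below just mirrors the manifest above):

```python
import os

# OS_* variables from the manifest above. Whether the magnum provider
# reads exactly this set is an assumption to verify against its docs.
REQUIRED = [
    "OS_AUTH_URL",
    "OS_USERNAME",
    "OS_PASSWORD",
    "OS_DOMAIN_NAME",
    "OS_TENANT_ID",
    "OS_REGION_NAME",
]

def missing_env(environ=os.environ) -> list[str]:
    """Return the names of required OS_* variables that are unset or empty."""
    return [name for name in REQUIRED if not environ.get(name)]

if __name__ == "__main__":
    print(missing_env())
```

Running `kubectl exec` with `env` against the pod (and grepping for `OS_`) would show the same thing without deploying extra code, e.g. to catch a secret key that resolves to an empty string.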

Any suggestions?

Thanks

Shubham82 commented 4 months ago

/area provider/magnum
/area cluster-autoscaler

Shubham82 commented 4 months ago

cc @tghartland

tghartland commented 4 months ago

Hi @yellowhat,

I'm not familiar with the OVH OpenStack environment, but they do have their own cloud provider for the cluster autoscaler, separate from the OpenStack magnum provider. Try giving that a go; it might just work out of the box.

yellowhat commented 4 months ago

@tghartland Thanks for the reply.

I am interested in using OVH as a generic OpenStack environment, to understand whether the magnum provider can easily be used on other providers as well.

Thanks for the pointer, I will have a look at the OVH provider.

In the meantime, do you have any suggestions on how I can gather more information?

yellowhat commented 4 months ago

Also: "The cluster autoscaler for OVHcloud scales worker nodes within any OVHcloud Kubernetes cluster's node pool."

I am not using the OVH managed Kubernetes offering; I would like to manually bootstrap a Kubernetes cluster on OVH.

tghartland commented 4 months ago

Magnum is OpenStack's own API for managed Kubernetes clusters. If you're not using magnum to create the cluster, then there's nothing the autoscaler (magnum provider) can do, as it interacts with those API resources.

From the error message `No suitable endpoint could be found in the service catalog`, it sounds like magnum isn't even available there, with OVH providing their own solution instead.
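That catalog claim is easy to verify: Keystone's v3 catalog lists each service with a `type`, and the error above shows the autoscaler asking for a `container-infra` (magnum) client. A sketch of the check against a sample catalog (the sample data is illustrative, not OVH's actual catalog; `openstack catalog list` shows the real one):

```python
# A Keystone v3 service catalog maps service types to endpoints; the
# magnum client needs an entry of type "container-infra". Sample data
# below is made up for illustration.
SAMPLE_CATALOG = [
    {"type": "compute", "name": "nova"},
    {"type": "identity", "name": "keystone"},
    # no {"type": "container-infra", "name": "magnum"} entry
]

def has_service(catalog: list[dict], service_type: str) -> bool:
    """True if any catalog entry advertises the given service type."""
    return any(entry.get("type") == service_type for entry in catalog)

print(has_service(SAMPLE_CATALOG, "container-infra"))  # False here
```

If `container-infra` is absent from the real catalog, no client-side configuration will make the magnum provider work.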

yellowhat commented 4 months ago

Ahhh, I had that feeling. But I was hoping that magnum could fall back to interfacing directly with nova.

Is there another provider that can interact with nova?

tghartland commented 4 months ago

Not directly with nova, as the task of installing Kubernetes on the new VMs would then have to be done by the autoscaler, which is a lot of extra work.

It might be possible with a combination of Cluster API for OpenStack and the autoscaler provider for Cluster API, but you'd have to bootstrap it from an existing cluster, as it uses Kubernetes custom resources to manage the cloud resources. I'll leave that for you to investigate if you want to try it.

yellowhat commented 4 months ago

Thanks for the pointer