Hi @Stringls, from the log above, I believe the CP node has been up and initialized successfully. Could you confirm whether you can access the CP service of the external cluster from your mgmt cluster? If the answer is yes, it would help diagnose the problem if you could share the logs of the CAPI controllers.
@Stringls Thanks for reporting! May I ask you to confirm the CP service type? Currently the LoadBalancer type is required in the external Virtink cluster; you can refer to this document for more details: https://github.com/smartxworks/cluster-api-provider-virtink/blob/main/docs/external-cluster.md
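For illustration, setting the CP service type on the VirtinkCluster object might look roughly like the sketch below; the apiVersion and the cluster name here are assumptions, so check them against the provider's templates.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1   # assumed API group/version
kind: VirtinkCluster
metadata:
  name: my-external-cluster                           # hypothetical name
spec:
  controlPlaneServiceType: LoadBalancer               # exposes the CP through an LB Service in the external cluster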
@fengye87 @carezkh Thanks for the quick response. I use ClusterIP, as I described in the issue I opened earlier, and specify a controlPlaneEndpoint that is the IPv4 of the LB acting as the kube-apiserver, but as I understand it, I have to use LoadBalancer. When I use the LoadBalancer service type, it creates a new service in the external cluster.
The problem is: I use Hetzner cloud, and when I create a LoadBalancer service it tries to create an LB in Hetzner cloud, but that needs some configuration that I have to put in the annotations: section of the service. As I understand it, I cannot specify any annotations in the VirtinkCluster CRD for controlPlaneServiceType. On the other hand, I already have an existing LoadBalancer service with an external IP, but it does not point to the API server.
So right now I have only one service with an external IP:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller LoadBalancer 10.110.119.214 ************** 80:32190/TCP,443:30871/TCP 19h
ingress-nginx-controller-admission ClusterIP 10.102.136.251 <none> 443/TCP 19h
@Stringls I want to make sure I understand this correctly. The LB service created by the VirtinkCluster controller in Hetzner cloud will not be assigned an external IP, because the LB service does not have the annotations required by Hetzner cloud.
Currently, you cannot specify any annotations on the VirtinkCluster CP service; support for that is on the way! As a workaround, can you try to add the annotations to the LB service on Hetzner cloud manually?
@carezkh I have an update on the issue. I've added the field controlPlaneServiceType: LoadBalancer. It deploys the service on the external cluster, but I get an error about the required annotations (I'll be able to send it later).
When I manually add the annotations below to the service, everything is set up properly:
annotations:
  load-balancer.hetzner.cloud/location: fsn1
  load-balancer.hetzner.cloud/use-private-ip: "false"
  load-balancer.hetzner.cloud/ipv6-disabled: "true"
  load-balancer.hetzner.cloud/disable-private-ingress: "true"
The LB in the Hetzner cloud is up and running, the MD is added as a target server, and the virtink cluster is initialized.
It's not related to this issue, but when I tried to install Calico on it, the cluster was unable to pull an image from the public registry registry1.docker.io. The error is about DNS name resolution. I don't have the logs right now, but I hope you get the idea.
@Stringls Are you using the default Virtink VM rootfs image smartxworks/capch-rootfs-1.24.0?
@carezkh Yes, I use smartxworks/capch-rootfs-1.24.0 and smartxworks/capch-kernel-5.15.12 for both CP and MD.
@Stringls Sorry for the late reply! May I ask you to confirm that the pod subnet and service subnet of the nested cluster don't overlap with the host cluster's pod subnet, service subnet, or physical subnet?
The host cluster here refers to the external Virtink cluster, and if the service subnet of the nested cluster overlaps with that of the host cluster, DNS may not function in the nested cluster.
[update] Actually, we have a handy tool, knest, to build nested K8S clusters based on Virtink and the Cluster API provider. Please refer to the project for more usage guides and known issues, and feel free to give us feedback by opening issues.
@carezkh Hi! That was actually the problem. Basically, it's set up like this. I guess I got this overlapping problem because I deploy a CAPI workload cluster onto another CAPI workload cluster.
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
I set services.cidrBlocks to 10.98.0.0/16 :). Could you please give any tips on which CIDR block is better to use for the services subnet?
One small thing that's probably not related to this issue. When I delete a virtink cluster on a host cluster from the mgmt cluster, the virtink cluster cannot be deleted completely; it just gets stuck in a Deleting loop, and I believe that happens because the LB service is being deleted simultaneously with the Pods. So is it possible to set a priority for deleting the resources of a virtink cluster?
To summarize, I've got a working cluster, but I need to manually:
- add annotations to the virtink LB service on the host cluster
- delete the Machine in the mgmt cluster

"Currently, you cannot specify any annotations on the VirtinkCluster CP service; support for that is on the way!"
May I ask when annotations support for the VirtinkCluster CP service will be added? Could you please share how you deploy an LB service on a host cluster? I would be glad to contribute to speed it up.
Thanks!
@Stringls There are 3 private IPv4 ranges: 10.0.0.0/8 (class A), 172.16.0.0/12 (class B), and 192.168.0.0/16 (class C). You can choose one of them, or a subnet of them, for the pod/service subnet; just avoid overlapping.
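For example, a nested-cluster network that picks CIDRs from different private ranges might look like the sketch below; the specific values are assumptions and need to be checked against your host cluster's pod, service, and physical subnets.

spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 172.20.0.0/16    # assumed free; outside the host's pod and physical subnets
    services:
      cidrBlocks:
        - 10.112.0.0/16    # assumed not to overlap with the host's service subnet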
How did you delete the Virtink cluster, with the command kubectl delete -f cluster-template-xx.yaml? If so, the Virtink cluster cannot be deleted successfully. Currently, you should delete the cluster.v1beta1.cluster.x-k8s.io object (not the Virtink cluster) first, using the command kubectl delete cluster.v1beta1.cluster.x-k8s.io <cluster-name>, and leave the cluster's related resources for the controllers to delete.
To support annotations on the VirtinkCluster CP service, you would update the relevant field of VirtinkClusterSpec and update the function buildControlPlaneService; some manifests and documents would need to be updated at the same time.
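Purely as an illustration of what such a spec change could expose (the field name below is hypothetical, not the provider's actual API; see the PR linked in the update below for the real implementation):

spec:
  controlPlaneServiceType: LoadBalancer
  controlPlaneServiceAnnotations:                      # hypothetical field name, for illustration only
    load-balancer.hetzner.cloud/location: fsn1
    load-balancer.hetzner.cloud/use-private-ip: "false"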
[update]: You can try the commit in https://github.com/smartxworks/cluster-api-provider-virtink/pull/35. Use the command skaffold run to deploy the controllers on your management cluster, and refer to the "installing skaffold" document to install that tool.
@carezkh Thanks!
I was deleting it with the command kubectl delete -f cluster-template.yaml. It works if I use the kubectl delete cluster <cluster-name> command.
One more question, if you don't mind: is it possible to expose the virtink cluster to the internet? I'd like to do this with Ingress NGINX.
@Stringls Do you mean exposing the nested Virtink cluster CP Service to the outside world by using an Ingress instead of a LoadBalancer or NodePort Service?
Currently, there is no support for using an Ingress as the CP Service; we will consider implementing it. Any patches from users will be appreciated!
@carezkh Sorry, my question wasn't clear enough.
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 192.168.0.1 <none> 443/TCP 16h
ingress-nginx ingress-nginx-controller LoadBalancer 192.168.192.250 <pending> 80:30680/TCP,443:31654/TCP 16h
ingress-nginx ingress-nginx-controller-admission ClusterIP 192.168.198.209 <none> 443/TCP 16h
kube-system kube-dns ClusterIP 192.168.0.10 <none> 53/UDP,53/TCP,9153/TCP 16h
When I try to deploy the Ingress NGINX helm chart onto the virtink cluster, the LB service stays in the pending state, which is okay I guess. So the question is the one above: is it possible to expose the virtink cluster to the internet with Ingress NGINX?
^ That's actually my main point of using virtink clusters :)
@Stringls It seems there is no LoadBalancer Service controller in your nested K8S cluster (the cluster you created on the host Virtink cluster with the Cluster API provider; there were some mistakes in my previous description), so the LoadBalancer Service in your nested K8S cluster will never be assigned an external IP.
You can try MetalLB as the LB Service controller in your nested K8S cluster, but I don't recommend using a LoadBalancer ingress-nginx-controller Service here. If the host Virtink cluster does not use a bridge-mode CNI, the external IP assigned to the ingress-nginx-controller Service in your nested K8S cluster may not be accessible in the host Virtink cluster. Bridge-mode CNI here refers to Kube-OVN, Everoute, etc., not Calico (which works with the BGP protocol). And if you cannot access the nested K8S cluster's LoadBalancer ingress-nginx-controller Service in your host Virtink cluster, you cannot proxy it to the outside world.
Instead, it's recommended to use a NodePort ingress-nginx-controller Service in your nested K8S cluster; refer to bare-metal-clusters for more details. You can then access this ingress-nginx-controller Service in the host Virtink cluster through <node-ip>:<node-port>, where node-ip is the IP of one of the nested K8S cluster's nodes; each node is a Virtink VM, and the VM's IP comes from the VM pod running in your host Virtink cluster.
Now you can proxy the ingress-nginx-controller Service of the nested K8S cluster to the outside world by using a LoadBalancer Service in your host Virtink cluster, with its selector set to the VM pod's label and its targetPort set to the <node-port> above.
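A rough sketch of such a host-cluster Service, assuming the nested cluster's ingress-nginx NodePort is 30080 and the VM pods carry the cluster.x-k8s.io/cluster-name label mentioned later in this thread (the name, port, and annotation values are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: nested-ingress-proxy                           # hypothetical name
  annotations:
    load-balancer.hetzner.cloud/location: fsn1         # Hetzner LB annotation, as discussed above
spec:
  type: LoadBalancer
  selector:
    cluster.x-k8s.io/cluster-name: my-nested-cluster   # assumed label carried by the nested cluster's VM pods
  ports:
    - name: http
      port: 80
      targetPort: 30080                                # the nested cluster's ingress-nginx NodePort (assumed value)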
@carezkh I apologize for the late reply, and thank you for your time.
I took a look at your suggestion and tried to implement it. The proxy seems to be working, and I can access apps by referring to their NodePort as the targetPort in the LoadBalancer that is in the host Virtink cluster.
Is there a way to automate the above process: checking what the NodePort is, checking which VM pod the application has been deployed on, and checking the IP of that pod? I'm wondering how to implement all of the above in a CI pipeline :)
Update: could I use kube proxy here?
@Stringls Maybe you can achieve it with a script; some tips are:
- specify a fixed nodePort for the NodePort Service, instead of letting it be allocated by K8S (see the sketch below)
- create an LB Service with the selector cluster.x-k8s.io/cluster-name: <cluster-name> in the host Virtink cluster. The cluster-name here is the name of the cluster.v1beta1.cluster.x-k8s.io that you created, and the LB Service will choose VM Pods as its endpoints.

Does the kube proxy here refer to the command kubectl proxy or the component kube-proxy?
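For the first tip, a minimal sketch of a NodePort Service with a fixed nodePort; the name, labels, and port values are assumptions and should be adjusted to your ingress-nginx deployment.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: ingress-nginx              # assumed label of the controller pods
  ports:
    - name: http
      port: 80
      nodePort: 30080                                  # fixed value, so the host LB's targetPort is deterministic
    - name: https
      port: 443
      nodePort: 30443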
@carezkh I am so sorry for answering late; I am not working on it right now :(
I meant kubectl proxy. If it's possible to avoid creating a new LB for exposing the apps I deploy onto the virtink cluster, that would be great.
"create an LB Service with the selector cluster.x-k8s.io/cluster-name: <cluster-name> in the host Virtink cluster. The cluster-name here is the name of the cluster.v1beta1.cluster.x-k8s.io that you created, and the LB Service will choose VM Pods as its endpoints."
Perhaps I'm doing something wrong, but if I set this selector, it doesn't work stably. I believe it's because cluster.x-k8s.io/cluster-name: <cluster-name> is present on all machines and the LB does round-robin, while my application is deployed on a specific VM Pod, so I get timeout errors.
I tested it by logging into a dnsutils pod created in the same namespace as the Virtink pods and curling <node-ip>:<node-port>. When I deploy an app to one node and continuously curl that node:port, everything is okay. That's why I used a selector on one of the nodes.
@Stringls Of course, you can use the command kubectl proxy to proxy traffic of a specific VM node to the outside world, rather than using a LoadBalancer Service in the host Virtink cluster.
"When I deploy an app to one node and continuously curl that node:port, everything is okay."
It looks like you configured the externalTrafficPolicy or internalTrafficPolicy field of the NodePort Service in the nested K8S cluster to Local?
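For reference, that behavior matches a Service configured roughly like the sketch below (values assumed): with Local, a node only forwards traffic to endpoints running on that same node, so other nodes' node ports time out.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller                       # hypothetical, matching the earlier sketch
spec:
  type: NodePort
  externalTrafficPolicy: Local                         # only route to pods on the receiving node
  ports:
    - name: http
      port: 80
      nodePort: 30080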
@carezkh Cool, thanks!
"It looks like you configured the externalTrafficPolicy or internalTrafficPolicy field of the NodePort Service in the nested K8S cluster to Local?"
I can't be sure that we used Local for externalTrafficPolicy, but since we fixed the networking issue it seems to be fine.
I'm going to close the issue. Thank you @fengye87 @carezkh so much for the help and for what you are doing; it's an amazing project!
/kind bug
What I am trying to do and what is happening
Hi! I deploy an external virtink cluster on a BM Hetzner cluster (a bare-metal node with KVM) from a GKE mgmt cluster.
The virtink service is created. The CP VM is created on the external cluster with the below logs. Nothing else happens after the cloud-init script finishes.
CAPI logs
Virtink controller logs
What I expect to happen
A virtink cluster to be created
Env
cluster-api-virtink version: latest; kubernetes version: