apache / cloudstack

Apache CloudStack is an open source Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0

Kubernetes cluster is not created successfully on XCP-ng 8.2 #7309

Closed furyflash777 closed 1 year ago

furyflash777 commented 1 year ago
ISSUE TYPE
COMPONENT NAME
 UI
CLOUDSTACK VERSION
4.17.2
SUMMARY

Unable to start kubernetes cluster

Create Kubernetes cluster Command failed due to Internal Server Error

2023-03-04 04:14:15,359 WARN [c.c.a.d.ParamGenericValidationWorker] (qtp1443435931-22:ctx-e0c381d5 ctx-aa087e41) (logid:c20b12ee) Received unknown parameters for command queryAsyncJobResult. Unknown parameters : projectid
2023-03-04 04:14:15,363 INFO [c.c.v.UserVmManagerImpl] (API-Job-Executor-4:ctx-16629d81 job-12113 ctx-e5e933b4) (logid:ee408dc1) VM cannot be configured to be dynamically scalable if any of the service offering's dynamic scaling property, template's dynamic scaling property or global setting is false
2023-03-04 04:14:15,372 DEBUG [c.c.u.d.T.Transaction] (API-Job-Executor-4:ctx-16629d81 job-12113 ctx-e5e933b4) (logid:ee408dc1) Rolling back the transaction: Time = 7 Name = API-Job-Executor-4; called by -TransactionLegacy.rollback:888-TransactionLegacy.removeUpTo:831-TransactionLegacy.close:655-Transaction.execute:38-UserVmManagerImpl.commitUserVm:4295-UserVmManagerImpl.commitUserVm:4518-UserVmManagerImpl.createVirtualMachine:4155-UserVmManagerImpl.createAdvancedVirtualMachine:3668-NativeMethodAccessorImpl.invoke0:-2-NativeMethodAccessorImpl.invoke:62-DelegatingMethodAccessorImpl.invoke:43-Method.invoke:566
2023-03-04 04:14:15,384 DEBUG [c.c.a.ApiServlet] (qtp1443435931-22:ctx-e0c381d5 ctx-aa087e41) (logid:c20b12ee) ===END=== 188.242.17.99 -- GET jobId=ee408dc1-484c-4533-b23f-54a2579a2016&command=queryAsyncJobResult&response=json&projectid=eec0900f-f554-4aa9-aaf2-88ffd0bb0b98
2023-03-04 04:14:15,392 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-4:ctx-16629d81 job-12113) (logid:ee408dc1) Unexpected exception while executing org.apache.cloudstack.api.command.user.kubernetes.cluster.CreateKubernetesClusterCmd
java.lang.NullPointerException
    at com.cloud.vm.UserVmManagerImpl.validateRootDiskResize(UserVmManagerImpl.java:4530)
    at com.cloud.vm.UserVmManagerImpl$4.doInTransaction(UserVmManagerImpl.java:4343)
    at com.cloud.vm.UserVmManagerImpl$4.doInTransaction(UserVmManagerImpl.java:4343)
    at com.cloud.vm.UserVmManagerImpl$4.doInTransaction(UserVmManagerImpl.java:4295)
    at com.cloud.utils.db.Transaction.execute(Transaction.java:40)
    at com.cloud.vm.UserVmManagerImpl.commitUserVm(UserVmManagerImpl.java:4295)
    at com.cloud.vm.UserVmManagerImpl.commitUserVm(UserVmManagerImpl.java:4518)
    at com.cloud.vm.UserVmManagerImpl.createVirtualMachine(UserVmManagerImpl.java:4155)
    at com.cloud.vm.UserVmManagerImpl.createAdvancedVirtualMachine(UserVmManagerImpl.java:3668)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:107)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
    at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
    at com.sun.proxy.$Proxy180.createAdvancedVirtualMachine(Unknown Source)
    at com.cloud.kubernetes.cluster.actionworkers.KubernetesClusterStartWorker.createKubernetesControlNode(KubernetesClusterStartWorker.java:228)
    at com.cloud.kubernetes.cluster.actionworkers.KubernetesClusterStartWorker.provisionKubernetesClusterControlVm(KubernetesClusterStartWorker.java:317)
    at com.cloud.kubernetes.cluster.actionworkers.KubernetesClusterStartWorker.startKubernetesClusterOnCreate(KubernetesClusterStartWorker.java:562)
    at com.cloud.kubernetes.cluster.KubernetesClusterManagerImpl.startKubernetesCluster(KubernetesClusterManagerImpl.java:1143)
    at org.apache.cloudstack.api.command.user.kubernetes.cluster.CreateKubernetesClusterCmd.execute(CreateKubernetesClusterCmd.java:287)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:163)
    at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:106)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:620)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:568)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2023-03-04 04:14:15,402 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-4:ctx-16629d81 job-12113) (logid:ee408dc1) Complete async job-12113, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530"}
2023-03-04 04:14:15,405 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-4:ctx-16629d81 job-12113) (logid:ee408dc1) Publish async job-12113 complete on message bus

STEPS TO REPRODUCE

Log in to the CloudStack UI and create a Kubernetes cluster.

EXPECTED RESULTS
The Kubernetes cluster is created successfully.
ACTUAL RESULTS
Kubernetes cluster creation fails with an error.

kiranchavala commented 1 year ago

@furyflash777 Could you please provide more details of the environment?

Is it KVM, VMware or Xen hypervisor?

Which version of the k8s ISO did you register?

Did you select any existing isolated network when creating the Kubernetes cluster?

http://docs.cloudstack.apache.org/en/latest/plugins/cloudstack-kubernetes-service.html

shwstppr commented 1 year ago

@furyflash777 Deployment is failing while trying to validate the root disk size for the cluster node VMs. What root disk size did you provide in the UI? Also, is the systemvm template for your zone in the Ready state?

furyflash777 commented 1 year ago

> @furyflash777 Could you please provide more details of the environment?
>
> Is it KVM, VMware or Xen hypervisor?
>
> Which version of the k8s ISO did you register?
>
> Did you select any existing isolated network when creating the Kubernetes cluster?
>
> http://docs.cloudstack.apache.org/en/latest/plugins/cloudstack-kubernetes-service.html

Hello, XCP-ng 8.2, build date 2022-02-11, DBV: 0.0.1. I tried different k8s versions (1.20.9 to 1.24.0) and selected an existing isolated network. After the Internal Server Error the cluster stays in the "Starting" state indefinitely, but creation of the k8s nodes never starts.

I also tried using a new network (no VMs, in the Allocated state) and got the same error; the virtual router is in the Stopped state.

furyflash777 commented 1 year ago

> @furyflash777 Deployment is failing while trying to validate the root disk size for the cluster node VMs. What root disk size did you provide in the UI? Also, is the systemvm template for your zone in the Ready state?

I tried the default 8 GB, then 20 GB. Same error. The system VM template is in the Ready state. The console proxy, SSVM and routers were also created successfully from this template.

shwstppr commented 1 year ago

@furyflash777 can you please share the output of the following SQL query:

select * from kubernetes_cluster where uuid='UUID_OF_FAILED_CLUSTER'\G

Also, can you please check if the cluster's node VMs are deployed with the same system template that is used for SSVM and CPVM in the zone?

furyflash777 commented 1 year ago

> @furyflash777 can you please share the output of the following SQL query:
>
> select * from kubernetes_cluster where uuid='UUID_OF_FAILED_CLUSTER'\G
>
> Also, can you please check if the cluster's node VMs are deployed with the same system template that is used for SSVM and CPVM in the zone?

Hello.

mysql> select * from kubernetes_cluster where uuid='2d5ff96f-e510-4bf7-9fbe-3f492dad5824'\G
*************************** 1. row ***************************
                   id: 22
                 uuid: 2d5ff96f-e510-4bf7-9fbe-3f492dad5824
                 name: k8demo
          description: k8 demo
              zone_id: 2
kubernetes_version_id: 7
  service_offering_id: 14
          template_id: 1
           network_id: 210
   control_node_count: 1
           node_count: 1
           account_id: 6
            domain_id: 2
                state: Starting
             key_pair: demouser
                cores: 4
               memory: 4096
  node_root_disk_size: 8
             endpoint:
              created: 2023-03-07 21:37:38
              removed: NULL
                   gc: 0
  autoscaling_enabled: 0
              minsize: NULL
              maxsize: NULL
    security_group_id: NULL
1 row in set (0.00 sec)

furyflash777 commented 1 year ago

mysql> select id, name, removed from template_view;
+-----+---------------------------------------+---------------------+
| id  | name                                  | removed             |
+-----+---------------------------------------+---------------------+
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   1 | SystemVM Template (XenServer)         | NULL                |
|   2 | CentOS 5.3(64-bit) no GUI (XenServer) | 2022-12-14 04:38:45 |
|   3 | SystemVM Template (KVM)               | NULL                |
|   4 | CentOS 5.5(64-bit) no GUI (KVM)       | NULL                |
|   5 | CentOS 5.6(64-bit) no GUI (XenServer) | NULL                |
|   6 | CentOS 6.4(64-bit) GUI (Hyperv)       | NULL                |
|   7 | CentOS 5.3(64-bit) no GUI (vSphere)   | NULL                |
|   8 | SystemVM Template (vSphere)           | NULL                |
|   9 | SystemVM Template (HyperV)            | NULL                |
|  10 | SystemVM Template (LXC)               | NULL                |
|  11 | CentOS 7(64-bit) no GUI (LXC)         | NULL                |
|  12 | SystemVM Template (Ovm3)              | NULL                |
| 200 | vmware-tools.iso                      | NULL                |
| 201 | xs-tools.iso                          | NULL                |
| 202 | Turnkey LAMP                          | NULL                |
| 202 | Turnkey LAMP                          | NULL                |
| 203 | v1.20.9-Kubernetes-Binaries-ISO       | 2023-03-06 22:39:09 |
| 204 | v1.21.5-Kubernetes-Binaries-ISO       | 2023-03-06 22:39:15 |
| 205 | v1.22.6-Kubernetes-Binaries-ISO       | 2023-03-06 22:39:23 |
| 206 | v1.23.3-Kubernetes-Binaries-ISO       | 2023-03-06 22:39:29 |
| 207 | v1.24.0-Kubernetes-Binaries-ISO       | 2023-03-06 12:50:55 |
| 208 | v1.24.0-Kubernetes-Binaries-ISO       | 2023-03-06 22:39:36 |
| 209 | v1.20.9-Kubernetes-Binaries-ISO       | NULL                |
+-----+---------------------------------------+---------------------+
50 rows in set (0.00 sec)

weizhouapache commented 1 year ago

@furyflash777 it looks like the "size" of the systemvm template is NULL. Can you share the result of the MySQL query below?

select id,name,url,size from vm_template where type='SYSTEM';

This seems to be a critical issue. cc @DaanHoogland. I will have a look.
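
As a rough illustration of why a NULL template size could surface as the NullPointerException in validateRootDiskResize, here is a minimal Python sketch. It is not CloudStack's code: the function names, parameters and the comparison itself are assumptions made for illustration, and Python raises a TypeError where Java would throw a NullPointerException on a null value.

```python
from typing import Optional

GIB = 1024 ** 3  # bytes per GiB


def validate_root_disk_resize(template_size_bytes: Optional[int], root_disk_size_gb: int) -> int:
    """Hypothetical check: the requested root disk must not be smaller than the template.

    Returns the requested size in bytes when the check passes.
    """
    requested_bytes = root_disk_size_gb * GIB
    # If template_size_bytes is None (NULL in the database), this comparison
    # raises TypeError, the Python analogue of an unguarded null dereference.
    if requested_bytes < template_size_bytes:
        raise ValueError("root disk size is smaller than the template size")
    return requested_bytes


def validate_root_disk_resize_guarded(template_size_bytes: Optional[int], root_disk_size_gb: int) -> int:
    """Same hypothetical check, but a missing template size is reported explicitly."""
    if template_size_bytes is None:
        raise ValueError("template size is unknown (NULL); cannot validate root disk size")
    return validate_root_disk_resize(template_size_bytes, root_disk_size_gb)
```

With a known template size the check passes for an 8 GB root disk; with a size of None the unguarded version fails with an unhelpful low-level error, while the guarded one names the actual problem.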

weizhouapache commented 1 year ago

I have checked some environments and all look OK. The systemvm template has the correct size in the database, and a Kubernetes cluster can be created successfully. So this is not really critical. Sorry @DaanHoogland

furyflash777 commented 1 year ago

I registered a new template through the GUI and set its type to SYSTEM (id 210). With the new template the k8s cluster deployed successfully.

mysql> select id,name,url,size from vm_template where type='SYSTEM';
+-----+-------------------------------+--------------------------------------------------------------------------------------+------------+
| id  | name                          | url                                                                                  | size       |
+-----+-------------------------------+--------------------------------------------------------------------------------------+------------+
|   1 | SystemVM Template (XenServer) | http://download.cloudstack.org/systemvm/4.17/systemvmtemplate-4.17.2-xen.vhd.bz2     |       NULL |
|   3 | SystemVM Template (KVM)       | https://download.cloudstack.org/systemvm/4.17/systemvmtemplate-4.17.2-kvm.qcow2.bz2  |       NULL |
|   8 | SystemVM Template (vSphere)   | https://download.cloudstack.org/systemvm/4.17/systemvmtemplate-4.17.2-vmware.ova     |       NULL |
|   9 | SystemVM Template (HyperV)    | https://download.cloudstack.org/systemvm/4.17/systemvmtemplate-4.17.2-hyperv.vhd.zip |       NULL |
|  10 | SystemVM Template (LXC)       | https://download.cloudstack.org/systemvm/4.17/systemvmtemplate-4.17.2-kvm.qcow2.bz2  |       NULL |
|  12 | SystemVM Template (Ovm3)      | https://download.cloudstack.org/systemvm/4.17/systemvmtemplate-4.17.2-ovm.raw.bz2    |       NULL |
| 210 | systemvm-xenserver-4.17.2     | http://download.cloudstack.org/systemvm/4.17/systemvmtemplate-4.17.2-xen.vhd.bz2     | 5242880000 |
+-----+-------------------------------+--------------------------------------------------------------------------------------+------------+
7 rows in set (0.00 sec)

Is it possible to delete the old template safely?

shwstppr commented 1 year ago

@furyflash777 you may delete it if it is not used by any system VM or VR. Keep in mind that the zone-level or global config router.template.xenserver may still contain that template's name if it has not been changed. You may have to change the template type to USER first and then delete it from the Zones tab in the UI.

weizhouapache commented 1 year ago

I have tried k8s cluster creation on XCP-ng 8.2 environments and all looks OK. The size of the systemvm template was NULL during zone creation; however, once the SSVM was up, it updated the size and state of the template.
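
The diagnosis in this thread comes down to SYSTEM templates whose size column is still NULL (the failed cluster row reports template_id: 1). The sketch below replays that check against an in-memory SQLite database as a stand-in for the CloudStack MySQL database. The table is a simplified assumption that keeps only the columns used in the thread's query (the url column is omitted), the sample rows are copied from the output posted above, and the helper name `system_templates_without_size` is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the CloudStack MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vm_template (id INTEGER, name TEXT, type TEXT, size INTEGER)")

# Sample rows taken from the vm_template output shared in this thread.
rows = [
    (1, "SystemVM Template (XenServer)", "SYSTEM", None),
    (3, "SystemVM Template (KVM)", "SYSTEM", None),
    (210, "systemvm-xenserver-4.17.2", "SYSTEM", 5242880000),
]
conn.executemany("INSERT INTO vm_template VALUES (?, ?, ?, ?)", rows)


def system_templates_without_size(connection):
    """Return (id, name) for SYSTEM templates whose size is NULL."""
    cursor = connection.execute(
        "SELECT id, name FROM vm_template "
        "WHERE type = 'SYSTEM' AND size IS NULL ORDER BY id"
    )
    return cursor.fetchall()


missing = system_templates_without_size(conn)
for template_id, name in missing:
    print(f"template {template_id} ({name}) has no recorded size")
```

On a real deployment the equivalent query would be run with the mysql client against the cloud database; a template that stays in this list after the SSVM is up is a candidate cause for the failure reported here.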