kubernetes-retired / frakti

The hypervisor-based container runtime for Kubernetes.
Apache License 2.0

Kubeadm init - Does not complete #131

Closed DarkBlaez closed 7 years ago

DarkBlaez commented 7 years ago

(Are the deployment instructions even up to date? Starting to see some steps left out.)

Using CentOS 7.3 with the usual repo updates.

Followed the deployment steps per https://github.com/kubernetes/frakti/blob/master/docs/deploy.md

Using 3 VMs (1 master, 2 minions), or that's the plan.

By the way, the repos are updated to the current 1.6.2, but the "note" below the yum install references an older 1.6.0-beta 4.

On the master, run `kubeadm init --pod-network-cidr 10.244.0.0/16 --kubernetes-version stable` (latest gives a 1.7 alpha).

kubeadm does all of the preflight checks and, as expected, everything passes. Note: there should be a step prior to running this that enables and starts the kubelet service; otherwise a warning pops up.
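The ordering suggested above can be sketched as follows. The `systemctl` lines are my addition based on the comment, not steps from the deploy doc:

```shell
# Hypothetical pre-init step (not in the deploy doc as written): enable and
# start kubelet first, so kubeadm's preflight check does not warn about an
# inactive kubelet service.
systemctl enable kubelet
systemctl start kubelet

# Then run init as described in the report:
kubeadm init --pod-network-cidr 10.244.0.0/16 --kubernetes-version stable
```

This is only a sketch of an ops sequence; it needs a systemd host with the kubelet package installed.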

Result: hung at `[apiclient] Created API client, waiting for the control plane to become ready`.

No docker images are pulled, and no docker containers appear to be running or created.

First time I have had issues with kubeadm, comparing a normal docker-based install to attempting a fresh Frakti one. These are all fresh VMs, so nothing legacy on them other than CentOS 7.3. Followed the deployment instructions exactly.

DB

Also curious as to why this has two Hypervisor entries (refer to Hypervisor=libvirt and Hypervisor=qemu): is this not redundant, and would the config not just read the second entry?

```
echo -e "Hypervisor=libvirt\n\ Kernel=/var/lib/hyper/kernel\n\ Initrd=/var/lib/hyper/hyper-initrd.img\n\ Hypervisor=qemu\n\ StorageDriver=overlay\n\ gRPCHost=127.0.0.1:22318" > /etc/hyper/config
```
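For what it's worth, the same file can be written with a heredoc, which avoids the `\n\ ` escape sequences and makes a duplicate `Hypervisor=` entry easy to spot. This is a sketch under two assumptions of mine: that only one `Hypervisor=` line is intended (keeping `qemu` here, though which driver the doc really means is exactly the open question), and a temp file stands in for the real `/etc/hyper/config` path:

```shell
# Sketch: write the hyperd config with a heredoc instead of echo -e.
# Assumption: only one Hypervisor= entry is intended (qemu kept here).
# A temp file stands in for /etc/hyper/config for safety.
CONFIG="$(mktemp)"
cat > "$CONFIG" <<'EOF'
Hypervisor=qemu
Kernel=/var/lib/hyper/kernel
Initrd=/var/lib/hyper/hyper-initrd.img
StorageDriver=overlay
gRPCHost=127.0.0.1:22318
EOF
grep -c '^Hypervisor=' "$CONFIG"   # prints 1: no duplicate driver entry
```

The quoted heredoc delimiter (`'EOF'`) prevents any variable expansion inside the config body.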

Another thing to note is that kubelet failed:

```
-- Unit kubelet.service has begun starting up.
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.355299 2726 feature_gate.go:144] feature gates: map[DynamicVolumeProvisioning:true TaintBasedE
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.359676 2726 server.go:232] Starting Kubelet configuration sync loop
May 01 18:14:33 kube01 kubelet[2726]: E0501 18:14:33.359702 2726 server.go:407] failed to init dynamic Kubelet configuration sync: cloud provider w
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.375962 2726 manager.go:143] cAdvisor running in container: "/"
May 01 18:14:33 kube01 kubelet[2726]: W0501 18:14:33.476000 2726 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.550699 2726 fs.go:117] Filesystem partitions: map[/dev/vda1:{mountpoint:/ major:253 minor:1 fs
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.552778 2726 manager.go:198] Machine: {NumCores:2 CpuFrequency:2399998 MemoryCapacity:397522534
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.588629 2726 manager.go:204] Version: {KernelVersion:3.10.0-514.16.1.el7.x86_64 ContainerOsVers
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.589372 2726 server.go:509] --cgroups-per-qos enabled, but --cgroup-root was not specified. de
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.591154 2726 container_manager_linux.go:245] container manager verified user specified cgroup-r
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.591188 2726 container_manager_linux.go:250] Creating Container Manager object based on Node Co
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.591400 2726 kubelet.go:255] Adding manifest file: /etc/kubernetes/manifests
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.591442 2726 kubelet.go:265] Watching apiserver
May 01 18:14:33 kube01 kubelet[2726]: E0501 18:14:33.597608 2726 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:382: Failed to list v1
May 01 18:14:33 kube01 kubelet[2726]: E0501 18:14:33.597697 2726 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to
May 01 18:14:33 kube01 kubelet[2726]: E0501 18:14:33.597777 2726 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:390: Failed to list v1
May 01 18:14:33 kube01 kubelet[2726]: W0501 18:14:33.598688 2726 kubelet_network.go:63] Hairpin mode set to "promiscuous-bridge" but container runt
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.598721 2726 kubelet.go:494] Hairpin mode set to "none"
May 01 18:14:33 kube01 kubelet[2726]: I0501 18:14:33.598922 2726 remote_runtime.go:41] Connecting to runtime service /var/run/frakti.sock
May 01 18:14:33 kube01 kubelet[2726]: 2017/05/01 18:14:33 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "t
May 01 18:14:33 kube01 kubelet[2726]: E0501 18:14:33.599459 2726 remote_runtime.go:63] Version from runtime service failed: rpc error: code = 14 de
May 01 18:14:33 kube01 kubelet[2726]: 2017/05/01 18:14:33 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "t
May 01 18:14:33 kube01 kubelet[2726]: E0501 18:14:33.600929 2726 kuberuntime_manager.go:154] Get runtime version failed: rpc error: code = 14 desc
May 01 18:14:33 kube01 kubelet[2726]: error: failed to run Kubelet: failed to create kubelet: rpc error: code = 14 desc = grpc: the connection is unav
May 01 18:14:33 kube01 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
May 01 18:14:33 kube01 systemd[1]: Unit kubelet.service entered failed state.
May 01 18:14:33 kube01 systemd[1]: kubelet.service failed.
```

DarkBlaez commented 7 years ago

After some fiddling around, the result is still the same with kubeadm init.

Kubelet status:

```
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.846252 3713 generic.go:239] PLEG: Ignoring events for pod kube-apiserver-kube01/ku...60e70b7a)
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.850218 3713 kuberuntime_manager.go:844] PodSandboxStatus of sandbox "7b2aadb9808afe2832cac1...
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.850253 3713 generic.go:269] PLEG: pod kube-controller-manager-kube01/kube-system f...980d6b39)
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.854176 3713 kuberuntime_manager.go:844] PodSandboxStatus of sandbox "4d17fa7f76bc2e64bdd35b...
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.854210 3713 generic.go:269] PLEG: pod kube-scheduler-kube01/kube-system failed rei...826ba668)
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.858562 3713 kuberuntime_manager.go:844] PodSandboxStatus of sandbox "c77e27a651c8cc77327f9b...
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.858606 3713 generic.go:269] PLEG: pod kube-apiserver-kube01/kube-system failed rei...60e70b7a)
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.862135 3713 kuberuntime_manager.go:844] PodSandboxStatus of sandbox "66a6786296fdd979e617bd...
May 01 18:45:01 kube01 kubelet[3713]: E0501 18:45:01.862169 3713 generic.go:269] PLEG: pod etcd-kube01/kube-system failed reinspection:...9dcb1232)
May 01 18:45:02 kube01 kubelet[3713]: E0501 18:45:02.481247 3713 pod_workers.go:182] Error syncing pod 8ec3ec3756a7e86993511a512dec786d ("kube-a...
```

Frakti status:

```
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.212688 3084 sandbox.go:247] GetPodInfo for 4d17fa7f76bc2e64bdd35bce3e093f7851628c72...826ba668)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.212722 3084 manager.go:221] PodSandboxStatus from hyper runtime service failed: rpc...826ba668)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.216890 3084 sandbox.go:247] GetPodInfo for c77e27a651c8cc77327f9bea9e149d407574cec2...60e70b7a)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.216921 3084 manager.go:221] PodSandboxStatus from hyper runtime service failed: rpc...60e70b7a)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.221543 3084 sandbox.go:247] GetPodInfo for 66a6786296fdd979e617bde29202c106b710f76c...9dcb1232)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.221567 3084 manager.go:221] PodSandboxStatus from hyper runtime service failed: rpc...9dcb1232)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.224730 3084 sandbox.go:247] GetPodInfo for 7b2aadb9808afe2832cac1a516eb4077a0926d45...980d6b39)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.224751 3084 manager.go:221] PodSandboxStatus from hyper runtime service failed: rpc...980d6b39)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.228400 3084 sandbox.go:247] GetPodInfo for 4d17fa7f76bc2e64bdd35bce3e093f7851628c72...826ba668)
May 01 18:47:39 kube01 frakti[3084]: E0501 18:47:39.228425 3084 manager.go:221] PodSandboxStatus from hyper runtime service failed: rpc...826ba668)
```

Hyperd status:

```
May 01 18:48:22 kube01 hyperd[739]: E0501 18:48:22.780955 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(c77e27a65...60e70b7a)
May 01 18:48:22 kube01 hyperd[739]: E0501 18:48:22.783639 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(66a678629...9dcb1232)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.790125 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(7b2aadb98...980d6b39)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.793292 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(4d17fa7f7...826ba668)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.796667 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(c77e27a65...60e70b7a)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.799219 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(66a678629...9dcb1232)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.802302 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(4d17fa7f7...826ba668)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.804962 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(c77e27a65...60e70b7a)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.807810 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(66a678629...9dcb1232)
May 01 18:48:23 kube01 hyperd[739]: E0501 18:48:23.810285 739 info.go:20] GetPodInfo error: Can not get Pod info with pod ID(7b2aadb98...980d6b39)
```

Again, this is a new VM install on CentOS 7.3 using the deployment instructions. Also noting that there are minor errors in the instructions, as noted above.

Seems like something is missing here, yet this should be fairly straightforward. I'd suggest reviewing the deployment instructions, using a fresh CentOS 7.3 VM, and stepping through them. I've done this on 3 new VMs, all the same, all with the same result. Will wait to hear what the suggested fix is.

Thanks DB

feiskyer commented 7 years ago

@DarkBlaez Thanks for reporting. kubeadm indeed doesn't complete because of a bug. The steps related to 1.6.0-beta are no longer required now that stable Kubernetes v1.6.2 has been released.

Made an update to the deploy steps (#132); could you give the new deployment steps a try?

DarkBlaez commented 7 years ago

Thanks. I have spun up a fresh CentOS 7.3 VM and will step through the deployment and repost my findings.

DarkBlaez commented 7 years ago

Here are the results of a new 3-node install based on the deployment guide update.

On the master, `kubeadm init --pod-network-cidr 10.244.0.0/16 --kubernetes-version stable` completes with no issues and provides the join token as expected; however, kubelet.service status is as follows:

```
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Tue 2017-05-02 15:28:09 UTC; 36min ago
     Docs: http://kubernetes.io/docs/
 Main PID: 9720 (kubelet)
   CGroup: /system.slice/kubelet.service
           ├─9720 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --cluster-dns=10.96.0.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/var/run/frakti.sock --feature-gates=AllAlpha=true
           └─9764 journalctl -k -f
```

```
May 02 16:03:40 kube01 kubelet[9720]: I0502 16:03:40.135783 9720 server.go:235] Checking API server for new Kubelet configuration.
May 02 16:03:40 kube01 kubelet[9720]: I0502 16:03:40.144899 9720 server.go:244] Did not find a configuration for this Kubelet via API server: cloud provider was nil, and attempt to use hostname to find config resulted in: configmaps "kubelet-kube01" not found
May 02 16:03:47 kube01 kubelet[9720]: E0502 16:03:47.212775 9720 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": ImagesFsInfo: unknown runtime: remote
May 02 16:03:57 kube01 kubelet[9720]: E0502 16:03:57.264268 9720 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": ImagesFsInfo: unknown runtime: remote
May 02 16:04:07 kube01 kubelet[9720]: E0502 16:04:07.313290 9720 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": ImagesFsInfo: unknown runtime: remote
May 02 16:04:09 kube01 kubelet[9720]: I0502 16:04:09.806074 9720 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/102dbbc5-2f4c-11e7-932a-f2af9985b02d-kube-proxy-token-69f6g" (spec.Name: "kube-proxy-token-69f6g") pod "102dbbc5-2f4c-11e7-932a-f2af9985b02d" (UID: "102dbbc5-2f4c-11e7-932a-f2af9985b02d").
May 02 16:04:09 kube01 kubelet[9720]: I0502 16:04:09.806853 9720 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/configmap/102dbbc5-2f4c-11e7-932a-f2af9985b02d-kube-proxy" (spec.Name: "kube-proxy") pod "102dbbc5-2f4c-11e7-932a-f2af9985b02d" (UID: "102dbbc5-2f4c-11e7-932a-f2af9985b02d").
May 02 16:04:10 kube01 kubelet[9720]: I0502 16:04:10.135652 9720 server.go:235] Checking API server for new Kubelet configuration.
May 02 16:04:10 kube01 kubelet[9720]: I0502 16:04:10.140137 9720 server.go:244] Did not find a configuration for this Kubelet via API server: cloud provider was nil, and attempt to use hostname to find config resulted in: configmaps "kubelet-kube01" not found
May 02 16:04:17 kube01 kubelet[9720]: E0502 16:04:17.362751 9720 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": ImagesFsInfo: unknown runtime: remote
```

On the other 2 nodes (minions), a `kubeadm join ...` (with the proper token) results in what appears to be a successful join. However, kubelet.service status yields:

```
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since Tue 2017-05-02 16:07:33 UTC; 9s ago
     Docs: http://kubernetes.io/docs/
  Process: 3738 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
 Main PID: 3738 (code=exited, status=1/FAILURE)
```

```
May 02 16:07:33 kube02 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
May 02 16:07:33 kube02 systemd[1]: Unit kubelet.service entered failed state.
May 02 16:07:33 kube02 systemd[1]: kubelet.service failed.
```

The result of `kubectl get nodes` shows only the master and no other nodes.

Keep in mind this is again a fresh install on CentOS 7.3, following the deployment guide exactly and verifying each step and each config systematically. The result is a non-functional Kubernetes cluster.

DB

DarkBlaez commented 7 years ago

On master node reboot, the kubelet service enters a failed state as well:

```
Unit kubelet.service has begun start-up
Defined-By: systemd

Unit kubelet.service has begun starting up.
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.393449 2519 feature_gate.go:144] feature gates: map[AllAlpha:true ExperimentalCriticalPodAnnot
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.397312 2519 server.go:232] Starting Kubelet configuration sync loop
May 02 16:21:33 kube01 kubelet[2519]: E0502 16:21:33.397351 2519 server.go:407] failed to init dynamic Kubelet configuration sync: cloud provider w
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.408455 2519 manager.go:143] cAdvisor running in container: "/"
May 02 16:21:33 kube01 kubelet[2519]: W0502 16:21:33.495198 2519 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.580092 2519 fs.go:117] Filesystem partitions: map[/dev/vda1:{mountpoint:/ major:253 minor:1 fs
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.586399 2519 manager.go:198] Machine: {NumCores:2 CpuFrequency:2399998 MemoryCapacity:397522534
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.623134 2519 manager.go:204] Version: {KernelVersion:3.10.0-514.16.1.el7.x86_64 ContainerOsVers
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.623925 2519 server.go:509] --cgroups-per-qos enabled, but --cgroup-root was not specified. de
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.625595 2519 container_manager_linux.go:245] container manager verified user specified cgroup-r
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.625623 2519 container_manager_linux.go:250] Creating Container Manager object based on Node Co
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.625832 2519 kubelet.go:255] Adding manifest file: /etc/kubernetes/manifests
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.625867 2519 kubelet.go:265] Watching apiserver
May 02 16:21:33 kube01 kubelet[2519]: E0502 16:21:33.629827 2519 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:382: Failed to list v1
May 02 16:21:33 kube01 kubelet[2519]: E0502 16:21:33.629907 2519 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to
May 02 16:21:33 kube01 kubelet[2519]: E0502 16:21:33.629970 2519 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:390: Failed to list v1
May 02 16:21:33 kube01 kubelet[2519]: W0502 16:21:33.631755 2519 kubelet_network.go:63] Hairpin mode set to "promiscuous-bridge" but container runt
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.631789 2519 kubelet.go:494] Hairpin mode set to "none"
May 02 16:21:33 kube01 kubelet[2519]: I0502 16:21:33.631984 2519 remote_runtime.go:41] Connecting to runtime service /var/run/frakti.sock
May 02 16:21:33 kube01 kubelet[2519]: 2017/05/02 16:21:33 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "t
May 02 16:21:33 kube01 kubelet[2519]: E0502 16:21:33.632431 2519 remote_runtime.go:63] Version from runtime service failed: rpc error: code = 14 de
May 02 16:21:33 kube01 kubelet[2519]: E0502 16:21:33.632510 2519 kuberuntime_manager.go:154] Get runtime version failed: rpc error: code = 14 desc
May 02 16:21:33 kube01 kubelet[2519]: error: failed to run Kubelet: failed to create kubelet: rpc error: code = 14 desc = grpc: the connection is unav
May 02 16:21:33 kube01 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
May 02 16:21:33 kube01 systemd[1]: Unit kubelet.service entered failed state.
May 02 16:21:33 kube01 systemd[1]: kubelet.service failed.
```

Hyperd status (again, all of this is after a VM reboot on the master):

```
● hyperd.service - hyperd
   Loaded: loaded (/usr/lib/systemd/system/hyperd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-05-02 16:18:33 UTC; 8min ago
     Docs: http://docs.hypercontainer.io
 Main PID: 727 (hyperd)
   CGroup: /system.slice/hyperd.service
           └─727 /usr/bin/hyperd --log_dir=/var/log/hyper
```

```
May 02 16:18:33 kube01 hyperd[727]: time="2017-05-02T16:18:33Z" level=info msg="Loading containers: done."
May 02 16:18:33 kube01 hyperd[727]: Qemu Driver Loaded
May 02 16:18:44 kube01 hyperd[727]: E0502 16:18:44.013844 727 qmp_handler.go:370] QMP initialize timeout
May 02 16:18:44 kube01 hyperd[727]: E0502 16:18:44.014044 727 vm_states.go:233] SB[vm-YvuyxwoIPM] QMP Init timeout
May 02 16:18:44 kube01 hyperd[727]: E0502 16:18:44.014194 727 vm_states.go:191] Shutting down because of an exception: connection to vm broken
May 02 16:18:44 kube01 hyperd[727]: E0502 16:18:44.360403 727 qmp_handler.go:164] failed to connected to /var/run/hyper/vm-YvuyxwoIPM/qmp.sock dial unix /var/run/hyper/vm-YvuyxwoIPM/qmp.sock: connect: no such file or directory
May 02 16:18:44 kube01 hyperd[727]: E0502 16:18:44.360424 727 json.go:135] Cannot connect to ctl socket unix:///var/run/hyper/vm-YvuyxwoIPM/hyper.sockdial unix /var/run/hyper/vm-YvuyxwoIPM/hyper.sock: connect: no such file or directory
May 02 16:18:44 kube01 hyperd[727]: E0502 16:18:44.360633 727 qmp_handler.go:363] QMP initialize failed
May 02 16:18:44 kube01 hyperd[727]: E0502 16:18:44.360656 727 json.go:380] Cannot connect to stream socket unix:///var/run/hyper/vm-YvuyxwoIPM/tty.sockdial unix /var/run/hyper/vm-YvuyxwoIPM/tty.sock: connect: no such file or directory
May 02 16:19:04 kube01 hyperd[727]: E0502 16:19:04.013533 727 hypervisor.go:49] SB[vm-YvuyxwoIPM] watch hyperstart timeout
```

Frakti status:

```
● frakti.service - Hypervisor-based container runtime for Kubernetes
   Loaded: loaded (/usr/lib/systemd/system/frakti.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2017-05-02 16:18:33 UTC; 9min ago
     Docs: https://github.com/kubernetes/frakti
  Process: 730 ExecStart=/usr/bin/frakti --v=3 --log-dir=/var/log/frakti --logtostderr=false --listen=/var/run/frakti.sock --streaming-server-addr=%H --hyper-endpoint=127.0.0.1:22318 (code=exited, status=1/FAILURE)
 Main PID: 730 (code=exited, status=1/FAILURE)
```

```
May 02 16:18:33 kube01 systemd[1]: Started Hypervisor-based container runtime for Kubernetes.
May 02 16:18:33 kube01 systemd[1]: Starting Hypervisor-based container runtime for Kubernetes...
May 02 16:18:33 kube01 frakti[730]: 2017/05/02 16:18:33 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 127.0.0.1:22318: getsockopt: connection refused"; Reconnecting to {"127.0.0.1:22318" }
May 02 16:18:33 kube01 frakti[730]: E0502 16:18:33.567837 730 kube_docker_client.go:97] failed to retrieve docker version: Cannot connect to the Docker daemon. Is the docker daemon running on this host?
May 02 16:18:33 kube01 frakti[730]: E0502 16:18:33.568645 730 frakti.go:93] Initialize alternative runtime failed: failed to get info from docker: Cannot connect to the Docker daemon. Is the docker daemon running on this host?
May 02 16:18:33 kube01 systemd[1]: frakti.service: main process exited, code=exited, status=1/FAILURE
May 02 16:18:33 kube01 systemd[1]: Unit frakti.service entered failed state.
May 02 16:18:33 kube01 systemd[1]: frakti.service failed.
```
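The frakti errors at boot suggest it started before Docker and hyperd were up (connection refused on 127.0.0.1:22318, and "Cannot connect to the Docker daemon"). One hypothetical workaround, not the project's official fix, would be a systemd drop-in ordering frakti after those services; the unit names `docker.service` and `hyperd.service` are my assumption here:

```shell
# Hypothetical workaround sketch (not the official fix for this issue):
# order frakti after docker and hyperd so it does not race them at boot.
# Assumes the unit names docker.service and hyperd.service.
mkdir -p /etc/systemd/system/frakti.service.d
cat > /etc/systemd/system/frakti.service.d/10-ordering.conf <<'EOF'
[Unit]
After=docker.service hyperd.service
Requires=docker.service
EOF
systemctl daemon-reload
```

This only changes start ordering; it would not help if hyperd itself fails on reboot, which is what the maintainer's follow-up points to.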

feiskyer commented 7 years ago

On the other 2 nodes (minions), a kubeadm join... (with the proper token) results in what appears to be a successful join. However, kubelet.service status yields:

@DarkBlaez Could you provide the full status of kubelet? `systemctl status -l kubelet` and `journalctl -u kubelet -l`?

On master node reboot, the kubelet service enters a failed state as well

Kubelet failed to start because of a hyperd failure: hyperd refused to start after the system rebooted. We will upload a new package to fix this.

feiskyer commented 7 years ago

@DarkBlaez By the way, I just added an all-in-one deployment script, cluster/allinone.sh.

DarkBlaez commented 7 years ago

Just a question: for the kubeadm init, would it not be best to use 'stable' versus 'latest'? The reason is that latest installs the most bleeding-edge releases, which are usually at alpha or beta stage. Or is this development only targeted at the very latest dev releases?

feiskyer commented 7 years ago

@DarkBlaez Good catch. Frakti is still at an early stage (alpha), so it supports the latest version best. We will move to beta in the Kubernetes v1.7 release.

feiskyer commented 7 years ago

@DarkBlaez Do you still have problems with this?

DarkBlaez commented 7 years ago

We can close this, as it was related to another issue.