This doesn't seem like a gluster-kubernetes problem off-hand... is there a way you can view what's using resources on the given node? If not, at least see which pods are running on that node and how many resources those are consuming?
@jarrpa
Using the command kubectl describe nodes node-3
I get the following info:
Name: node-3
Roles: node
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=node-3
node-role.kubernetes.io/node=true
storagenode=glusterfs
Annotations: flannel.alpha.coreos.com/backend-data={"VtepMAC":"ca:23:ef:fa:d7:d5"}
flannel.alpha.coreos.com/backend-type=vxlan
flannel.alpha.coreos.com/kube-subnet-manager=true
flannel.alpha.coreos.com/public-ip=192.168.1.10
node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: <none>
CreationTimestamp: Wed, 11 Apr 2018 17:27:24 +0500
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Wed, 11 Apr 2018 18:04:51 +0500 Wed, 11 Apr 2018 17:27:21 +0500 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Wed, 11 Apr 2018 18:04:51 +0500 Wed, 11 Apr 2018 17:27:21 +0500 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 11 Apr 2018 18:04:51 +0500 Wed, 11 Apr 2018 17:27:21 +0500 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Wed, 11 Apr 2018 18:04:51 +0500 Wed, 11 Apr 2018 17:29:04 +0500 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.1.10
Hostname: node-3
Capacity:
cpu: 1
memory: 3881776Ki
pods: 110
Allocatable:
cpu: 900m
memory: 3529376Ki
pods: 110
System Info:
Machine ID: 609bbd29e32a4898e604f49bff82a88c
System UUID: D96D099F-0FEC-40CF-986D-9A5FB06AB29A
Boot ID: cd2c8e64-2217-4470-b5ae-b0a9d6641b67
Kernel Version: 3.10.0-693.11.6.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://17.3.2
Kubelet Version: v1.9.5
Kube-Proxy Version: v1.9.5
PodCIDR: 10.233.65.0/24
ExternalID: node-3
Non-terminated Pods: (8 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default netchecker-agent-hostnet-s8p95 15m (1%) 30m (3%) 64M (1%) 100M (2%)
default netchecker-agent-ph72v 15m (1%) 30m (3%) 64M (1%) 100M (2%)
kube-system elasticsearch-logging-v1-776b8b856c-rnd4n 100m (11%) 1 (111%) 0 (0%) 0 (0%)
kube-system fluentd-es-v1.22-5vqb2 100m (11%) 0 (0%) 200Mi (5%) 200Mi (5%)
kube-system kube-dns-79d99cdcd5-6lxbm 260m (28%) 0 (0%) 110Mi (3%) 170Mi (4%)
kube-system kube-flannel-wk88b 150m (16%) 300m (33%) 64M (1%) 500M (13%)
kube-system kube-proxy-node-3 150m (16%) 500m (55%) 64M (1%) 2G (55%)
kube-system nginx-proxy-node-3 25m (2%) 300m (33%) 32M (0%) 512M (14%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
815m (90%) 2160m (240%) 613058560 (16%) 3599973120 (99%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 37m kubelet, node-3 Starting kubelet.
Normal NodeAllocatableEnforced 37m kubelet, node-3 Updated Node Allocatable limit across pods
Normal NodeHasSufficientDisk 37m (x8 over 37m) kubelet, node-3 Node node-3 status is now: NodeHasSufficientDisk
Normal NodeHasSufficientMemory 37m (x8 over 37m) kubelet, node-3 Node node-3 status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 37m (x7 over 37m) kubelet, node-3 Node node-3 status is now: NodeHasNoDiskPressure
Looks pretty self-explanatory, then: you have so much CPU requested on the node that there is not enough left for the CPU request of the GlusterFS pod. Your options are:
- Edit the glusterfs-daemonset.yml manifest to specify a different (or none) CPU request.
@jarrpa I want to clarify: is it important to have a minimum of 3 nodes for installing Gluster?
Yes. For testing/hacking purposes, you can run gk-deploy with the --single-node
argument to remove this restriction.
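For reference, a typical invocation then looks something like this (a sketch; run from the deploy/ directory of the gluster-kubernetes repo against your own topology file):

```sh
# -g: deploy GlusterFS pods, -v: verbose, -y: don't prompt,
# --single-node: skip the three-node minimum (testing/hacking only).
./gk-deploy -gvy --single-node topology.json
```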
@jarrpa Can you explain item "Edit the glusterfs-daemonset.yml manifest to specify a different (or none) CPU request." ?
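For anyone else hitting this, the edit amounts to changing (or deleting) the resources block of the glusterfs container in that manifest. A minimal sketch with illustrative values (check your copy of glusterfs-daemonset.yml for the exact field layout and image):

```yaml
# Excerpt of the DaemonSet's container spec; only the resources block matters here.
# A smaller (or absent) CPU request lets the pod schedule on a node whose
# allocatable CPU is already mostly requested by other pods.
containers:
- name: glusterfs
  image: gluster/gluster-centos:latest   # as referenced in the upstream manifest
  resources:
    requests:
      cpu: 50m       # illustrative; the stock manifest requests more
      memory: 100Mi
```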
@jarrpa thanks
@jarrpa After changing the configuration I got this error:
Creating node node-3 ... Unable to create node: Unable to execute command on glusterfs-8kfm2: peer probe: failed: Probe returned with Transport endpoint is not connected
Error loading the cluster topology.
But the node is available.
@vovkats
Could you check that all ports are opened properly? Refer to this guide: https://github.com/gluster/gluster-kubernetes/blob/master/docs/setup-guide.md
Also, this comment may help: https://github.com/gluster/gluster-kubernetes/issues/250#issuecomment-296355028
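For completeness, on CentOS 7 with firewalld the ports called out in that setup guide can be opened roughly like this (a sketch; double-check the port list against the guide for your version):

```sh
# 2222: glusterfs pod sshd, 24007: glusterd, 24008: management, 49152+: bricks
firewall-cmd --zone=public --permanent \
  --add-port=2222/tcp --add-port=24007/tcp \
  --add-port=24008/tcp --add-port=49152-49251/tcp
firewall-cmd --reload
```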
@SaravanaStorageNetwork ports 1-50000 are open. I have also run
sudo iptables -I INPUT -p all -j ACCEPT
but it does not help.
@vovkats
Check kubectl get nodes - verify all nodes are fine there.
Check kubectl get pods - verify that all pods, especially the gluster pods, are running fine.
If there is any issue, check kubectl describe pod <podname>.
Check whether connectivity between the nodes works fine.
Additionally, you can abort the entire setup using gk-deploy --abort and try re-running it (a consolidated sketch of these checks follows below).
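Put together, the checks look roughly like this (a sketch; the pod name is taken from this thread, and 24007 is the default glusterd management port):

```sh
kubectl get nodes -o wide                  # every node Ready?
kubectl get pods --all-namespaces -o wide  # gluster/heketi pods Running?
kubectl describe pod glusterfs-8kfm2       # events/reasons for a stuck pod

# Node-to-node connectivity on the glusterd management port (run on one node,
# pointing at another node's IP; bash-only check, no extra tools needed).
timeout 3 bash -c '</dev/tcp/192.168.1.10/24007' && echo open || echo closed

# If things are wedged, tear down and retry.
./gk-deploy --abort
```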
@SaravanaStorageNetwork @jarrpa I've updated heketi, and after running the command:
./gk-deploy -gvy
I get error:
Checking status of pods matching '--selector=deploy-heketi=pod':
deploy-heketi-7c4898d9cd-dwhhs 0/1 Error 6 5m
or
Checking status of pods matching '--selector=deploy-heketi=pod':
deploy-heketi-7c4898d9cd-dwhhs 0/1 CrashLoopBackOff 6 10m
Command kubectl logs deploy-heketi-7c4898d9cd-dwhhs
returns this result
standard_init_linux.go:178: exec user process caused "exec format error"
Also, when I try to re-run the command ./gk-deploy -gvy
I get error: Can't open /dev/vdc exclusively. Mounted filesystem?
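As a side note, "exec format error" usually means the container binary was built for a different architecture than the node (or an entrypoint script lost its shebang), so it can be worth comparing the two (a sketch; the heketi image name here is illustrative, use whatever your deploy-heketi pod spec references):

```sh
uname -m                                   # architecture of the node (e.g. x86_64)
docker pull heketi/heketi:latest           # illustrative image name
docker inspect --format '{{.Architecture}}' heketi/heketi:latest
```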
You can't run gk-deploy more than once for any given deployment. You have to do gk-deploy --abort
and wipe all the storage devices before running again. You also want to check the kubectl describe
output for the pod and see if it shows anything as well.
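Wiping the device is destructive; on the node that owns /dev/vdc it typically looks like this (a sketch; the vg_... name is illustrative, and the LVM steps only apply if heketi already created a volume group on the disk, which you can check with pvs/vgs):

```sh
# DESTRUCTIVE: clears everything on the device so gk-deploy can claim it again.
vgremove -ff vg_1234567890abcdef   # illustrative heketi-created VG name
pvremove /dev/vdc                  # drop the LVM physical volume label
wipefs --all /dev/vdc              # clear remaining filesystem/LVM signatures
```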
@vovkats do you happen to be working in a closed environment?
Given the silence of the OP, closing this issue. We can reopen this if the OP returns.
Not sure if relevant, but I managed to get this working ONLY when I forced the heketi deployment to run on the master node and used the kube-system namespace. Hope that helps.
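In case it helps anyone reproduce that workaround, pinning the heketi Deployment to the master usually comes down to a nodeSelector plus a toleration for the master taint. A minimal sketch using the common kubeadm-style label and taint (not something taken from this thread):

```yaml
# Added under the heketi Deployment's pod template (spec.template.spec),
# with the Deployment itself created in the kube-system namespace.
nodeSelector:
  node-role.kubernetes.io/master: ""
tolerations:
- key: node-role.kubernetes.io/master
  effect: NoSchedule
```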
When I run the command
./gk-deploy -g
I get this result:
I have the following info in my dashboard:
But when I log onto the node, it has enough resources.