Closed: idankish closed this issue 2 years ago.
@idankish I proposed a temporary fix. I hope it helps. See #437
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
Hi there, I deployed the "Guestbook-Go example" on a K8s cluster consisting of a master node and 2 worker nodes. In this case the "redis-master" pod was scheduled on worker node "node-2", and the 2 "redis-slave" pods were scheduled on "node-1" and "node-2" respectively. Please see below:
[root@master-node ~]# kubectl get pods --all-namespaces -o wide
NAMESPACE   NAME                 READY   STATUS    RESTARTS   AGE   IP          NODE     NOMINATED NODE   READINESS GATES
default     guestbook-8w7ld      1/1     Running   0          36h   10.36.0.4   node-2
default     guestbook-bk8zn      1/1     Running   0          36h   10.44.0.3   node-1
default     guestbook-pvh77      1/1     Running   0          36h   10.36.0.3   node-2
default     redis-master-sgbch   1/1     Running   0          36h   10.36.0.1   node-2
default     redis-slave-r7kbk    1/1     Running   0          36h   10.44.0.2   node-1
default     redis-slave-sxqfm    1/1     Running   0          36h   10.36.0.2   node-2
Everything seems to be working, but the problem that I am facing now is that the "redis slave" pods are not able to connect to the "redis master" pod.
[root@master-node ~]# kubectl logs -f redis-slave-r7kbk
[8] 04 Jan 08:40:01.708 # Unable to connect to MASTER: Connection timed out
[8] 04 Jan 08:40:02.711 Connecting to MASTER redis-master:6379
[8] 04 Jan 08:40:22.732 # Unable to connect to MASTER: Connection timed out
[8] 04 Jan 08:40:23.734 Connecting to MASTER redis-master:6379
[root@master-node ~]# kubectl logs -f redis-slave-sxqfm
[9] 04 Jan 08:40:29.719 Connecting to MASTER redis-master:6379
[9] 04 Jan 08:40:49.736 # Unable to connect to MASTER: Connection timed out
[9] 04 Jan 08:40:50.739 Connecting to MASTER redis-master:6379
[9] 04 Jan 08:41:10.759 # Unable to connect to MASTER: Connection timed out
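Note that the slaves log a connection timeout rather than a name-resolution error, so the "redis-master" service name itself appears to resolve. A quick way to double-check DNS, assuming a busybox image can be pulled in the cluster and using "dnscheck" as a throwaway pod name, would be:

[root@master-node ~]# kubectl run dnscheck --image=busybox --rm -it --restart=Never -- nslookup redis-master

If this prints the redis-master ClusterIP (10.105.56.60 in the "get svc" output further down), DNS is fine and the problem is plain network reachability.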
Meanwhile, I can see that the "redis-master" pod seems to be up and running and listening on port 6379:
[root@master-node ~]# kubectl logs -f redis-master-sgbch
(Redis ASCII-art startup banner: Redis 2.8.19 (00000000/0) 64 bit, running in stand alone mode, Port: 6379, PID: 1, http://redis.io)
[1] 02 Jan 20:02:41.291 # Server started, Redis version 2.8.19
[1] 02 Jan 20:02:41.292 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
[1] 02 Jan 20:02:41.292 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
[1] 02 Jan 20:02:41.292 * The server is now ready to accept connections on port 6379
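The banner and the last log line show the server listening on port 6379. One way to confirm it is answering locally, assuming redis-cli is available inside the image (it is in the stock Redis images), is to exec into the pod:

[root@master-node ~]# kubectl exec redis-master-sgbch -- redis-cli -p 6379 ping

A healthy server should reply with PONG.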
I followed the instructions exactly and used the provided configuration files to deploy the services, yet I can't figure out what is wrong. Here is the output of "kubectl get svc":
[root@master-node ~]# kubectl get svc
NAME           TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)          AGE
guestbook      LoadBalancer   10.110.94.208   194.233.163.249   3000:30508/TCP   36h
kubernetes     ClusterIP      10.96.0.1       <none>            443/TCP          2d11h
redis-master   ClusterIP      10.105.56.60    <none>            6379/TCP         36h
redis-slave    ClusterIP      10.105.1.8      <none>            6379/TCP         36h
[root@master-node ~]# kubectl describe svc redis-master
Name: redis-master
Namespace: default
Labels: app=redis
        role=master
Annotations: <none>
Selector: app=redis,role=master
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.105.56.60
IPs: 10.105.56.60
Port: 6379/TCP
TargetPort: redis-server/TCP
Endpoints: 10.36.0.1:6379
Session Affinity: None
Events:
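The Endpoints line above already shows 10.36.0.1:6379, so the service selector matches the master pod and the service has a backend. A quick cross-check is:

[root@master-node ~]# kubectl get endpoints redis-master

which should list the same 10.36.0.1:6379 address.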
While trying to troubleshoot, I also tried to telnet to the "redis-master" pod's internal IP and port (10.36.0.1:6379) from both worker nodes. The port is reachable from the worker node where the "redis-master" pod is running:
[root@node-2 ~]# telnet 10.36.0.1 6379
Trying 10.36.0.1...
Connected to 10.36.0.1.
Escape character is '^]'.
But not reachable from the second worker node:
[root@node-1 ~]# telnet 10.36.0.1 6379
Trying 10.36.0.1...
telnet: connect to address 10.36.0.1: Connection timed out
^]
telnet> q
Connection closed.
Even pinging the "redis-master" pod's internal IP address from worker node-1 doesn't work:
[root@node-1 ~]# ping 10.36.0.1
PING 10.36.0.1 (10.36.0.1) 56(84) bytes of data.
^C
--- 10.36.0.1 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3051ms
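Since the pod IP is reachable from the node that hosts the pod but not from the other worker node, this pattern usually points at the pod network (CNI) rather than at Redis or the Service. As a rough sketch of what to check, assuming an overlay CNI such as Weave Net or flannel is in use (the 10.36.x.x/10.44.x.x pod addresses fall inside Weave Net's default 10.32.0.0/12 range), one could verify the network pods are healthy on both workers; the pod name below is a placeholder:

[root@master-node ~]# kubectl get pods -n kube-system -o wide
[root@master-node ~]# kubectl logs -n kube-system <network-pod-on-node-1>

The network pod on each worker should be Running, and its logs should show the peers connected rather than connection or handshake errors.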
I also disabled the internal CentOS firewall on both worker nodes, but still with no success.
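Even with the host firewall disabled, node-to-node traffic can still be blocked by a provider or external firewall, and overlay networks need their tunnel ports open between the nodes: for example, Weave Net uses TCP 6783 plus UDP 6783/6784, and flannel's VXLAN backend uses UDP 8472. A rough check of the TCP control port from node-1, assuming Weave Net and substituting node-2's IP if the hostname does not resolve, would be:

[root@node-1 ~]# telnet node-2 6783

If this also times out, the nodes themselves cannot reach each other on the overlay ports and the pod network cannot forward cross-node traffic.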
Any idea what could be wrong, or what I am missing?
Thanks, Idan