Closed neo3matrix closed 8 months ago
@neo3matrix
To narrow down the issue, can you ping the pod ips from the boxes? For example, run nginx pod in any namespace and then try to ping this ip from any worker node. Does it work? This will help to eliminate any routing/firewall/calico issues.
Are your nodes on VMware infra?
@sohnaeo Thank you for your quick reply.
No, my nodes are physical servers, not on VMware.
try to ping this ip from any worker node.
Yes, ping to the nginx pod's IP works fine from every worker node.
Can anyone please help?
Sorry for the late reply. Can you try to issue the command below against your nginx ingress node port? Are you running nginx ingress on a node port?
curl 127.0.0.1:33000 --header 'Host: youringressslink.com'
33000 is the ingress node port at my end, change it as per your env; change the Host header as per your domain.
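(For reference, the ingress node port can be looked up with kubectl before running that curl. The namespace and service name below are taken from the service shown later in this thread and are otherwise an assumption; adjust them to your install.)
# list the ingress controller service and its node ports
kubectl get svc nginx-stable-nginx-ingress -n nginx-stable -o wide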
@sohnaeo Hi, No, I am not running nginx ingress on a node port. I am running it behind a load balancer: the nginx ingress is a LoadBalancer service, and the MetalLB load balancer gives it an external IP address from my (unused IP) subnet pool.
In this case, I will take one step back. Let's assume you have an nginx pod installed. Can you get its IP and run curl against it from any node? You should get the nginx welcome page as the HTTP response. I would also create a Service and try to access the nginx welcome page through it, to make sure kube-proxy works; the Service forwards the request to the pod, so it should be accessible. If both the pod IP and the Service work, then Calico networking is definitely fine.
If both the pod IP and the Service work, we can then look further into the ingress service.
@sohnaeo Hi,
- Got my nginx pod IP and tried curl from all 3 nodes to it - I get the welcome message from nginx (404 not found, which is the default message). So, the curl command works from all 3 nodes against the nginx pod IP.
- I didn't get the service part you asked for. Could you please give me a rough example of what you are expecting when you said "I would also create service and try to access nginx welcome page"?
The commands below will do the job:
kubectl create deployment nginx --image=nginx --port=80
kubectl expose deployment nginx
kubectl get svc
curl http://svc_ip
https://kubernetes.io/docs/tutorials/kubernetes-basics/expose/expose-intro/
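(A compact way to chain those steps, assuming the deployment and Service are both named nginx as in the commands above:)
# fetch the ClusterIP of the nginx Service and curl it; this should return
# the "Welcome to nginx!" HTML page if kube-proxy and Calico are working
SVC_IP=$(kubectl get svc nginx -o jsonpath='{.spec.clusterIP}')
curl -s http://$SVC_IP | head -n 5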
@sohnaeo Thank you for giving that pointer.
Yes, I confirm that curl http://<pod_IP> as well as curl http://svc_ip - both these curl commands work successfully on all 3 K8s nodes.
So, since the Calico network works fine, here's my nginx ingress controller service to debug:
$ kubectl describe svc nginx-stable-nginx-ingress -n nginx-stable
Name:                     nginx-stable-nginx-ingress
Namespace:                nginx-stable
Labels:                   app.kubernetes.io/instance=nginx-stable
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=nginx-stable-nginx-ingress
                          helm.sh/chart=nginx-ingress-0.16.1
Annotations:              meta.helm.sh/release-name: nginx-stable
                          meta.helm.sh/release-namespace: nginx-stable
                          metallb.universe.tf/ip-allocated-from-pool: first-pool
Selector:                 app=nginx-stable-nginx-ingress
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.233.54.186
IPs:                      10.233.54.186
LoadBalancer Ingress:     My-external-IP-from-subnet
Port:                     http  80/TCP
TargetPort:               80/TCP
NodePort:                 http  32438/TCP
Endpoints:                10.233.95.243:80
Port:                     https  443/TCP
TargetPort:               443/TCP
NodePort:                 https  30686/TCP
Endpoints:                10.233.95.243:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     31540
Events:
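(One detail visible in the output above: External Traffic Policy is Local, which with MetalLB in L2 mode means only nodes actually running an ingress controller pod answer for the external IP. A hedged sketch for inspecting the policy, or temporarily switching it to see whether the timeouts change, assuming standard kubectl:)
# print the current external traffic policy of the ingress service
kubectl get svc nginx-stable-nginx-ingress -n nginx-stable \
  -o jsonpath='{.spec.externalTrafficPolicy}{"\n"}'
# temporarily switch to Cluster so every node forwards via kube-proxy
# (at the cost of losing the client source IP)
kubectl patch svc nginx-stable-nginx-ingress -n nginx-stable \
  -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'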
Are there any networking policies in place?
kubectl get netpols -A
If not, then I have doubts about your MetalLB setup.
I don't see any network policy here:
$ kubectl get netpols -A
error: the server doesn't have a resource type "netpols"
kubectl get netpols -A
Sorry, typo:
kubectl get netpol -A
$ kubectl get netpol -A
No resources found
Looks like there isn't any.
What error do you get when you browse to the ingress link? 404 or timeout? Please check your MetalLB setup; it seems like a networking issue between nginx ingress and MetalLB.
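(A quick sanity check of the MetalLB pods themselves, as a hedged sketch: replace <metallb-namespace> with whatever {{ k8s_metallb_release_name }} expands to in the install shown later in this thread, and verify the component label with --show-labels if your chart version labels pods differently.)
# list the MetalLB controller and speaker pods
kubectl get pods -n <metallb-namespace> -o wide
# check the speaker logs for announcement / ARP errors
kubectl logs -n <metallb-namespace> -l app.kubernetes.io/component=speaker --tail=50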
It's a little strange. From some servers I can get a 404 error, but from some servers (or even from my laptop) I am getting a timeout.
Does kubespray by default set up any network policy during installation? Let me paste my MetalLB config here in the next comment.
No, kubespray doesn't deploy any network policy by default. How do you connect from your laptop to MetalLB? Check that network segment.
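(One hedged way to check that segment from a failing client: ask the kernel which interface and gateway it would use to reach the MetalLB external IP; <EXTERNAL_IP> is a placeholder.)
# shows the route (interface, gateway, source address) the client would use
ip route get <EXTERNAL_IP>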
$ cat roles/metallb-loadbalancer/templates/l2advertisement.yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: {{ k8s_metallb_release_name }}

$ cat roles/metallb-loadbalancer/templates/ipaddresspool.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: {{ k8s_metallb_release_name }}
spec:
  addresses: {{ ip_range_array }}

/usr/local/bin/helm install "{{ k8s_metallb_release_name }}" --create-namespace --namespace="{{ k8s_metallb_release_name }}" metallb/metallb --wait
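(An L2Advertisement with an empty spec, as above, is valid and advertises all pools. For comparison, a sketch that binds the advertisement explicitly to the pool would look like the following; the namespace value is an assumption matching the templated value above.)
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: {{ k8s_metallb_release_name }}
spec:
  ipAddressPools:
    - first-pool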
Are you using CoreDNS? If you are getting a timeout then it is not a DNS issue, I believe. You can test DNS by running busybox and using the nslookup utility to resolve the A record.
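(A hedged one-shot version of that busybox test, run from inside the cluster; the hostname is the placeholder from this thread and the pod is removed afterwards.)
# resolve the A record from a throwaway busybox pod
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never -- nslookup my-nginx1.mycompany.com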
@sohnaeo Yes, the coredns pods are running in kube-system. I am not explicitly doing anything there; it came as part of the default installation. The A record resolves fine when I ping or do nslookup - from within or outside of the cluster. It's only HTTP requests that are timing out.
What about the MetalLB pool IPs? Are these routable IPs on all machines? Was it working before? Did it break after any event, e.g. an upgrade?
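(For an L2-mode pool, a hedged way to check routability from a machine that times out is to see whether the external IP resolves to a MAC address on the local segment; <EXTERNAL_IP> and the interface name are placeholders.)
# does the client already have an ARP entry for the MetalLB IP?
ip neigh show | grep <EXTERNAL_IP>
# actively ARP for it on the local segment (iputils arping)
arping -I eth0 -c 3 <EXTERNAL_IP>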
Let me check that area. Also, let me try another version of the MetalLB helm chart. I will update this post soon.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
Can you still reproduce this?
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Issue: My HTTP requests to the nginx ingress controller don't reach the nginx ingress controller running on the k8s cluster. After running traceroute and other commands, I suspect something within k8s DNS(?) or networking is preventing my HTTP requests from reaching the nginx ingress controller.
General setup: I have 114 servers in my data center. I have 2 different k8s clusters set up via kubespray, on 3 servers each (3 servers for cluster1 & 3 servers for cluster2).
On each k8s cluster:
I have observed that not all other servers from my DC can send HTTP requests to my nginx ingress in either cluster. All these servers can ping the cluster and resolve the my-nginx1.mycompany.com DNS name just fine. No firewall or networking issues (I confirm). Also, a few servers can send HTTP requests to cluster1, but only a couple of servers can send HTTP requests to cluster2, even though the setup is EXACTLY the same - except for the external IP & DNS. With traceroute and other commands, I suspect something in Kubernetes DNS(?) is causing the problem. There are no logs in the nginx ingress controller pod - as if the request never reached it. Can someone please help me?
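(A hedged way to compare a working client against a failing one while taking DNS out of the picture entirely; the hostname is from this report and <EXTERNAL_IP> is a placeholder for the MetalLB-assigned address.)
# pin the hostname to the external IP so only the HTTP path is tested
curl -v --connect-timeout 5 \
  --resolve my-nginx1.mycompany.com:80:<EXTERNAL_IP> \
  http://my-nginx1.mycompany.com/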
Environment:
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.6", GitCommit:"b39bf148cd654599a52e867485c02c4f9d28b312", GitTreeState:"clean", BuildDate:"2022-09-21T13:12:04Z", GoVersion:"go1.18.6", Compiler:"gc", Platform:"linux/amd64"}
Linux 3.10.0-1160.31.1.el7.x86_64 x86_64
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"