kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster
Apache License 2.0

not able to communicate to pods from node-1 to the pods on node-2 #9601

Open rufy2022 opened 1 year ago

rufy2022 commented 1 year ago

Environment:

Kubespray version (commit) (git rev-parse --short HEAD): 491e260d

Network plugin used: calico

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible: `ansible-playbook -i inventory/mycluster/inventory.ini --become --user=root --become-user=root cluster.yml`

Output of ansible run:

Anything else do we need to know:

```ini
[all]
master0 ansible_host=192.168.50.120 ip=192.168.50.120
node1 ansible_host=192.168.50.121 ip=192.168.50.121
node2 ansible_host=192.168.50.122 ip=192.168.50.122

[kube_control_plane]
master0

[etcd]
master0

[kube_node]
node1
node2

[calico_rr]

[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr
```

Hello,

I have noticed that on a freshly installed Kubernetes cluster I am not able to communicate from pods on node-1 to pods on node-2 (or any other node). Because of this, DNS resolution only works some of the time. I simply copied the sample inventory, adjusted my IPs, and disabled nodelocaldns; it did not work with nodelocaldns enabled either. It looks like the routing is not set up properly. Is this a known issue, or am I missing something?

rufy2022 commented 1 year ago

OK, it looks like there is a bug in the network playbook! I switched from calico to cni and then manually installed the latest Calico v3.24.5. Now the routing table on all nodes looks as expected and everything works fine.

Please fix the network bug in the playbook; as it stands it is unusable, at least on the latest Debian 11.

oomichi commented 1 year ago

> OK, it looks like there is a bug in the network playbook! I switched from calico to cni and then manually installed the latest Calico v3.24.5. Now the routing table on all nodes looks as expected and everything works fine.
>
> Please fix the network bug in the playbook; as it stands it is unusable, at least on the latest Debian 11.

Thank you for submitting this issue. Could you provide more information? The latest Kubespray installs Calico v3.24.5, which is the same version you installed manually, so based on the current information I am not sure why installing it yourself resolved the problem.

rufy2022 commented 1 year ago

@oomichi I have just pulled the latest changes from the git repo and installed a fresh Kubernetes cluster on 3 VMs: 1 master and 2 worker nodes. The routing table is still wrong; a simple ping to google.com or to the kubernetes service does not work at all, and pod-to-pod communication across nodes still fails. So the Calico Ansible tasks are doing something wrong; as I said before, when I select cni and install Calico manually afterwards, everything works perfectly.

Here is the output on the freshly installed master:

```
root@master0: route -n
Kernel IP routing table
Destination     Gateway         Genmask          Flags Metric Ref    Use Iface
0.0.0.0         192.168.50.1    0.0.0.0          UG    0      0        0 ens192
10.233.75.0     10.233.75.0     255.255.255.192  UG    0      0        0 vxlan.calico
10.233.102.128  10.233.102.128  255.255.255.192  UG    0      0        0 vxlan.calico
10.233.105.64   0.0.0.0         255.255.255.192  U     0      0        0 *
10.233.105.65   0.0.0.0         255.255.255.255  UH    0      0        0 cali2208d19a336
10.233.105.66   0.0.0.0         255.255.255.255  UH    0      0        0 calif9130857f3d
192.168.50.0    0.0.0.0         255.255.255.0    U     0      0        0 ens192
```

```
root@master0:~# kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
dnstools# ping google.com
^C
dnstools# ping google.com
^C
dnstools# ping google.com
^C
dnstools# ping kubernetes
^C
dnstools#
```

Routing table node-1:

```
root@node1: route -n
Kernel IP routing table
Destination     Gateway         Genmask          Flags Metric Ref    Use Iface
0.0.0.0         192.168.50.1    0.0.0.0          UG    0      0        0 ens192
10.233.75.0     10.233.75.0     255.255.255.192  UG    0      0        0 vxlan.calico
10.233.102.128  0.0.0.0         255.255.255.192  U     0      0        0 *
10.233.102.129  0.0.0.0         255.255.255.255  UH    0      0        0 cali965a9efe442
10.233.102.130  0.0.0.0         255.255.255.255  UH    0      0        0 cali259c5d799e6
10.233.105.64   10.233.105.64   255.255.255.192  UG    0      0        0 vxlan.calico
192.168.50.0    0.0.0.0         255.255.255.0    U     0      0        0 ens192
```

Here is the routing table from the cluster where I installed Calico manually:

```
root@kubernetes-dev-master01:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask          Flags Metric Ref    Use Iface
0.0.0.0         10.80.81.1      0.0.0.0          UG    0      0        0 ens192
10.80.81.0      0.0.0.0         255.255.255.0    U     0      0        0 ens192
10.233.100.192  0.0.0.0         255.255.255.192  U     0      0        0 *
10.233.100.193  0.0.0.0         255.255.255.255  UH    0      0        0 cali36f9a373011
10.233.104.64   10.80.81.52     255.255.255.192  UG    0      0        0 tunl0
10.233.105.128  10.80.81.51     255.255.255.192  UG    0      0        0 tunl0
```

node-1:

```
root@kubernetes-dev-node01: route -n
Kernel IP routing table
Destination     Gateway         Genmask          Flags Metric Ref    Use Iface
0.0.0.0         10.80.81.1      0.0.0.0          UG    0      0        0 ens192
10.80.81.0      0.0.0.0         255.255.255.0    U     0      0        0 ens192
10.233.100.192  10.80.81.50     255.255.255.192  UG    0      0        0 tunl0
10.233.104.64   10.80.81.52     255.255.255.192  UG    0      0        0 tunl0
10.233.105.128  0.0.0.0         255.255.255.192  U     0      0        0 *
10.233.105.129  0.0.0.0         255.255.255.255  UH    0      0        0 caliad3d97130af
10.233.105.130  0.0.0.0         255.255.255.255  UH    0      0        0 cali952936de682
10.233.105.132  0.0.0.0         255.255.255.255  UH    0      0        0 cali46926efb26c
10.233.105.133  0.0.0.0         255.255.255.255  UH    0      0        0 cali82dc20a0498
```

Do you see the difference?

On the nodes deployed with the Ansible-managed Calico, the routing table is wrong:

- master, e.g. `10.233.75.0 10.233.75.0 255.255.255.192 UG 0 0 0 vxlan.calico`
- node-1, e.g. `10.233.105.64 10.233.105.64 255.255.255.192 UG 0 0 0 vxlan.calico`

On the working nodes, without the Ansible-managed Calico:

- master, e.g. `10.233.104.64 10.80.81.52 255.255.255.192 UG 0 0 0 tunl0`
- node-1, e.g. `10.233.100.192 10.80.81.50 255.255.255.192 UG 0 0 0 tunl0`

You can see the problem now, right? The Ansible playbook is doing something incorrectly.
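
For anyone comparing the two setups, the difference should also show up on the Calico IPPool object itself. This is only a sketch of which fields to look at, assuming the pool is named `default-pool` as elsewhere in this thread:

```yaml
# kubectl get ippool default-pool -o yaml   (relevant spec fields only)
spec:
  vxlanMode: Always   # matches the vxlan.calico routes on the broken cluster
  ipipMode: Never     # the manually installed cluster has these two inverted,
                      # which is why its routes go via tunl0 (IP-in-IP)
```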

rufy2022 commented 1 year ago

@kerryeon I see you have also worked on the Calico Ansible tasks. Can you check the issue above?

HoKim98 commented 1 year ago

Hello, could you attach your full inventory with variables? I can see your problem, but there is not enough information to go on.

rufy2022 commented 1 year ago

@kerryeon attached, please check. I wounder that no one else noticed this bug. master0 | SUCCESS => { "hostvars[inventory_hostname]": { "ansible_check_mode": false, "ansible_config_file": "/root/kubespray/ansible.cfg", "ansible_diff_mode": false, "ansible_facts": {}, "ansible_forks": 5, "ansible_host": "192.168.50.120", "ansible_inventory_sources": [ "/root/kubespray/inventory/mycluster/inventory.ini" ], "ansible_playbook_python": "/usr/bin/python3", "ansible_verbosity": 0, "ansible_version": { "full": "2.12.5", "major": 2, "minor": 12, "revision": 5, "string": "2.12.5" }, "argocd_enabled": false, "auto_renew_certificates": false, "bin_dir": "/usr/local/bin", "calico_cni_name": "k8s-pod-network", "calico_pool_blocksize": 26, "cephfs_provisioner_enabled": false, "cert_manager_enabled": false, "cluster_name": "cluster.local", "container_manager": "containerd", "coredns_k8s_external_zone": "k8s_external.local", "credentials_dir": "/root/kubespray/inventory/mycluster/credentials", "default_kubelet_config_dir": "/etc/kubernetes/dynamic_kubelet_dir", "deploy_netchecker": false, "dns_domain": "cluster.local", "dns_mode": "coredns", "docker_bin_dir": "/usr/bin", "docker_container_storage_setup": false, "docker_daemon_graph": "/var/lib/docker", "docker_dns_servers_strict": false, "docker_iptables_enabled": "false", "docker_log_opts": "--log-opt max-size=50m --log-opt max-file=5", "docker_rpm_keepcache": 1, "enable_coredns_k8s_endpoint_pod_names": false, "enable_coredns_k8s_external": false, "enable_dual_stack_networks": false, "enable_nat_default_gateway": true, "enable_nodelocaldns": false, "enable_nodelocaldns_secondary": false, "etcd_data_dir": "/var/lib/etcd", "etcd_deployment_type": "host", "event_ttl_duration": "1h0m0s", "group_names": [ "etcd", "k8s_cluster", "kube_control_plane" ], "groups": { "all": [ "master0", "node1", "node2" ], "calico_rr": [], "etcd": [ "master0" ], "k8s_cluster": [ "master0", "node1", "node2" ], "kube_control_plane": [ "master0" ], "kube_node": [ "node1", "node2" ], "ungrouped": [] }, "helm_enabled": false, "ingress_alb_enabled": false, "ingress_nginx_enabled": false, "ingress_publish_status_address": "", "inventory_dir": "/root/kubespray/inventory/mycluster", "inventory_file": "/root/kubespray/inventory/mycluster/inventory.ini", "inventory_hostname": "master0", "inventory_hostname_short": "master0", "ip": "192.168.50.120", "k8s_image_pull_policy": "IfNotPresent", "kata_containers_enabled": false, "krew_enabled": false, "krew_root_dir": "/usr/local/krew", "kube_api_anonymous_auth": true, "kube_apiserver_ip": "10.233.0.1", "kube_apiserver_port": 6443, "kube_cert_dir": "/etc/kubernetes/ssl", "kube_cert_group": "kube-cert", "kube_config_dir": "/etc/kubernetes", "kube_encrypt_secret_data": false, "kube_log_level": 2, "kube_manifest_dir": "/etc/kubernetes/manifests", "kube_network_node_prefix": 24, "kube_network_node_prefix_ipv6": 120, "kube_network_plugin": "calico", "kube_network_plugin_multus": false, "kube_ovn_default_gateway_check": true, "kube_ovn_default_logical_gateway": false, "kube_ovn_default_vlan_id": 100, "kube_ovn_dpdk_enabled": false, "kube_ovn_enable_external_vpc": true, "kube_ovn_enable_lb": true, "kube_ovn_enable_np": true, "kube_ovn_enable_ssl": false, "kube_ovn_encap_checksum": true, "kube_ovn_external_address": "8.8.8.8", "kube_ovn_external_address_ipv6": "2400:3200::1", "kube_ovn_external_dns": "alauda.cn", "kube_ovn_hw_offload": false, "kube_ovn_network_type": "geneve", "kube_ovn_node_switch_cidr": "100.64.0.0/16", "kube_ovn_node_switch_cidr_ipv6": 
"fd00:100:64::/64", "kube_ovn_pod_nic_type": "veth_pair", "kube_ovn_traffic_mirror": false, "kube_ovn_tunnel_type": "geneve", "kube_ovn_vlan_name": "product", "kube_owner": "kube", "kube_pods_subnet": "10.233.64.0/18", "kube_pods_subnet_ipv6": "fd85:ee78:d8a6:8607::1:0000/112", "kube_proxy_mode": "iptables", "kube_proxy_nodeport_addresses": [], "kube_proxy_strict_arp": true, "kube_script_dir": "/usr/local/bin/kubernetes-scripts", "kube_service_addresses": "10.233.0.0/18", "kube_service_addresses_ipv6": "fd85:ee78:d8a6:8607::1000/116", "kube_token_dir": "/etc/kubernetes/tokens", "kube_version": "v1.25.5", "kube_webhook_token_auth": false, "kube_webhook_token_auth_url_skip_tls_verify": false, "kubeadm_certificate_key": "aafcdd1748c9accc1aaee3b4cf0aebdb4e0f052760f5d017ec581df2b9635c7d", "kubeadm_patches": { "dest_dir": "/etc/kubernetes/patches", "enabled": false, "source_dir": "/root/kubespray/inventory/mycluster/patches" }, "kubernetes_audit": false, "loadbalancer_apiserver_healthcheck_port": 8081, "loadbalancer_apiserver_port": 6443, "local_path_provisioner_enabled": false, "local_release_dir": "/tmp/releases", "local_volume_provisioner_enabled": false, "macvlan_interface": "eth1", "metallb_enabled": false, "metallb_speaker_enabled": false, "metrics_server_enabled": true, "ndots": 2, "no_proxy_exclude_workers": false, "nodelocaldns_bind_metrics_host_ip": false, "nodelocaldns_health_port": 9254, "nodelocaldns_ip": "169.254.25.10", "nodelocaldns_second_health_port": 9256, "nodelocaldns_secondary_skew_seconds": 5, "ntp_enabled": false, "ntp_manage_config": false, "ntp_servers": [ "0.pool.ntp.org iburst", "1.pool.ntp.org iburst", "2.pool.ntp.org iburst", "3.pool.ntp.org iburst" ], "omit": "omit_place_holdered2aedc52f45dc99038ab78567807d04d1b029c6", "persistent_volumes_enabled": false, "playbook_dir": "/root/kubespray", "podsecuritypolicy_enabled": false, "rbd_provisioner_enabled": false, "registry_enabled": false, "resolvconf_mode": "host_resolvconf", "retry_stagger": 5, "skydns_server": "10.233.0.3", "skydns_server_secondary": "10.233.0.4", "unsafe_show_logs": false, "volume_cross_zone_attachment": false } } node1 | SUCCESS => { "hostvars[inventory_hostname]": { "ansible_check_mode": false, "ansible_config_file": "/root/kubespray/ansible.cfg", "ansible_diff_mode": false, "ansible_facts": {}, "ansible_forks": 5, "ansible_host": "192.168.50.121", "ansible_inventory_sources": [ "/root/kubespray/inventory/mycluster/inventory.ini" ], "ansible_playbook_python": "/usr/bin/python3", "ansible_verbosity": 0, "ansible_version": { "full": "2.12.5", "major": 2, "minor": 12, "revision": 5, "string": "2.12.5" }, "argocd_enabled": false, "auto_renew_certificates": false, "bin_dir": "/usr/local/bin", "calico_cni_name": "k8s-pod-network", "calico_pool_blocksize": 26, "cephfs_provisioner_enabled": false, "cert_manager_enabled": false, "cluster_name": "cluster.local", "container_manager": "containerd", "coredns_k8s_external_zone": "k8s_external.local", "credentials_dir": "/root/kubespray/inventory/mycluster/credentials", "default_kubelet_config_dir": "/etc/kubernetes/dynamic_kubelet_dir", "deploy_netchecker": false, "dns_domain": "cluster.local", "dns_mode": "coredns", "docker_bin_dir": "/usr/bin", "docker_container_storage_setup": false, "docker_daemon_graph": "/var/lib/docker", "docker_dns_servers_strict": false, "docker_iptables_enabled": "false", "docker_log_opts": "--log-opt max-size=50m --log-opt max-file=5", "docker_rpm_keepcache": 1, "enable_coredns_k8s_endpoint_pod_names": false, 
"enable_coredns_k8s_external": false, "enable_dual_stack_networks": false, "enable_nat_default_gateway": true, "enable_nodelocaldns": false, "enable_nodelocaldns_secondary": false, "etcd_data_dir": "/var/lib/etcd", "etcd_deployment_type": "host", "event_ttl_duration": "1h0m0s", "group_names": [ "k8s_cluster", "kube_node" ], "groups": { "all": [ "master0", "node1", "node2" ], "calico_rr": [], "etcd": [ "master0" ], "k8s_cluster": [ "master0", "node1", "node2" ], "kube_control_plane": [ "master0" ], "kube_node": [ "node1", "node2" ], "ungrouped": [] }, "helm_enabled": false, "ingress_alb_enabled": false, "ingress_nginx_enabled": false, "ingress_publish_status_address": "", "inventory_dir": "/root/kubespray/inventory/mycluster", "inventory_file": "/root/kubespray/inventory/mycluster/inventory.ini", "inventory_hostname": "node1", "inventory_hostname_short": "node1", "ip": "192.168.50.121", "k8s_image_pull_policy": "IfNotPresent", "kata_containers_enabled": false, "krew_enabled": false, "krew_root_dir": "/usr/local/krew", "kube_api_anonymous_auth": true, "kube_apiserver_ip": "10.233.0.1", "kube_apiserver_port": 6443, "kube_cert_dir": "/etc/kubernetes/ssl", "kube_cert_group": "kube-cert", "kube_config_dir": "/etc/kubernetes", "kube_encrypt_secret_data": false, "kube_log_level": 2, "kube_manifest_dir": "/etc/kubernetes/manifests", "kube_network_node_prefix": 24, "kube_network_node_prefix_ipv6": 120, "kube_network_plugin": "calico", "kube_network_plugin_multus": false, "kube_ovn_default_gateway_check": true, "kube_ovn_default_logical_gateway": false, "kube_ovn_default_vlan_id": 100, "kube_ovn_dpdk_enabled": false, "kube_ovn_enable_external_vpc": true, "kube_ovn_enable_lb": true, "kube_ovn_enable_np": true, "kube_ovn_enable_ssl": false, "kube_ovn_encap_checksum": true, "kube_ovn_external_address": "8.8.8.8", "kube_ovn_external_address_ipv6": "2400:3200::1", "kube_ovn_external_dns": "alauda.cn", "kube_ovn_hw_offload": false, "kube_ovn_network_type": "geneve", "kube_ovn_node_switch_cidr": "100.64.0.0/16", "kube_ovn_node_switch_cidr_ipv6": "fd00:100:64::/64", "kube_ovn_pod_nic_type": "veth_pair", "kube_ovn_traffic_mirror": false, "kube_ovn_tunnel_type": "geneve", "kube_ovn_vlan_name": "product", "kube_owner": "kube", "kube_pods_subnet": "10.233.64.0/18", "kube_pods_subnet_ipv6": "fd85:ee78:d8a6:8607::1:0000/112", "kube_proxy_mode": "iptables", "kube_proxy_nodeport_addresses": [], "kube_proxy_strict_arp": true, "kube_script_dir": "/usr/local/bin/kubernetes-scripts", "kube_service_addresses": "10.233.0.0/18", "kube_service_addresses_ipv6": "fd85:ee78:d8a6:8607::1000/116", "kube_token_dir": "/etc/kubernetes/tokens", "kube_version": "v1.25.5", "kube_webhook_token_auth": false, "kube_webhook_token_auth_url_skip_tls_verify": false, "kubeadm_certificate_key": "aafcdd1748c9accc1aaee3b4cf0aebdb4e0f052760f5d017ec581df2b9635c7d", "kubeadm_patches": { "dest_dir": "/etc/kubernetes/patches", "enabled": false, "source_dir": "/root/kubespray/inventory/mycluster/patches" }, "kubernetes_audit": false, "loadbalancer_apiserver_healthcheck_port": 8081, "loadbalancer_apiserver_port": 6443, "local_path_provisioner_enabled": false, "local_release_dir": "/tmp/releases", "local_volume_provisioner_enabled": false, "macvlan_interface": "eth1", "metallb_enabled": false, "metallb_speaker_enabled": false, "metrics_server_enabled": true, "ndots": 2, "no_proxy_exclude_workers": false, "nodelocaldns_bind_metrics_host_ip": false, "nodelocaldns_health_port": 9254, "nodelocaldns_ip": "169.254.25.10", "nodelocaldns_second_health_port": 
9256, "nodelocaldns_secondary_skew_seconds": 5, "ntp_enabled": false, "ntp_manage_config": false, "ntp_servers": [ "0.pool.ntp.org iburst", "1.pool.ntp.org iburst", "2.pool.ntp.org iburst", "3.pool.ntp.org iburst" ], "omit": "omit_place_holdered2aedc52f45dc99038ab78567807d04d1b029c6", "persistent_volumes_enabled": false, "playbook_dir": "/root/kubespray", "podsecuritypolicy_enabled": false, "rbd_provisioner_enabled": false, "registry_enabled": false, "resolvconf_mode": "host_resolvconf", "retry_stagger": 5, "skydns_server": "10.233.0.3", "skydns_server_secondary": "10.233.0.4", "unsafe_show_logs": false, "volume_cross_zone_attachment": false } } node2 | SUCCESS => { "hostvars[inventory_hostname]": { "ansible_check_mode": false, "ansible_config_file": "/root/kubespray/ansible.cfg", "ansible_diff_mode": false, "ansible_facts": {}, "ansible_forks": 5, "ansible_host": "192.168.50.122", "ansible_inventory_sources": [ "/root/kubespray/inventory/mycluster/inventory.ini" ], "ansible_playbook_python": "/usr/bin/python3", "ansible_verbosity": 0, "ansible_version": { "full": "2.12.5", "major": 2, "minor": 12, "revision": 5, "string": "2.12.5" }, "argocd_enabled": false, "auto_renew_certificates": false, "bin_dir": "/usr/local/bin", "calico_cni_name": "k8s-pod-network", "calico_pool_blocksize": 26, "cephfs_provisioner_enabled": false, "cert_manager_enabled": false, "cluster_name": "cluster.local", "container_manager": "containerd", "coredns_k8s_external_zone": "k8s_external.local", "credentials_dir": "/root/kubespray/inventory/mycluster/credentials", "default_kubelet_config_dir": "/etc/kubernetes/dynamic_kubelet_dir", "deploy_netchecker": false, "dns_domain": "cluster.local", "dns_mode": "coredns", "docker_bin_dir": "/usr/bin", "docker_container_storage_setup": false, "docker_daemon_graph": "/var/lib/docker", "docker_dns_servers_strict": false, "docker_iptables_enabled": "false", "docker_log_opts": "--log-opt max-size=50m --log-opt max-file=5", "docker_rpm_keepcache": 1, "enable_coredns_k8s_endpoint_pod_names": false, "enable_coredns_k8s_external": false, "enable_dual_stack_networks": false, "enable_nat_default_gateway": true, "enable_nodelocaldns": false, "enable_nodelocaldns_secondary": false, "etcd_data_dir": "/var/lib/etcd", "etcd_deployment_type": "host", "event_ttl_duration": "1h0m0s", "group_names": [ "k8s_cluster", "kube_node" ], "groups": { "all": [ "master0", "node1", "node2" ], "calico_rr": [], "etcd": [ "master0" ], "k8s_cluster": [ "master0", "node1", "node2" ], "kube_control_plane": [ "master0" ], "kube_node": [ "node1", "node2" ], "ungrouped": [] }, "helm_enabled": false, "ingress_alb_enabled": false, "ingress_nginx_enabled": false, "ingress_publish_status_address": "", "inventory_dir": "/root/kubespray/inventory/mycluster", "inventory_file": "/root/kubespray/inventory/mycluster/inventory.ini", "inventory_hostname": "node2", "inventory_hostname_short": "node2", "ip": "192.168.50.122", "k8s_image_pull_policy": "IfNotPresent", "kata_containers_enabled": false, "krew_enabled": false, "krew_root_dir": "/usr/local/krew", "kube_api_anonymous_auth": true, "kube_apiserver_ip": "10.233.0.1", "kube_apiserver_port": 6443, "kube_cert_dir": "/etc/kubernetes/ssl", "kube_cert_group": "kube-cert", "kube_config_dir": "/etc/kubernetes", "kube_encrypt_secret_data": false, "kube_log_level": 2, "kube_manifest_dir": "/etc/kubernetes/manifests", "kube_network_node_prefix": 24, "kube_network_node_prefix_ipv6": 120, "kube_network_plugin": "calico", "kube_network_plugin_multus": false, 
"kube_ovn_default_gateway_check": true, "kube_ovn_default_logical_gateway": false, "kube_ovn_default_vlan_id": 100, "kube_ovn_dpdk_enabled": false, "kube_ovn_enable_external_vpc": true, "kube_ovn_enable_lb": true, "kube_ovn_enable_np": true, "kube_ovn_enable_ssl": false, "kube_ovn_encap_checksum": true, "kube_ovn_external_address": "8.8.8.8", "kube_ovn_external_address_ipv6": "2400:3200::1", "kube_ovn_external_dns": "alauda.cn", "kube_ovn_hw_offload": false, "kube_ovn_network_type": "geneve", "kube_ovn_node_switch_cidr": "100.64.0.0/16", "kube_ovn_node_switch_cidr_ipv6": "fd00:100:64::/64", "kube_ovn_pod_nic_type": "veth_pair", "kube_ovn_traffic_mirror": false, "kube_ovn_tunnel_type": "geneve", "kube_ovn_vlan_name": "product", "kube_owner": "kube", "kube_pods_subnet": "10.233.64.0/18", "kube_pods_subnet_ipv6": "fd85:ee78:d8a6:8607::1:0000/112", "kube_proxy_mode": "iptables", "kube_proxy_nodeport_addresses": [], "kube_proxy_strict_arp": true, "kube_script_dir": "/usr/local/bin/kubernetes-scripts", "kube_service_addresses": "10.233.0.0/18", "kube_service_addresses_ipv6": "fd85:ee78:d8a6:8607::1000/116", "kube_token_dir": "/etc/kubernetes/tokens", "kube_version": "v1.25.5", "kube_webhook_token_auth": false, "kube_webhook_token_auth_url_skip_tls_verify": false, "kubeadm_certificate_key": "aafcdd1748c9accc1aaee3b4cf0aebdb4e0f052760f5d017ec581df2b9635c7d", "kubeadm_patches": { "dest_dir": "/etc/kubernetes/patches", "enabled": false, "source_dir": "/root/kubespray/inventory/mycluster/patches" }, "kubernetes_audit": false, "loadbalancer_apiserver_healthcheck_port": 8081, "loadbalancer_apiserver_port": 6443, "local_path_provisioner_enabled": false, "local_release_dir": "/tmp/releases", "local_volume_provisioner_enabled": false, "macvlan_interface": "eth1", "metallb_enabled": false, "metallb_speaker_enabled": false, "metrics_server_enabled": true, "ndots": 2, "no_proxy_exclude_workers": false, "nodelocaldns_bind_metrics_host_ip": false, "nodelocaldns_health_port": 9254, "nodelocaldns_ip": "169.254.25.10", "nodelocaldns_second_health_port": 9256, "nodelocaldns_secondary_skew_seconds": 5, "ntp_enabled": false, "ntp_manage_config": false, "ntp_servers": [ "0.pool.ntp.org iburst", "1.pool.ntp.org iburst", "2.pool.ntp.org iburst", "3.pool.ntp.org iburst" ], "omit": "omit_place_holdered2aedc52f45dc99038ab78567807d04d1b029c6", "persistent_volumes_enabled": false, "playbook_dir": "/root/kubespray", "podsecuritypolicy_enabled": false, "rbd_provisioner_enabled": false, "registry_enabled": false, "resolvconf_mode": "host_resolvconf", "retry_stagger": 5, "skydns_server": "10.233.0.3", "skydns_server_secondary": "10.233.0.4", "unsafe_show_logs": false, "volume_cross_zone_attachment": false } }

marekk1717 commented 1 year ago

I've got the same issue on Ubuntu 22.04. Do we have any workarounds?

rufy2022 commented 1 year ago

@marekk1717 my workaround is to select cni instead of calico and then install Calico manually from its YAML manifest. In the file group_vars/k8s_cluster/k8s-cluster.yml, set `kube_network_plugin: cni`.

https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml
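
A minimal sketch of that workaround, assuming the standard Kubespray inventory layout used above (the manifest URL is the one linked above):

```yaml
# inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
# "cni" makes Kubespray skip deploying a network plugin itself,
# so pod networking stays down until Calico is applied by hand.
kube_network_plugin: cni
```

After cluster.yml has finished, apply the Calico manifest from a control-plane node, for example with `kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml`.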

marekk1717 commented 1 year ago

Thanks rufy2022. Is there any way to remove Calico on a working cluster and add it manually? What do you think about Weave instead of Calico? I only need it for my test cluster on VMware, non-production only.

rufy2022 commented 1 year ago

I tried the same thing and also attempted to switch to Flannel, but the networking behaved the same way. The Ansible playbook does some additional network setup, which is why the routes end up wrong. I would recommend a fresh installation with cni, followed by a manual Calico installation.

marekk1717 commented 1 year ago

It must be something wrong with Debian/Ubuntu. I switched to Rocky Linux 8.7 and it works with the same config as on Ubuntu.

rufy2022 commented 1 year ago

FYI @kerryeon @oomichi, please fix it ;)

aussielunix commented 1 year ago

FYI Ubuntu 22.04 with Kubespray branch release-2.21

I am hitting this too if I put FQDNs in the inventory. But pods can ping across nodes if the inventory only uses hostnames.

```diff
 [all]
-k8s-master-1.example.com ansible_host=10.0.10.21
-k8s-master-2.example.com ansible_host=10.0.10.22
-k8s-master-3.example.com ansible_host=10.0.10.23
-k8s-node-1.example.com ansible_host=10.0.10.31
-k8s-node-2.example.com ansible_host=10.0.10.32
+k8s-master-1 ansible_host=10.0.10.21
+k8s-master-2 ansible_host=10.0.10.22
+k8s-master-3 ansible_host=10.0.10.23
+k8s-node-1 ansible_host=10.0.10.31
+k8s-node-2 ansible_host=10.0.10.32
...
...
...
```

aussielunix commented 1 year ago

I spoke too soon. I have deleted and rebuilt the cluster 4 times since the above, and it is failing again.

jonny-ervine commented 1 year ago

This looks like an issue in the VXLAN component of Debian. Try redeploying Kubernetes via Kubespray with the Calico variables calico_network_backend: bird, calico_ipip_mode: 'Always', and calico_vxlan_mode: 'Never'.

These variables are set in the inventory group_vars/k8s_cluster/k8s-net-calico.yml file.
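
For reference, a sketch of the relevant part of that file with the suggested values (the variable names are the Kubespray Calico variables quoted above):

```yaml
# inventory/mycluster/group_vars/k8s_cluster/k8s-net-calico.yml
# Use the BIRD backend with IP-in-IP encapsulation instead of VXLAN,
# avoiding the broken vxlan.calico routes shown earlier in this thread.
calico_network_backend: bird
calico_ipip_mode: 'Always'
calico_vxlan_mode: 'Never'
```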

gurmsc5 commented 1 year ago

> This looks like an issue in the VXLAN component of Debian. Try redeploying Kubernetes via Kubespray with the Calico variables calico_network_backend: bird, calico_ipip_mode: 'Always', and calico_vxlan_mode: 'Never'.
>
> These variables are set in the inventory group_vars/k8s_cluster/k8s-net-calico.yml file.

Thank you! I've been dealing with DNS resolution issues for several days and this resolved it (all my nodes are on Ubuntu 22.04).

VannTen commented 8 months ago

This looks like #10436, which was recently fixed.
/close
Feel free to reopen if it is in fact a different bug.

k8s-ci-robot commented 8 months ago

@VannTen: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/kubespray/issues/9601#issuecomment-1863002149):

> This looks like #10436, which was recently fixed.
> /close
> Feel free to reopen if it is in fact a different bug.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

mrzysztof commented 2 months ago

The issue still remains when deploying on Ubuntu 22.04 nodes.

The solution provided by @jonny-ervine does the trick, but the deployment does not work fully with the defaults (and it personally took me some time to figure out). Perhaps IP-in-IP should be the default for Debian/Ubuntu?

korallo159 commented 2 months ago

> This looks like an issue in the VXLAN component of Debian. Try redeploying Kubernetes via Kubespray with the Calico variables calico_network_backend: bird, calico_ipip_mode: 'Always', and calico_vxlan_mode: 'Never'.
>
> These variables are set in the inventory group_vars/k8s_cluster/k8s-net-calico.yml file.

You can also change this on a running Kubernetes cluster; you don't have to redeploy.

Change the setting used by calico-node in the ConfigMap (`k edit cm calico-config -n kube-system`): set the network backend to bird.

Change the default IPPool (`k edit ippool default-pool`): set the IP-in-IP mode to Always and the VXLAN mode to Never.
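
As a sketch, the pool spec after that edit could look roughly like this; the field names (`ipipMode`, `vxlanMode`) are the Calico IPPool CRD fields, and the CIDR and block size shown are the Kubespray defaults from the inventory earlier in this thread, so adjust them to your cluster:

```yaml
# k edit ippool default-pool  -- desired spec after the change
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: default-pool
spec:
  cidr: 10.233.64.0/18   # kube_pods_subnet
  blockSize: 26          # calico_pool_blocksize
  ipipMode: Always       # encapsulate with IP-in-IP (routes via tunl0)
  vxlanMode: Never       # stop using vxlan.calico
  natOutgoing: true
```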

ant31 commented 2 months ago

@mrzysztof you can propose a PR or open an issue to discuss the defaults.

VannTen commented 1 week ago

> The solution provided by @jonny-ervine does the trick, but the deployment does not work fully with the defaults (and it personally took me some time to figure out). Perhaps IP-in-IP should be the default for Debian/Ubuntu?

We already have way too much distro-specific stuff. Let's identify the exact problem instead; then we'll see whether there is a workaround (or a fix, if the issue is not in Calico / Debian).