Open umarfarooq-git opened 5 months ago
Hi @umarfarooq-git, sorry to hear about the troubles you have been having. Could you share the calico/vpp ds logs?
kubectl logs -n calico-vpp-dataplane calico-vpp-node-XYZ -c vpp
@onong, thank you for your response. The calico-vpp-node logs are below:
vagrant@kubemaster:~$ sudo kubectl logs calico-vpp-node-2p6kj -n calico-vpp-dataplane -c vpp
time="2024-04-01T02:51:37Z" level=info msg="Version info\nImage tag : ab81a775fbdeba932888690c68ddf7e9f4bd8d2b\nVPP-dataplane version : ab81a77 Release v3.27.0\nVPP Version : 24.02-rc0~8-g9db45f6ae\nBinapi-generator version : v0.8.0\nVPP Base commit : 06efd532e gerrit:34726/3 interface: add buffer stats api\n------------------ Cherry picked commits --------------------\ncapo: Calico Policies plugin\nacl: acl-plugin custom policies\ncnat: [WIP] no k8s maglev from pods\npbl: Port based balancer\ngerrit:40078/3 vnet: allow format deleted swifidx\ngerrit:40090/3 cnat: undo fib_entry_contribute_forwarding\ngerrit:39507/13 cnat: add flow hash config to cnat translation\ngerrit:34726/3 interface: add buffer stats api\n-------------------------------------------------------------\n"
time="2024-04-01T02:51:37Z" level=info msg="Config:NODENAME=kubemaster"
time="2024-04-01T02:51:37Z" level=info msg="Config:SERVICE_PREFIX=[10.96.0.0/12]"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_NATIVE_DRIVER="
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_INIT_SCRIPT_TEMPLATE="
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_HOOK_BEFORE_IF_READ=#!/bin/sh\n\nHOOK=\"$0\"\nchroot /host /bin/sh <<EOSCRIPT\n\nfix_dns () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; fixing dns...\"\n sed -i \"s/\[main\]/\[main\]\ndns=none/\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nundo_dns_fix () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; undoing dns fix...\"\n sed -i \"0,/dns=none/{/dns=none/d;}\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nrestart_network () {\n if systemctl status systemd-networkd > /dev/null 2>&1; then\n echo \"default_hook: system is using systemd-networkd; restarting...\"\n systemctl restart systemd-networkd\n elif systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; restarting...\"\n systemctl restart NetworkManager\n elif systemctl status networking > /dev/null 2>&1; then\n echo \"default_hook: system is using networking service; restarting...\"\n systemctl restart networking\n elif systemctl status network > /dev/null 2>&1; then\n echo \"default_hook: system is using network service; restarting...\"\n systemctl restart network\n else\n echo \"default_hook: Networking backend not detected, network configuration may fail\"\n fi\n}\n\nif which systemctl > /dev/null; then\n echo \"default_hook: using systemctl...\"\nelse\n echo \"default_hook: Init system not supported, network configuration may fail\"\n exit 1\nfi\n\nif [ \"$HOOK\" = \"BEFORE_VPP_RUN\" ]; then\n fix_dns\nelif [ \"$HOOK\" = \"VPP_RUNNING\" ]; then\n restart_network\nelif [ \"$HOOK\" = \"VPP_DONE_OK\" ]; then\n undo_dns_fix\n restart_network\nelif [ \"$HOOK\" = \"VPP_ERRORED\" ]; then\n undo_dns_fix\n 
restart_network\nfi\n\nEOSCRIPT\n"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_FEATURE_GATES={}"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_LOG_FORMAT="
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_CONFIG_TEMPLATE=unix {\n nodaemon\n full-coredump\n cli-listen /var/run/vpp/cli.sock\n pidfile /run/vpp/vpp.pid\n exec /etc/vpp/startup.exec\n}\napi-trace { on }\ncpu {\n workers 0\n}\nsocksvr {\n socket-name /var/run/vpp/vpp-api.sock\n}\nplugins {\n plugin default { enable }\n plugin dpdk_plugin.so { disable }\n plugin calico_plugin.so { enable }\n plugin ping_plugin.so { disable }\n plugin dispatch_trace_plugin.so { enable }\n}\nbuffers {\n buffers-per-numa 131072\n}"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_HOOK_VPP_ERRORED=#!/bin/sh\n\nHOOK=\"$0\"\nchroot /host /bin/sh <<EOSCRIPT\n\nfix_dns () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; fixing dns...\"\n sed -i \"s/\[main\]/\[main\]\ndns=none/\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nundo_dns_fix () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; undoing dns fix...\"\n sed -i \"0,/dns=none/{/dns=none/d;}\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nrestart_network () {\n if systemctl status systemd-networkd > /dev/null 2>&1; then\n echo \"default_hook: system is using systemd-networkd; restarting...\"\n systemctl restart systemd-networkd\n elif systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; restarting...\"\n systemctl restart NetworkManager\n elif systemctl status networking > /dev/null 2>&1; then\n echo \"default_hook: system is using networking service; restarting...\"\n systemctl restart networking\n elif systemctl status network > /dev/null 2>&1; then\n echo \"default_hook: system is using network service; restarting...\"\n systemctl restart network\n else\n echo \"default_hook: Networking backend not detected, network configuration may fail\"\n fi\n}\n\nif which systemctl > /dev/null; then\n echo \"default_hook: using systemctl...\"\nelse\n echo \"default_hook: Init system not supported, network configuration may fail\"\n exit 1\nfi\n\nif [ \"$HOOK\" = \"BEFORE_VPP_RUN\" ]; then\n fix_dns\nelif [ \"$HOOK\" = \"VPP_RUNNING\" ]; then\n restart_network\nelif [ \"$HOOK\" = \"VPP_DONE_OK\" ]; then\n undo_dns_fix\n restart_network\nelif [ \"$HOOK\" = \"VPP_ERRORED\" ]; then\n undo_dns_fix\n 
restart_network\nfi\n\nEOSCRIPT\n"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_SWAP_DRIVER="
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_CONFIG_EXEC_TEMPLATE="
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_IPSEC_IKEV2_PSK="
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_DEBUG={}"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_INTERFACES={\n \"defaultPodIfSpec\": {\n \"rx\": 1,\n \"tx\": 1,\n \"rxqsz\": 0,\n \"txqsz\": 0,\n \"isl3\": true,\n \"rxMode\": 0\n },\n \"maxPodIfSpec\": {\n \"rx\": 10,\n \"tx\": 10,\n \"rxqsz\": 1024,\n \"txqsz\": 1024,\n \"isl3\": null,\n \"rxMode\": 0\n },\n \"vppHostTapSpec\": {\n \"rx\": 1,\n \"tx\": 1,\n \"rxqsz\": 1024,\n \"txqsz\": 1024,\n \"isl3\": false,\n \"rxMode\": 0\n },\n \"uplinkInterfaces\": [\n {\n \"rx\": 0,\n \"tx\": 0,\n \"rxqsz\": 0,\n \"txqsz\": 0,\n \"isl3\": null,\n \"rxMode\": 0,\n \"isMain\": false,\n \"physicalNetworkName\": \"\",\n \"interfaceName\": \"enp0s8\",\n \"vppDriver\": \"\",\n \"newDriver\": \"\",\n \"annotations\": null,\n \"mtu\": 0\n }\n ]\n}"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_IPSEC={\n \"nbAsyncCryptoThreads\": 0,\n \"extraAddresses\": 0\n}"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_SRV6={\n \"localsidPool\": \"\",\n \"policyPool\": \"\"\n}"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_INITIAL_CONFIG={\n \"vppStartupSleepSeconds\": 1,\n \"corePattern\": \"/var/lib/vpp/vppcore.%e.%p\",\n \"extraAddrCount\": 0,\n \"ifConfigSavePath\": \"\",\n \"defaultGWs\": \"\",\n \"redirectToHostRules\": null\n}"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_HOOK_VPP_RUNNING=#!/bin/sh\n\nHOOK=\"$0\"\nchroot /host /bin/sh <<EOSCRIPT\n\nfix_dns () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; fixing dns...\"\n sed -i \"s/\[main\]/\[main\]\ndns=none/\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nundo_dns_fix () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; undoing dns fix...\"\n sed -i \"0,/dns=none/{/dns=none/d;}\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nrestart_network () {\n if systemctl status systemd-networkd > /dev/null 2>&1; then\n echo \"default_hook: system is using systemd-networkd; restarting...\"\n systemctl restart systemd-networkd\n elif systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; restarting...\"\n systemctl restart NetworkManager\n elif systemctl status networking > /dev/null 2>&1; then\n echo \"default_hook: system is using networking service; restarting...\"\n systemctl restart networking\n elif systemctl status network > /dev/null 2>&1; then\n echo \"default_hook: system is using network service; restarting...\"\n systemctl restart network\n else\n echo \"default_hook: Networking backend not detected, network configuration may fail\"\n fi\n}\n\nif which systemctl > /dev/null; then\n echo \"default_hook: using systemctl...\"\nelse\n echo \"default_hook: Init system not supported, network configuration may fail\"\n exit 1\nfi\n\nif [ \"$HOOK\" = \"BEFORE_VPP_RUN\" ]; then\n fix_dns\nelif [ \"$HOOK\" = \"VPP_RUNNING\" ]; then\n restart_network\nelif [ \"$HOOK\" = \"VPP_DONE_OK\" ]; then\n undo_dns_fix\n restart_network\nelif [ \"$HOOK\" = \"VPP_ERRORED\" ]; then\n undo_dns_fix\n 
restart_network\nfi\n\nEOSCRIPT\n"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_HOOK_VPP_DONE_OK=#!/bin/sh\n\nHOOK=\"$0\"\nchroot /host /bin/sh <<EOSCRIPT\n\nfix_dns () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; fixing dns...\"\n sed -i \"s/\[main\]/\[main\]\ndns=none/\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nundo_dns_fix () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; undoing dns fix...\"\n sed -i \"0,/dns=none/{/dns=none/d;}\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nrestart_network () {\n if systemctl status systemd-networkd > /dev/null 2>&1; then\n echo \"default_hook: system is using systemd-networkd; restarting...\"\n systemctl restart systemd-networkd\n elif systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; restarting...\"\n systemctl restart NetworkManager\n elif systemctl status networking > /dev/null 2>&1; then\n echo \"default_hook: system is using networking service; restarting...\"\n systemctl restart networking\n elif systemctl status network > /dev/null 2>&1; then\n echo \"default_hook: system is using network service; restarting...\"\n systemctl restart network\n else\n echo \"default_hook: Networking backend not detected, network configuration may fail\"\n fi\n}\n\nif which systemctl > /dev/null; then\n echo \"default_hook: using systemctl...\"\nelse\n echo \"default_hook: Init system not supported, network configuration may fail\"\n exit 1\nfi\n\nif [ \"$HOOK\" = \"BEFORE_VPP_RUN\" ]; then\n fix_dns\nelif [ \"$HOOK\" = \"VPP_RUNNING\" ]; then\n restart_network\nelif [ \"$HOOK\" = \"VPP_DONE_OK\" ]; then\n undo_dns_fix\n restart_network\nelif [ \"$HOOK\" = \"VPP_ERRORED\" ]; then\n undo_dns_fix\n 
restart_network\nfi\n\nEOSCRIPT\n"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_LOG_LEVEL=info"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_BGP_LOG_LEVEL=INFO"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_GRACEFUL_SHUTDOWN_TIMEOUT=10s"
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_INTERFACE="
time="2024-04-01T02:51:37Z" level=info msg="Config:CALICOVPP_HOOK_BEFORE_VPP_RUN=#!/bin/sh\n\nHOOK=\"$0\"\nchroot /host /bin/sh <<EOSCRIPT\n\nfix_dns () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; fixing dns...\"\n sed -i \"s/\[main\]/\[main\]\ndns=none/\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nundo_dns_fix () {\n if systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; undoing dns fix...\"\n sed -i \"0,/dns=none/{/dns=none/d;}\" /etc/NetworkManager/NetworkManager.conf\n systemctl daemon-reload\n systemctl restart NetworkManager\n fi\n}\n\nrestart_network () {\n if systemctl status systemd-networkd > /dev/null 2>&1; then\n echo \"default_hook: system is using systemd-networkd; restarting...\"\n systemctl restart systemd-networkd\n elif systemctl status NetworkManager > /dev/null 2>&1; then\n echo \"default_hook: system is using NetworkManager; restarting...\"\n systemctl restart NetworkManager\n elif systemctl status networking > /dev/null 2>&1; then\n echo \"default_hook: system is using networking service; restarting...\"\n systemctl restart networking\n elif systemctl status network > /dev/null 2>&1; then\n echo \"default_hook: system is using network service; restarting...\"\n systemctl restart network\n else\n echo \"default_hook: Networking backend not detected, network configuration may fail\"\n fi\n}\n\nif which systemctl > /dev/null; then\n echo \"default_hook: using systemctl...\"\nelse\n echo \"default_hook: Init system not supported, network configuration may fail\"\n exit 1\nfi\n\nif [ \"$HOOK\" = \"BEFORE_VPP_RUN\" ]; then\n fix_dns\nelif [ \"$HOOK\" = \"VPP_RUNNING\" ]; then\n restart_network\nelif [ \"$HOOK\" = \"VPP_DONE_OK\" ]; then\n undo_dns_fix\n restart_network\nelif [ \"$HOOK\" = \"VPP_ERRORED\" ]; then\n undo_dns_fix\n 
restart_network\nfi\n\nEOSCRIPT\n"
time="2024-04-01T02:51:37Z" level=info msg="-- Environment --"
time="2024-04-01T02:51:37Z" level=info msg="Hugepages 512"
time="2024-04-01T02:51:37Z" level=info msg="KernelVersion 4.15.0-212"
time="2024-04-01T02:51:37Z" level=info msg="Drivers map[uio_pci_generic:false vfio-pci:true]"
time="2024-04-01T02:51:37Z" level=info msg="initial iommu status N"
time="2024-04-01T02:51:37Z" level=info msg="-- Interface Spec --"
time="2024-04-01T02:51:37Z" level=info msg="Interface Name: enp0s8"
time="2024-04-01T02:51:37Z" level=info msg="Native Driver: "
time="2024-04-01T02:51:37Z" level=info msg="New Drive Name: "
time="2024-04-01T02:51:37Z" level=info msg="PHY target #Queues rx:0 tx:0"
time="2024-04-01T02:51:37Z" level=info msg="Tap MTU: 0"
time="2024-04-01T02:51:37Z" level=info msg="-- Interface config --"
time="2024-04-01T02:51:37Z" level=info msg="Node IP4: 192.168.56.2/24"
time="2024-04-01T02:51:37Z" level=info msg="Node IP6: "
time="2024-04-01T02:51:37Z" level=info msg="PciId: 0000:00:08.0"
time="2024-04-01T02:51:37Z" level=info msg="Driver: e1000"
time="2024-04-01T02:51:37Z" level=info msg="Linux IF was up ? true"
time="2024-04-01T02:51:37Z" level=info msg="Promisc was on ? true"
time="2024-04-01T02:51:37Z" level=info msg="DoSwapDriver: false"
time="2024-04-01T02:51:37Z" level=info msg="Mac: 08:00:27:10:1e:ec"
time="2024-04-01T02:51:37Z" level=info msg="Addresses: [192.168.56.2/24 enp0s8,fe80::a00:27ff:fe10:1eec/64]"
time="2024-04-01T02:51:37Z" level=info msg="Routes: [{Ifindex: 3 Dst: fe80::/64 Src:
You have enp0s8 and enp0s9 configured with IP addresses in the same subnet, 192.168.56.0/24, which might be confusing the routing. Bringing down enp0s9 might be a good idea, among other things.
Secondly, --pod-network-cidr=192.168.0.0/16 overlaps with the subnet used by enp0s8 and enp0s9. Please use a different CIDR.
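To illustrate the overlap, here is a minimal sketch using Python's standard `ipaddress` module. The enp0s8 address and the service prefix are taken from the logs above; the enp0s9 address and the alternative CIDR 10.244.0.0/16 are placeholders chosen for illustration, not values from this cluster, and `find_overlaps` is a hypothetical helper name.

```python
import ipaddress


def find_overlaps(pod_cidr, in_use):
    """Return the names of entries in `in_use` whose network overlaps pod_cidr.

    `in_use` maps a label (interface name, "service-prefix", ...) to an
    address or network in CIDR notation.
    """
    pod_net = ipaddress.ip_network(pod_cidr)
    clashes = []
    for name, cidr in in_use.items():
        # ip_interface accepts host addresses such as 192.168.56.2/24 and
        # exposes the enclosing network (192.168.56.0/24) via .network.
        net = ipaddress.ip_interface(cidr).network
        if pod_net.overlaps(net):
            clashes.append(name)
    return clashes


in_use = {
    "enp0s8": "192.168.56.2/24",        # from the vpp log above
    "enp0s9": "192.168.56.3/24",        # placeholder: same subnet, exact address assumed
    "service-prefix": "10.96.0.0/12",   # SERVICE_PREFIX from the vpp log above
}

print(find_overlaps("192.168.0.0/16", in_use))  # the CIDR used in this issue
print(find_overlaps("10.244.0.0/16", in_use))   # an example alternative CIDR
```

Any pod CIDR for which the function returns an empty list should avoid this particular clash; it does not check other networks the nodes may be able to reach.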
@onong After spending hours, I got it working. Thank you very much.
Solution:
I have another question that isn't directly related to this issue; I would be grateful for your response.
Does Calico VPP overall (particularly VPP's DPDK driver) work with KubeVirt if we want to accelerate network traffic of VM's inside K8S cluster? Unfortunately, I was unable to find any related docs.
Could you elaborate on what you mean by "we want to accelerate network traffic of VM's inside K8S cluster."?
I hit a similar issue in my K8s cluster that uses Azure Compute instances and CentOS 8. I have NetworkManager managing networking on the hosts.
Environment:
eth0 interface
When I deploy a netshoot pod right after applying the Calico installation and VPP manifests, the pod gets networked once calico-node comes up, and I get full DNS resolution from within that pod. However, any pods that I deploy after Calico VPP is fully initialized can't seem to reach the kube DNS service.
Here's what I get from a netshoot pod that was deployed before Calico VPP was fully initialized:
<<K9s-Shell>> Pod: default/netshoot | Container: netshoot
netshoot:~# nslookup kuberenetes
Server: 10.96.0.10
Address: 10.96.0.10#53
** server can't find kuberenetes: NXDOMAIN
netshoot:~# nslookup nginx-svc.uat
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: nginx-svc.uat.svc.cluster.local
Address: 10.103.29.82
netshoot:~# nslookup google.com
Server: 10.96.0.10
Address: 10.96.0.10#53
Non-authoritative answer:
Name: google.com
Address: 142.251.211.238
Name: google.com
Address: 2607:f8b0:400a:80b::200e
netshoot:~# nc -zvw2 10.96.0.10 53
Connection to 10.96.0.10 53 port [tcp/domain] succeeded!
Here's what I get from a netshoot2 pod that was deployed after Calico VPP was fully initialized:
<<K9s-Shell>> Pod: default/netshoot2 | Container: netshoot2
netshoot2:~# nslookup kuberentes
;; communications error to 10.96.0.10#53: timed out
netshoot2:~# nslookup nginx-svc.uat
;; communications error to 10.96.0.10#53: timed out
netshoot2:~# nslookup google.com
;; communications error to 10.96.0.10#53: timed out
netshoot2:~# nc -zvw2 10.96.0.10 53
nc: connect to 10.96.0.10 port 53 (tcp) timed out: Operation in progress
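The `nc -zvw2` check above can be repeated from any pod that has Python available. The following is a rough sketch of the same TCP reachability probe with a 2-second timeout; `tcp_reachable` is a hypothetical helper name, and it only tests whether a TCP handshake completes, not whether DNS answers.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection raises OSError (including timeouts and
        # connection refused) when the handshake does not complete.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `tcp_reachable("10.96.0.10", 53)` probes the kube DNS service IP shown in the sessions above. Running it from both a pod created before and a pod created after VPP initialization gives a like-for-like comparison.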
I restarted one of my coredns pods, and I see these logs in it:
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/ready: Still waiting on: "kubernetes"
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration SHA512 = 591cf328cccc12bc490481273e738df59329c62c0b729d94e8b61db9961c2fa5f046dd37f1cf888b953814040d180f52594972691cd6ff41be96639138a43908
CoreDNS-1.10.1
linux/amd64, go1.20, 055b2c3
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.96.0.1:443/version": dial tcp 10.96.0.1:443: i/o timeout
Here are the NetworkManager logs from one of the hosts:
-- Logs begin at Fri 2024-04-05 17:04:39 UTC, end at Fri 2024-04-05 18:01:59 UTC. --
Apr 05 17:04:59 master systemd[1]: Starting Network Manager...
Apr 05 17:04:59 master NetworkManager[922]: <info> [1712336699.9329] NetworkManager (version 1.32.10-4.el8) is starting... (for the first time)
Apr 05 17:04:59 master NetworkManager[922]: <info> [1712336699.9335] Read config: /etc/NetworkManager/NetworkManager.conf (etc: 99-dhcp-timeout.conf)
Apr 05 17:04:59 master systemd[1]: Started Network Manager.
Apr 05 17:04:59 master NetworkManager[922]: <info> [1712336699.9658] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Apr 05 17:04:59 master NetworkManager[922]: <info> [1712336699.9984] manager[0x5585b1af5040]: monitoring kernel firmware directory '/lib/firmware'.
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0032] hostname: hostname: using hostnamed
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0032] hostname: hostname changed from (none) to "master"
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0037] dns-mgr[0x5585b1ad8250]: init: dns=default,systemd-resolved rc-manager=symlink
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0149] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.32.10-4.el8/libnm-device-plugin-team.so)
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0150] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0152] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0153] manager: Networking is enabled by state file
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0162] dhcp-init: Using DHCP client 'internal'
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0176] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.32.10-4.el8/libnm-settings-plugin-ifcfg-rh.so")
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0176] settings: Loaded settings plugin: keyfile (internal)
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0279] device (lo): carrier: link connected
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0326] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0416] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0480] device (eth0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.0986] device (eth0): carrier: link connected
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1125] device (eth0): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1135] policy: auto-activating connection 'System eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1142] device (eth0): Activation: starting connection 'System eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1154] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1171] manager: NetworkManager state is now CONNECTING
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1180] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1199] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1214] dhcp4 (eth0): activation: beginning transaction (timeout in 300 seconds)
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1560] dhcp4 (eth0): state changed unknown -> bound, address=172.10.1.5
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1577] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1742] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1746] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1752] manager: NetworkManager state is now CONNECTED_LOCAL
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1757] manager: NetworkManager state is now CONNECTED_SITE
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1758] policy: set 'System eth0' (eth0) as default for IPv4 routing and DNS
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1771] device (eth0): Activation: successful, device activated.
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1780] manager: NetworkManager state is now CONNECTED_GLOBAL
Apr 05 17:05:00 master NetworkManager[922]: <info> [1712336700.1849] manager: startup complete
Apr 05 17:05:16 master systemd[1]: Reloading Network Manager.
Apr 05 17:05:17 master NetworkManager[922]: <info> [1712336717.0440] audit: op="reload" arg="0" pid=1862 uid=0 result="success"
Apr 05 17:05:17 master NetworkManager[922]: <info> [1712336717.0449] config: signal: SIGHUP (no changes from disk)
Apr 05 17:05:17 master systemd[1]: Reloaded Network Manager.
Apr 05 17:57:07 master systemd[1]: Stopping Network Manager...
Apr 05 17:57:07 master NetworkManager[922]: <info> [1712339827.2388] caught SIGTERM, shutting down normally.
Apr 05 17:57:07 master NetworkManager[922]: <info> [1712339827.2464] dhcp4 (eth0): canceled DHCP transaction
Apr 05 17:57:07 master NetworkManager[922]: <info> [1712339827.2465] dhcp4 (eth0): state changed bound -> terminated
Apr 05 17:57:07 master NetworkManager[922]: <info> [1712339827.2467] manager: NetworkManager state is now CONNECTED_SITE
Apr 05 17:57:07 master NetworkManager[922]: <info> [1712339827.2768] exiting (success)
Apr 05 17:57:07 master systemd[1]: NetworkManager.service: Succeeded.
Apr 05 17:57:07 master systemd[1]: Stopped Network Manager.
Apr 05 17:57:07 master systemd[1]: Starting Network Manager...
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.3291] NetworkManager (version 1.32.10-4.el8) is starting... (after a restart)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.3292] Read config: /etc/NetworkManager/NetworkManager.conf (etc: 99-dhcp-timeout.conf)
Apr 05 17:57:07 master systemd[1]: Started Network Manager.
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.3312] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.3422] manager[0x5571c321d0a0]: monitoring kernel firmware directory '/lib/firmware'.
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5734] hostname: hostname: using hostnamed
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5737] hostname: hostname changed from (none) to "master"
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5741] dns-mgr[0x5571c31ff250]: init: dns=none,systemd-resolved rc-manager=unmanaged
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5784] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.32.10-4.el8/libnm-device-plugin-team.so)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5785] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5785] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5787] manager: Networking is enabled by state file
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5788] dhcp-init: Using DHCP client 'internal'
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5795] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.32.10-4.el8/libnm-settings-plugin-ifcfg-rh.so")
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5824] settings: Loaded settings plugin: keyfile (internal)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5856] device (lo): carrier: link connected
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5860] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5871] device (eth0): carrier: link connected
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5879] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5896] manager: (eth0): assume: will attempt to assume matching connection 'System eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03) (indicated)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5897] device (eth0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5905] device (eth0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5938] device (eth0): Activation: starting connection 'System eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5960] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5965] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5968] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.5972] dhcp4 (eth0): activation: beginning transaction (timeout in 300 seconds)
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6368] dhcp4 (eth0): state changed unknown -> bound, address=172.10.1.5
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6422] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6470] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6473] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6479] manager: NetworkManager state is now CONNECTED_LOCAL
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6487] manager: NetworkManager state is now CONNECTED_SITE
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6489] policy: set 'System eth0' (eth0) as default for IPv4 routing and DNS
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6497] device (eth0): Activation: successful, device activated.
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6506] manager: NetworkManager state is now CONNECTED_GLOBAL
Apr 05 17:57:07 master NetworkManager[40556]: <info> [1712339827.6511] manager: startup complete
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.3861] device (eth0): state change: activated -> unmanaged (reason 'removed', sys-iface-state: 'removed')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.3871] dhcp4 (eth0): canceled DHCP transaction
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.3871] dhcp4 (eth0): state changed bound -> terminated
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.3889] manager: NetworkManager state is now DISCONNECTED
Apr 05 17:57:10 master NetworkManager[40556]: <warn> [1712339830.3907] dns-sd-resolved[986b7b74fdcc1af0]: send-updates SetLinkDomains@2 failed: GDBus.Error:org.freedesktop.resolve1.NoSuchLink: Link 2 not known
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.5153] manager: (eth0): new Tun device (/org/freedesktop/NetworkManager/Devices/3)
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6193] device (eth0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6240] device (eth0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6249] device (eth0): Activation: starting connection 'eth0' (40c97394-9ec0-43b9-9948-67cc8534ed18)
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6271] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6274] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6277] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6278] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6317] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6321] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6330] manager: NetworkManager state is now CONNECTED_LOCAL
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6335] device (eth0): Activation: successful, device activated.
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.6342] manager: NetworkManager state is now CONNECTED_GLOBAL
Apr 05 17:57:10 master systemd[1]: Stopping Network Manager...
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.7324] caught SIGTERM, shutting down normally.
Apr 05 17:57:10 master NetworkManager[40556]: <info> [1712339830.7339] manager: NetworkManager state is now CONNECTED_LOCAL
Apr 05 17:57:11 master NetworkManager[40556]: <info> [1712339831.0554] exiting (success)
Apr 05 17:57:11 master systemd[1]: NetworkManager.service: Succeeded.
Apr 05 17:57:11 master systemd[1]: Stopped Network Manager.
Apr 05 17:57:11 master systemd[1]: Starting Network Manager...
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1293] NetworkManager (version 1.32.10-4.el8) is starting... (after a restart)
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1294] Read config: /etc/NetworkManager/NetworkManager.conf (etc: 99-dhcp-timeout.conf)
Apr 05 17:57:11 master systemd[1]: Started Network Manager.
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1349] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1476] manager[0x5621ca5c4040]: monitoring kernel firmware directory '/lib/firmware'.
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1502] hostname: hostname: using hostnamed
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1502] hostname: hostname changed from (none) to "master"
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1507] dns-mgr[0x5621ca5a9250]: init: dns=none,systemd-resolved rc-manager=unmanaged
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1571] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.32.10-4.el8/libnm-device-plugin-team.so)
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1572] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1573] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1574] manager: Networking is enabled by state file
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1575] dhcp-init: Using DHCP client 'internal'
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1582] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.32.10-4.el8/libnm-settings-plugin-ifcfg-rh.so")
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1583] settings: Loaded settings plugin: keyfile (internal)
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1617] device (lo): carrier: link connected
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1621] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1660] manager: (eth0): new Tun device (/org/freedesktop/NetworkManager/Devices/2)
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1677] manager: (eth0): assume: will attempt to assume matching connection 'eth0' (40c97394-9ec0-43b9-9948-67cc8534ed18) (indicated)
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1679] device (eth0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1688] device (eth0): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1717] device (eth0): Activation: starting connection 'eth0' (40c97394-9ec0-43b9-9948-67cc8534ed18)
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1742] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1746] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1749] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1836] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1849] device (eth0): ipv6: duplicate address check failed for the fe80::222:48ff:febc:738e/64 lft forever pref forever lifetime 1-0[4294967295,4294967295] dev 3 flags permanent,tentative src kernel address
Apr 05 17:57:11 master NetworkManager[40674]: <warn> [1712339831.1922] acd[0x5621ca6580f0,3]: conflict for address 172.10.1.5 detected with host 02:CA:11:C0:FD:00 on interface 'eth0'
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1924] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1928] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'assume')
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1934] manager: NetworkManager state is now CONNECTED_LOCAL
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1941] manager: NetworkManager state is now CONNECTED_SITE
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1942] policy: set 'eth0' (eth0) as default for IPv4 routing and DNS
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1947] device (eth0): Activation: successful, device activated.
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1957] manager: NetworkManager state is now CONNECTED_GLOBAL
Apr 05 17:57:11 master NetworkManager[40674]: <info> [1712339831.1969] manager: startup complete
I tried restarting NetworkManager on each host and then created a netshoot3 pod, but it shows the same behavior as netshoot2, i.e. no DNS resolution from within the pod.
I noticed that on both of my cluster hosts (master, worker1) I have these log entries in the NetworkManager logs
# on master host
Apr 05 17:57:11 master NetworkManager[40674]: <warn> [1712339831.1922] acd[0x5621ca6580f0,3]: conflict for address 172.10.1.5 detected with host 02:CA:11:C0:FD:00 on interface 'eth0'
# on worker1 host
Apr 05 22:40:36 worker1 NetworkManager[64135]: <warn> [1712356836.6727] acd[0x55a38f87a590,3]: conflict for address 172.10.1.4 detected with host 02:CA:11:C0:FD:00 on interface 'eth0'
Is this a typical message when VPP is taking over the interface, or could it be an indicator of some other problem?
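To see whether the same conflicting MAC shows up across hosts and addresses, the `acd` warnings can be tallied from the NetworkManager log. This is a minimal sketch, assuming the log line format quoted above; the function name `acd_conflicts` is hypothetical and not part of any tool.

```bash
#!/usr/bin/env bash
# Summarize NetworkManager ACD conflict warnings read from stdin.
# Prints one "<count> <address> <mac>" line per distinct (address, MAC) pair.
# Assumes the "conflict for address ... detected with host ..." wording shown above.
acd_conflicts() {
  sed -n "s/.*conflict for address \([0-9.]*\) detected with host \([0-9A-Fa-f:]*\) on interface.*/\1 \2/p" |
    sort | uniq -c | awk '{print $1, $2, $3}'
}

# Demo with the two warnings quoted above.
acd_conflicts <<'EOF'
Apr 05 17:57:11 master NetworkManager[40674]: <warn> [1712339831.1922] acd[0x5621ca6580f0,3]: conflict for address 172.10.1.5 detected with host 02:CA:11:C0:FD:00 on interface 'eth0'
Apr 05 22:40:36 worker1 NetworkManager[64135]: <warn> [1712356836.6727] acd[0x55a38f87a590,3]: conflict for address 172.10.1.4 detected with host 02:CA:11:C0:FD:00 on interface 'eth0'
EOF
# prints:
# 1 172.10.1.4 02:CA:11:C0:FD:00
# 1 172.10.1.5 02:CA:11:C0:FD:00
```

On a live host the same function can be fed from the journal, e.g. `journalctl -u NetworkManager --no-pager | acd_conflicts`.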
@onong I want to deploy Virtual Network Functions (VNFs) on VMs inside a k8s cluster using KubeVirt. These VNFs have some dependencies such as DPDK and SR-IOV. I am interested in providing DPDK support to the VMs using the Calico VPP plugin, but I am not sure whether that is really possible.
@onong Could you please look into the following issue? I am stuck again while configuring Calico VPP on the same platform, this time with 2 nodes.
Setup: 1 master node, 1 worker node
Details: master node
worker node
Pod status
Problem 1: At first I faced the same trouble as before: the CoreDNS pods and the calico kube controller were stuck in the ContainerCreating state. Solution: configured NetworkManager.
Problem 2: Now the Calico pods are running, but control-plane pods such as kube-controller-manager-kubemaster and kube-scheduler-kubemaster, as well as tigera-operator-6bfc79cb9c-v2qcx, run only momentarily and then go into CrashLoopBackOff.
Attempted solution (unsuccessful): updated the scheduler and controller-manager static pods with the following. command:
Logs: kube Manager
kube Manager
tigera-operator
I would be grateful for any clue about the trouble I am facing. I am really confused by the 'Leader Election' error, since I don't even have two master nodes.
@umarfarooq-git The logs seem to indicate that the apiserver is not responding. Could you share the apiserver logs?
@onong I want to deploy Virtual Network Functions (VNFs) on VMs inside a k8s cluster using KubeVirt. These VNFs have some dependencies such as DPDK and SR-IOV. I am interested in providing DPDK support to the VMs using the Calico VPP plugin, but I am not sure whether that is really possible.
Just so we are on the same page, in a Calico VPP cluster, the main/uplink interface is consumed by VPP using one of the supported uplink drivers (af_packet, DPDK etc) and the pods are presented with a tuntap interface.
By configuring Calico VPP to use DPDK (assuming that the NIC is supported by DPDK), your pods/VNFs are indirectly using DPDK. But I guess that's not what you have in mind :)
So, could you describe your usage scenario with the VNF, DPDK and SR-IOV in the above framework? Are you looking to set up another NIC for the VNF to be consumed by DPDK?
# on master host
Apr 05 17:57:11 master NetworkManager[40674]: <warn> [1712339831.1922] acd[0x5621ca6580f0,3]: conflict for address 172.10.1.5 detected with host 02:CA:11:C0:FD:00 on interface 'eth0'
# on worker1 host
Apr 05 22:40:36 worker1 NetworkManager[64135]: <warn> [1712356836.6727] acd[0x55a38f87a590,3]: conflict for address 172.10.1.4 detected with host 02:CA:11:C0:FD:00 on interface 'eth0'
@ivansharamok, the host with MAC 02:CA:11:C0:FD:00 seems to be causing the conflict on both the master and the worker. DHCP misconfiguration? Maybe find the culprit host 02:CA:11:C0:FD:00 and shut it down?
There is no host with MAC 02:CA:11:C0:FD:00 in my setup. Below is the ifconfig output from the 2 hosts which I'm using to test Calico VPP in my cluster.
# master host ifconfig output
[azureuser@master ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.10.1.4 netmask 255.255.255.0 broadcast 172.10.1.255
ether 00:22:48:b7:a1:9f txqueuelen 1000 (Ethernet)
RX packets 85927 bytes 647465868 (617.4 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 66197 bytes 57170711 (54.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 259210 bytes 147938219 (141.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 259210 bytes 147938219 (141.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
# worker1 host ifconfig output
[azureuser@worker1 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.10.1.5 netmask 255.255.255.0 broadcast 172.10.1.255
inet6 fe80::20d:3aff:fef6:4fbe prefixlen 64 scopeid 0x20<link>
ether 00:0d:3a:f6:4f:be txqueuelen 1000 (Ethernet)
RX packets 170488 bytes 805979391 (768.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 152307 bytes 161211067 (153.7 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 775 bytes 138868 (135.6 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 775 bytes 138868 (135.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
I noticed today that on the worker1 host I get this warn log entry in the NetworkManager log:
Apr 08 23:30:32 worker1 NetworkManager[9055]: <warn> [1712619032.7995] dns-sd-resolved[43db7271bfe3a803]: send-updates SetLinkDomains@2 failed: GDBus.Error:org.freedesktop.resolve1.NoSuchLink: Link 2 not known
Do you know if Calico VPP was ever tested on Azure VMs? I'm starting to suspect that Azure VMs of Standard_D4s_v3 size may not like it when VPP tries to take over the Azure-managed primary interface on the VM.
@ivansharamok, we have not tested with Azure VMs afaik.
There is no host with MAC 02:CA:11:C0:FD:00 in my setup. Below is ifconfig output from 2 hosts which I'm using to test Calico VPP in my cluster.
What I meant was that there might be another node (with MAC 02:CA:11:C0:FD:00) in the subnet that has been assigned the IP address belonging to the worker/master node. Try arping and see if you get a response:
arping -I eth0 <master/worker IP addr>
I don't have any other nodes in the subnet. In my test environment I'm building all the resources from scratch with terraform. I have a dedicated VPC with only 2 Azure Compute instances within the VPC.
# arping on master node
[azureuser@master ~]$ arping -c2 -I eth0 172.10.1.5
ARPING 172.10.1.5 from 172.10.1.4 eth0
Unicast reply from 172.10.1.5 [12:34:56:78:9A:BC] 0.864ms
Unicast reply from 172.10.1.5 [12:34:56:78:9A:BC] 0.964ms
Sent 2 probes (1 broadcast(s))
Received 2 response(s)
# arping on worker1 node
[azureuser@worker1 ~]$ arping -c2 -I eth0 172.10.1.4
ARPING 172.10.1.4 from 172.10.1.5 eth0
Unicast reply from 172.10.1.4 [12:34:56:78:9A:BC] 1.031ms
Unicast reply from 172.10.1.4 [12:34:56:78:9A:BC] 1.139ms
Sent 2 probes (1 broadcast(s))
Received 2 response(s)
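Both replies above carry the same MAC for two different IPs. A minimal sketch for pulling the distinct responder MACs out of saved `arping` output follows; the function name `arping_macs` is hypothetical. If a single MAC answers for every address, the replies are likely coming from the network fabric rather than from the peer's own NIC, which as far as I know is how Azure virtual networks answer ARP.

```bash
#!/usr/bin/env bash
# Print the distinct MAC addresses that answered, given arping output on stdin.
# Assumes the "reply from <ip> [<mac>]" line format shown above.
arping_macs() {
  sed -n 's/.*reply from [0-9.]* \[\([0-9A-Fa-f:]*\)].*/\1/p' | tr 'a-f' 'A-F' | sort -u
}

# Demo with the master node's output quoted above.
arping_macs <<'EOF'
ARPING 172.10.1.5 from 172.10.1.4 eth0
Unicast reply from 172.10.1.5 [12:34:56:78:9A:BC] 0.864ms
Unicast reply from 172.10.1.5 [12:34:56:78:9A:BC] 0.964ms
Sent 2 probes (1 broadcast(s))
Received 2 response(s)
EOF
# prints: 12:34:56:78:9A:BC
```

On a live node this would be used as `arping -c2 -I eth0 <peer IP addr> | arping_macs`, and the result compared with the `ether` value in the peer's `ifconfig` output.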
@onong Thank you for responding. The issue I was facing while setting up Calico VPP is resolved; the problem was a shortage of memory on the master node. I got the clue from the following question: https://stackoverflow.com/questions/75148975/leaderelections-failing-lease-unable-to-be-renewed-automatically
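Since the root cause here turned out to be memory pressure on the master node, a quick check of available memory can save time before digging into leader-election errors. The sketch below reads `MemAvailable` from `/proc/meminfo`; the function names and the 1024 MiB default threshold are hypothetical choices for illustration, not a documented Kubernetes or Calico VPP requirement.

```bash
#!/usr/bin/env bash
# Print MemAvailable in MiB from a meminfo-format file (default: /proc/meminfo).
mem_available_mib() {
  awk '/^MemAvailable:/ {print int($2 / 1024)}' "${1:-/proc/meminfo}"
}

# Report whether available memory is below a threshold in MiB.
# The 1024 MiB default is an arbitrary illustrative value.
check_memory() {
  local threshold="${1:-1024}"
  local file="${2:-/proc/meminfo}"
  local avail
  avail="$(mem_available_mib "$file")"
  if [ "$avail" -lt "$threshold" ]; then
    echo "low memory: ${avail} MiB available, threshold ${threshold} MiB"
    return 1
  fi
  echo "ok: ${avail} MiB available"
}
```

Running `check_memory 2048` on the master node would flag the condition when less than 2048 MiB is available.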
Regarding Calico VPP support for VMs inside a k8s cluster: my goal is exactly what you mentioned. I have a device with more than one NIC (with DPDK support). I want to run a VNF (a VM-based network function) on such a device using KubeVirt inside a K8s cluster, and I want to use Calico VPP with DPDK drivers to accelerate network traffic for that VNF through one of the available NICs.
@umarfarooq-git you might want to go through the multinet doc and see if it matches what you are looking for:
https://github.com/projectcalico/vpp-dataplane/blob/master/docs/multinet.md
@ivansharamok, sorry for the delayed response. The warning log msg around address conflict in the NM logs is probably ok given that azure networking differs from conventional networking somewhat. Sorry for making you chase that lead :(
However, if the network connectivity between the two nodes is ok then things should work fine. But like I mentioned earlier, we have not tested on azure so can't say for sure. The issue you are seeing is probably due to some quirk in azure networking.
Environment
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.8
# Provision Master Nodes
(1..NUM_MASTER_NODE).each do |i|
  config.vm.define "kubemaster" do |node|
    # Name shown in the GUI
    node.vm.provider "virtualbox" do |vb|
      vb.name = "kubemaster"
      vb.memory = 4096
      vb.cpus = 4
    end
    node.vm.hostname = "kubemaster"
    node.vm.network :private_network, ip: IP_NW + "#{MASTER_IP_START + i}"
    node.vm.network "forwarded_port", guest: 22, host: "#{2710 + i}"
    node.vm.network "private_network", ip: "192.168.56.10", virtualbox__hostonly: true
    node.vm.provision "setup-hosts", :type => "shell", :path => "ubuntu/vagrant/setup-hosts.sh" do |s|
      s.args = ["enp0s8"]
    end
    node.vm.provision "setup-dns", type: "shell", :path => "ubuntu/update-dns.sh"
  end
end
end
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true