Closed xelor81 closed 5 years ago
@xelor81 Can you please check if you have ebtables installed on the VM? I am assuming this is a vm running in Azure.
@sharmasushant I have checked and ebtables is present, but there are no rules defined:
jluzny-eastus2-coremaster1 ~ # ebtables -L
Bridge table: filter
Bridge chain: INPUT, entries: 0, policy: ACCEPT
Bridge chain: FORWARD, entries: 0, policy: ACCEPT
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
- ebtables NAT table
jluzny-eastus2-coremaster1 ~ # ebtables -t nat -L
Bridge table: nat
Bridge chain: PREROUTING, entries: 0, policy: ACCEPT
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
Bridge chain: POSTROUTING, entries: 0, policy: ACCEPT
Yes, the VM is located in the Azure cloud. CoreOS comes from the image available in Azure. Details of the image:
image_publisher = "CoreOS"
image_offer = "CoreOS"
image_sku = "Stable"
image_version = "latest"
A word of comment: I also installed a bare Ubuntu 16.04 VM (using Resource Manager in Azure with default settings), then installed Docker on it following this guide: https://docs.docker.com/cs-engine/1.13/
Later I created the systemd units and all related components, and that got me to the same error 255.
I also poked at an AKS node for comparison. On the AKS node (also based on Ubuntu 16.04), ebtables is present as well, and rules for the NAT table are defined:
root@aks-es-85070152-0:~# ebtables -t nat -L
Bridge table: nat
Bridge chain: PREROUTING, entries: 15, policy: ACCEPT
-p ARP -i eth0 --arp-op Reply -j dnat --to-dst ff:ff:ff:ff:ff:ff --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.50 -j arpreply --arpreply-mac 62:fc:29:12:f5:94
-p IPv4 -i eth0 --ip-dst 10.3.6.50 -j dnat --to-dst 62:fc:29:12:f5:94 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.53 -j arpreply --arpreply-mac 9a:c2:9e:dd:94:f7
-p IPv4 -i eth0 --ip-dst 10.3.6.53 -j dnat --to-dst 9a:c2:9e:dd:94:f7 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.36 -j arpreply --arpreply-mac 36:20:3f:33:ab:45
-p IPv4 -i eth0 --ip-dst 10.3.6.36 -j dnat --to-dst 36:20:3f:33:ab:45 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.41 -j arpreply --arpreply-mac 7a:3e:61:c5:c2:13
-p IPv4 -i eth0 --ip-dst 10.3.6.41 -j dnat --to-dst 7a:3e:61:c5:c2:13 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.51 -j arpreply --arpreply-mac 36:9c:5c:d2:93:a7
-p IPv4 -i eth0 --ip-dst 10.3.6.51 -j dnat --to-dst 36:9c:5c:d2:93:a7 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.37 -j arpreply --arpreply-mac fa:b4:3a:26:25:96
-p IPv4 -i eth0 --ip-dst 10.3.6.37 -j dnat --to-dst fa:b4:3a:26:25:96 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.40 -j arpreply --arpreply-mac 76:2e:aa:8e:df:64
-p IPv4 -i eth0 --ip-dst 10.3.6.40 -j dnat --to-dst 76:2e:aa:8e:df:64 --dnat-target ACCEPT
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
-s Unicast -o eth0 -j snat --to-src 0:d:3a:3:4d:4a --snat-arp --snat-target ACCEPT
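As far as I can tell from the dump above, the PREROUTING entries come in pairs per pod IP: an arpreply rule answering ARP requests for the pod address, and an IPv4 dnat rule rewriting the destination MAC to the pod's veth. A quick way to tally the attached pod endpoints from a saved dump is a sketch like this (sample lines inlined here, trimmed from the AKS output; on a real node you would feed it `ebtables -t nat -L` output, and the per-pair counting heuristic is my own assumption):

```shell
# Sample of the per-pod rule pairs from the AKS node's NAT table
cat > /tmp/ebtables-nat-sample.txt <<'EOF'
-p ARP --arp-op Request --arp-ip-dst 10.3.6.50 -j arpreply --arpreply-mac 62:fc:29:12:f5:94
-p IPv4 -i eth0 --ip-dst 10.3.6.50 -j dnat --to-dst 62:fc:29:12:f5:94 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.53 -j arpreply --arpreply-mac 9a:c2:9e:dd:94:f7
-p IPv4 -i eth0 --ip-dst 10.3.6.53 -j dnat --to-dst 9a:c2:9e:dd:94:f7 --dnat-target ACCEPT
EOF
# One arpreply rule per pod IP, so this counts attached pod endpoints:
grep -c 'arpreply-mac' /tmp/ebtables-nat-sample.txt
```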
I also found some details in journalctl indicating that a persistent MAC address cannot be generated for the azure0 interface.
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448798 1478 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "d8bcef579082154f8>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448842 1478 kuberuntime_sandbox.go:56] CreatePodSandbox for pod "kube-dns-v20-596c55dc95-dqdhp_kube-system(8285d63e-d605-11e8-81d5-000d3a0e231a)" failed: rpc error>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448857 1478 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-v20-596c55dc95-dqdhp_kube-system(8285d63e-d605-11e8-81d5-000d3a0e231a)" failed: rpc erro>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448938 1478 pod_workers.go:186] Error syncing pod 8285d63e-d605-11e8-81d5-000d3a0e231a ("kube-dns-v20-596c55dc95-dqdhp_kube-system(8285d63e-d605-11e8-81d5-000d3a0e>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:38.450187 1478 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "kube-dns-v20-596c55dc95-dqdhp_kub>
Oct 23 12:52:39 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:39.656223 1478 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "kube-dns-v20-596c55dc95-dqdhp_kub>
Oct 23 12:52:39 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:39.722284 1478 pod_container_deletor.go:75] Container "d8bcef579082154f8623fecd251c170def074d47045ba614159eb387314cd427" not found in pod's containers
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:40.024821 1478 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "d8bcef579082154f8623fecd251c170d>
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] Telemetry sent with status code 200
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] SetReportState succeeded
Oct 23 12:52:41 jluzny-eastus2-coremaster1 env[712]: time="2018-10-23T12:52:41Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/c41d026ded44808f1e94a5ab95a689f61b3ef305db66908ca20bd0bd76a6b476/shim.sock" debu>
Oct 23 12:52:41 jluzny-eastus2-coremaster1 env[712]: time="2018-10-23T12:52:41Z" level=debug msg="registering ttrpc server"
Oct 23 12:52:41 jluzny-eastus2-coremaster1 env[712]: time="2018-10-23T12:52:41Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 23 12:52:43 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:43.765710 1478 pod_container_deletor.go:75] Container "c41d026ded44808f1e94a5ab95a689f61b3ef305db66908ca20bd0bd76a6b476" not found in pod's containers
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-udevd[41910]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-udevd[41910]: Could not generate persistent MAC address for azure0: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: Lost carrier
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: DHCP lease lost
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: IPv6: ADDRCONF(NETDEV_UP): azure0: link is not ready
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot configure proxy NDP for interface: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot configure IPv6 privacy extension for interface: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot disable kernel IPv6 accept_ra for interface: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: docker0: Link is not managed by us
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot enable IPv6 for interface azure0: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: Gained carrier
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: Link readded
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:44 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: DHCPv4 address 10.3.6.4/23 via 10.3.6.1
Oct 23 12:52:44 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
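The two telltale symptoms in the dump above (the udev persistent-MAC warning and the CNI sandbox failures) can be isolated with a grep like this. The log is inlined below as a shortened sample; on the node you would pipe `journalctl -b` into the same filter instead (the file path is made up):

```shell
# Shortened excerpt of the journal lines above
cat > /tmp/journal-sample.log <<'EOF'
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 RunPodSandbox from runtime service failed: rpc error
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-udevd[41910]: Could not generate persistent MAC address for azure0: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: Lost carrier
EOF
# Keep only the two interesting failure signatures:
grep -E 'persistent MAC|RunPodSandbox.*failed' /tmp/journal-sample.log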
After looking up the error from the journalctl log, I applied a workaround: https://github.com/naspersclassifieds-shared/coreos-kubernetes/commit/b18adcafabae0f20469c3530027a0f3aca09b5ba
The persistent MAC address issue is no longer visible in the logs; however, there are other errors now:
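For anyone hitting the same warning: my reading of that workaround (an assumption based on the nature of the linked commit, not a verbatim copy of it) is a systemd .link drop-in that stops systemd-udevd from trying to assign a persistent MAC to bridge-created interfaces like azure0. Written to /tmp here for illustration; the real location would be /etc/systemd/network/:

```shell
# Hypothetical .link drop-in: disable persistent-MAC generation for bridges.
# Real deployment path would be /etc/systemd/network/99-bridge-mac.link.
cat > /tmp/99-bridge-mac.link <<'EOF'
[Match]
Driver=bridge

[Link]
MACAddressPolicy=none
EOF
```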
Oct 29 08:54:45 jluzny-eastus2-coremaster1 docker[1642]: 2018/10/29 08:54:45 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Oct 29 08:55:15 jluzny-eastus2-coremaster1 docker[1642]: 2018/10/29 08:55:15 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Oct 29 08:55:45 jluzny-eastus2-coremaster1 docker[1642]: 2018/10/29 08:55:45 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1083]: I1029 08:56:00.367225 1 controller.go:597] quota admission added evaluator for: {apps replicasets}
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1085]: I1029 08:56:00.369241 1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"kube-system", Name:"kube-dns-v20", UID:"c1f07169-d76b-11e8-80c0-000d3a0e231a", API>
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1085]: I1029 08:56:00.398902 1 event.go:221] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"kube-system", Name:"kube-dns-v20-596c55dc95", UID:"c1f993fe-d76b-11e8-80c0-000d3a0>
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1138]: I1029 08:56:00.798363 1490 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-dns-kubelet" (UniqueName: "kubernetes.io/host-path/75280c1>
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1138]: I1029 08:56:00.798500 1490 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-dns-token-z7q42" (UniqueName: "kubernetes.io/secret/75280c>
Oct 29 08:56:02 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:02Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/ae53a77183a4845f0ef7ee0e0d8f632601647726d73e35d36a1116ee30d4bf90/shim.sock" debu>
Oct 29 08:56:02 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:02Z" level=debug msg="registering ttrpc server"
Oct 29 08:56:02 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:02Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] File not exist /var/run/AzureCNITelemetry.json
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 GetReport state file didn't exist. Setting flag to true
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] &{IsNewInstance:true CniSucceeded:false Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] SetReportState succeeded
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-udevd[2300]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: Network configuration changed, trying to establish connection.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Lost carrier
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: IPv6: ADDRCONF(NETDEV_UP): azure0: link is not ready
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCP lease lost
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained carrier
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: docker0: Link is not managed by us
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: IPv6 successfully enabled
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 29 08:56:03 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-networkd[619]: docker0: Link is not managed by us
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCPv4 address 10.3.6.4/23 via 10.3.6.1
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] &{IsNewInstance:false CniSucceeded:false Name:CNI OSVersion:v1.0.12 ErrorMessage:Failed to create network: exit status 255 Context:AzureCNI SubContex>
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] SetReportState succeeded
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:03.900107 1490 cni.go:260] Error adding network: Failed to create network: exit status 255
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:03.900135 1490 cni.go:228] Error while adding to cni network: Failed to create network: exit status 255
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained IPv6LL
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] SetReportState succeeded
Oct 29 08:56:04 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:04Z" level=info msg="shim reaped" id=ae53a77183a4845f0ef7ee0e0d8f632601647726d73e35d36a1116ee30d4bf90
Oct 29 08:56:04 jluzny-eastus2-coremaster1 env[891]: time="2018-10-29T08:56:04.148924485Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280245 1490 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "ae53a77183a4845f0>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280316 1490 kuberuntime_sandbox.go:56] CreatePodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc error>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280341 1490 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc erro>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280432 1490 pod_workers.go:186] Error syncing pod 75280c11-db58-11e8-a1bb-000d3a0e231a ("kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:04.543128 1490 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "kube-dns-v20-596c55dc95-6vgpx_kub>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:04.543989 1490 pod_container_deletor.go:75] Container "ae53a77183a4845f0ef7ee0e0d8f632601647726d73e35d36a1116ee30d4bf90" not found in pod's containers
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:04.846602 1490 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "ae53a77183a4845f0ef7ee0e0d8f6326>
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] SetReportState succeeded
Oct 29 08:56:05 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:05Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/2b69cdef7b72a0d3ca52cead58751fd832e4cc7ebc9bfd4e3193e06f86246f6b/shim.sock" debu>
Oct 29 08:56:05 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:05Z" level=debug msg="registering ttrpc server"
Oct 29 08:56:05 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:05Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:05.678641 1490 pod_container_deletor.go:75] Container "2b69cdef7b72a0d3ca52cead58751fd832e4cc7ebc9bfd4e3193e06f86246f6b" not found in pod's containers
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-udevd[2555]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: IPv6: ADDRCONF(NETDEV_UP): azure0: link is not ready
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Lost carrier
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCP lease lost
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot configure proxy NDP for interface: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot configure IPv6 privacy extension for interface: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot disable kernel IPv6 accept_ra for interface: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: docker0: Link is not managed by us
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot enable IPv6 for interface azure0: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained carrier
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Link readded
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCPv4 address 10.3.6.4/23 via 10.3.6.1
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] &{IsNewInstance:false CniSucceeded:false Name:CNI OSVersion:v1.0.12 ErrorMessage:Failed to create network: exit status 255 Context:AzureCNI SubContex>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] SetReportState succeeded
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.064461 1490 cni.go:260] Error adding network: Failed to create network: exit status 255
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.064500 1490 cni.go:228] Error while adding to cni network: Failed to create network: exit status 255
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] SetReportState succeeded
Oct 29 08:56:06 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:06Z" level=info msg="shim reaped" id=2b69cdef7b72a0d3ca52cead58751fd832e4cc7ebc9bfd4e3193e06f86246f6b
Oct 29 08:56:06 jluzny-eastus2-coremaster1 env[891]: time="2018-10-29T08:56:06.304017369Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517197 1490 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "2b69cdef7b72a0d3c>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517244 1490 kuberuntime_sandbox.go:56] CreatePodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc error>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517278 1490 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc erro>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517549 1490 pod_workers.go:186] Error syncing pod 75280c11-db58-11e8-a1bb-000d3a0e231a ("kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:06.983542 1490 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "2b69cdef7b72a0d3ca52cead58751fd8>
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] SetReportState succeeded
Oct 29 08:56:07 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:07Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/52917de9105251c40619b8e9960f247e9e7aa27c23df37a6bd648347cc12bc20/shim.sock" debu>
Oct 29 08:56:07 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:07Z" level=debug msg="registering ttrpc server"
Oct 29 08:56:07 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:07Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained IPv6LL
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-udevd[2811]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Lost carrier
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCP lease lost
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Error 255 seems to be referenced from cni.go:260 and cni.go:228
@xelor81 Can you give access to VM if I share my public key?
Yes, I think I can do that. Please send me a private message and we can arrange a time for you to investigate, but only for a very limited time.
Can anyone support me with this matter? Is there anything I am missing here? The issue looks closely related to the cni.go code at lines 260 and 228, but the only cni.go file in this project has only about 32 lines. I am really stuck here.
@xelor81 I already shared a public key with one of the support guys. He said he would contact you regarding this, but I didn't hear back from him either.
@tamilmani1989 I have added your key to the host's authorized keys. Please use the public IP delivered through the support person to log in to the host. Basically, all details regarding the Kubernetes config are stored in the cloud-init file in /var/lib/waagent/CustomData.
@xelor81 I got access to the VM. I'm still not sure about the reason for the failure. Can I get access to the master VM, so that I can create a pod and check what is happening on the agent node?
This is the master VM. You can see that kube-apiserver, kube-scheduler, and kube-controller-manager are running there.
If you want, I can spawn a worker node and connect it to this master, but with the same kubelet settings I don't see how it would behave differently than this master.
In principle I will be running a few pods on the master too, as part of a DaemonSet or Deployment designed to help manage the cluster.
Any idea what error 255 reported by plugin means?
I also did some additional testing previously, and this very same error shows up when I use Ubuntu 16.04 LTS (the same version as used in the AKS service) instead of CoreOS.
I'm getting this error when I run this command. Did you hit the same error?
avid-user@jluzny-eastus2-coremaster1 ~ $ kubectl get pods
-bash: kubectl: command not found
That's because kubectl is not installed there. What I do is create an SSH tunnel to this host and configure port forwarding to the kube-apiserver port. Then, from my local box where I have kubectl installed and a proper kubeconfig in place, I am able to manage this cluster.
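The tunnel workflow described above can be sketched like this. The IP, user, and port values are placeholders (assumptions), not values from the thread, and the ssh/kubectl invocations are commented out since they need the live host:

```shell
# Placeholders, not real values from this issue:
MASTER_IP=203.0.113.10   # public IP of the master VM
LOCAL_PORT=6443          # kube-apiserver default secure port

# Forward local 6443 to the apiserver on the master, in the background:
# ssh -f -N -L "${LOCAL_PORT}:localhost:6443" "core@${MASTER_IP}"

# Then, with a local kubeconfig whose server entry points at the tunnel:
# kubectl --server "https://localhost:${LOCAL_PORT}" get pods

echo "tunnel would forward localhost:${LOCAL_PORT} -> ${MASTER_IP}:6443"
```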
For troubleshooting purposes I have now installed kubectl in /opt/bin and included it in the PATH.
@xelor81
kubectl get pods The connection to the server localhost:8080 was refused - did you specify the right host or port?
Is it kubectl -s [address]:[port] get pods? Let me know the address and port.
@tamilmani1989
Most likely you haven't been using the root account. Please sudo to root; kubectl works for me on that account. Also, if you read the very first post in this issue, you can find that the Kubernetes config is available in /var/lib/kubelet/kubeconfig. You can simply copy its content to the ~/.kube/config file of the account you are using (this is already done for the root account). From the config you can read that the kube-apiserver is running on port 6443 (the default).
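The apiserver endpoint can be read straight out of that kubeconfig. A minimal sample config is inlined below as an assumption (the server address is a placeholder); on the host you would run the same awk against /var/lib/kubelet/kubeconfig:

```shell
# Hypothetical kubeconfig fragment standing in for /var/lib/kubelet/kubeconfig
cat > /tmp/kubeconfig-sample <<'EOF'
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://10.3.6.4:6443
  name: default
EOF
# Extract the apiserver URL (second field of the "server:" line):
awk '/server:/ {print $2}' /tmp/kubeconfig-sample
```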
As for ebtables on the agent nodes: this master is also an agent node, and ebtables is available. As you may read in my previous message, the only difference between the master and the agent nodes is that the master runs more components, like kube-apiserver, kube-scheduler, and kube-controller-manager. You may observe that kubelet is present on this master, making it also an agent/worker node.
@xelor81 My apologies for delayed response. I got pulled into some other work.
I tried creating pods in your setup and I'm able to; I didn't hit any issues. I deployed nginx pods and they are in Running state. Let me know in which scenario you hit the issue.
avid-user@jluzny-eastus2-coremaster1 ~ $ sudo kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP          NODE                         NOMINATED NODE
my-nginx-59497d7745-4b5hh   1/1     Running   0          12s   10.3.6.12   jluzny-eastus2-coremaster1
my-nginx-59497d7745-92cbl   1/1     Running   0          12s   10.3.6.10   jluzny-eastus2-coremaster1
Please ignore the issue being closed; that was a misclick.
@tamilmani1989 Indeed, the pods seem to be working now. I will run some additional tests to see whether this is fixed: I will deploy another kubernetes master and a few nodes with the same settings and get back to you.
BTW it is very weird that the pods started to work now when nothing was changed on the VM. Any idea what error 255, the one referenced in azure-vnet.log, means?
I would appreciate any insight about this for future reference. Because of the security hole found in k8s lately, I will now deploy k8s with the azure-vnet plugin and the following config:
k8s = 1.11.5
azure cni plugin = 1.0.14
cni plugin = 0.7.4
Should I be aware of any blocking points with this config? I was not able to find any issues in the release notes of those components.
BTW it is very weird that the pods started to work now when nothing was changed on the VM. Any idea what error 255, the one referenced in azure-vnet.log, means?
I looked at the logs, and it seems to me that ebtables was not installed at that point.
Should I be aware of any blocking points with this config? I was not able to find any issues in the release notes of those components.
Nothing, as far as I know.
Weird: if ebtables was not installed, why did it respond to the ebtables -t nat -L command? Does the plugin use the full path to ebtables, or must the ebtables command be included in the PATH environment variable?
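One way to check how the binary resolves on a given node (a generic sketch; whether the plugin calls ebtables by absolute path or via PATH is exactly the open question here):

```shell
# See whether ebtables is reachable through PATH, and from where.
if command -v ebtables >/dev/null 2>&1; then
    echo "ebtables resolves to: $(command -v ebtables)"
else
    echo "ebtables is not on PATH ($PATH)"
fi

# Common install locations on CoreOS/Ubuntu to check by hand:
for p in /sbin/ebtables /usr/sbin/ebtables /usr/local/sbin/ebtables; do
    if [ -x "$p" ]; then
        echo "executable present: $p"
    fi
done
```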
@xelor81 did you run this command before the issue happened or after? Because I'm not sure it was installed before the pods were created.
ebtables is delivered/installed on CoreOS by default.
Oh, ok. @xelor81 is there any update? If the issue is not occurring, can we close this?
I am currently on sick leave. I ran an initial test on the latest 1.11.5 and the latest CoreOS image available from the azure cloud. Exit 255 still occurs with the same CNI plugin and azure-vnet plugin as before. When I used
the 0.7.4 CNI plugin and the 1.0.14 azure-vnet plugin,
I got a report from the azure-vnet plugin that an IP address cannot be delivered because I am out of my IP pool. The exact log I can present later this week, as soon as I feel better.
As this is a new VM, I need to prepare access for you, I presume?
OK, I got this thing to work. Instead of running the kubelet within a container, I run it as a binary on CoreOS. Doing this I am able to spawn pods with the VNET IP range, and a pod can communicate with all hosts in the VNET. Closing the issue.
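For anyone hitting the same thing, a minimal sketch of a systemd unit for running kubelet as a plain host binary (the paths are the ones mentioned in this thread, the flags and CNI dirs are common defaults assumed for illustration, not the exact unit used):

```shell
# Write a hypothetical kubelet unit to a temp location for inspection.
# /opt/bin and /var/lib/kubelet/kubeconfig are paths from this thread;
# the CNI dirs are the common defaults.
cat <<'EOF' > /tmp/kubelet.service
[Unit]
Description=Kubernetes kubelet (running as a host binary, not in a container)
After=network-online.target

[Service]
ExecStart=/opt/bin/kubelet \
  --kubeconfig=/var/lib/kubelet/kubeconfig \
  --network-plugin=cni \
  --cni-bin-dir=/opt/cni/bin \
  --cni-conf-dir=/etc/cni/net.d
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

# On the node you would then:
#   cp /tmp/kubelet.service /etc/systemd/system/
#   systemctl daemon-reload && systemctl enable --now kubelet
echo "unit written to /tmp/kubelet.service"
```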
Is this a request for help?: Yes
Is this an ISSUE or FEATURE REQUEST? (choose one): Issue
Which release version?: 1.0.12
Which component (CNI/IPAM/CNM/CNS): CNI
Which Operating System (Linux/Windows): Linux
For Linux: Include Distro and kernel version using "uname -a"
Which Orchestrator and version (e.g. Kubernetes, Docker) Kubernetes 1.11.3
What happened: I deployed a single machine with CoreOS using terraform in the azure cloud. Using cloud-init I provision systemd units and yaml files (with all necessary components like certs) to start kubernetes with the following components configured as systemd services:
install cni
etcd 3.1.17
kube-apiserver 1.11.3
kube-scheduler 1.11.3
kube-controller-manager 1.11.3
kubelet 1.11.3
As part of the deployment, the CNI plugins are fetched along with azure-vnet.
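A quick sanity check worth doing after such a deployment: verify the plugin binaries and the network conf actually landed where the kubelet looks for them. The paths below are the common defaults and the conflist file name is an assumption; adjust them to your --cni-bin-dir/--cni-conf-dir flags.

```shell
# Check the default CNI locations for the azure-vnet pieces.
missing=0
for f in /opt/cni/bin/azure-vnet \
         /opt/cni/bin/azure-vnet-ipam \
         /etc/cni/net.d/10-azure.conflist; do
    if [ -e "$f" ]; then
        echo "ok:      $f"
    else
        echo "missing: $f"
        missing=$((missing + 1))
    fi
done
echo "missing files: $missing"
```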
Kubeconfig looks like:
When I try to spawn a pod (a simple centos:7 one), kubernetes reports that it failed to create the sandbox container:
Going through azure-vnet.log and azure-vnet-ipam.log I observe these:
azure-vnet.log
azure-vnet-ipam.log
Iptables rules are as follows:
Chain INPUT (policy ACCEPT 5 packets, 256 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 6 packets, 360 bytes)
 pkts bytes target     prot opt in     out     source               destination
   58  3480 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 6 packets, 360 bytes)
 pkts bytes target     prot opt in     out     source               destination
 7413  446K KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  docker0 *      0.0.0.0/0            0.0.0.0/0

Chain KUBE-MARK-DROP (0 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x8000

Chain KUBE-MARK-MASQ (0 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination