Azure / azure-container-networking

Azure Container Networking Solutions for Linux and Windows Containers
MIT License

[SOLVED] plugin failed to create network: Exit status 255 #264

Closed: xelor81 closed this issue 5 years ago

xelor81 commented 5 years ago

Is this a request for help?: Yes


Is this an ISSUE or FEATURE REQUEST? (choose one): Issue


Which release version?: 1.0.12


Which component (CNI/IPAM/CNM/CNS): CNI


Which Operating System (Linux/Windows): Linux


For Linux: Include Distro and kernel version using "uname -a"

DISTRIB_ID="Container Linux by CoreOS"
DISTRIB_RELEASE=1855.4.0
DISTRIB_CODENAME="Rhyolite"
DISTRIB_DESCRIPTION="Container Linux by CoreOS 1855.4.0 (Rhyolite)"
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1855.4.0
VERSION_ID=1855.4.0
BUILD_ID=2018-09-11-0003
PRETTY_NAME="Container Linux by CoreOS 1855.4.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes 1.11.3


What happened: I deployed a single CoreOS machine in the Azure cloud using Terraform. Using cloud-init I provision systemd unit files and YAML files (with all necessary components such as certs) to start Kubernetes, with the following components configured as systemd services:

Kubeconfig looks like:

apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: <snipped_https>:6443
  name: local
- context:
    cluster: local
    user: admin
  name: admin
current-context: admin
kind: Config
preferences: {}
users:
- name: admin
  user:
    password: <snip>
    username: <snip>

When I try to spawn a pod (a simple centos:7), Kubernetes reports that it failed to create the sandbox container:

:~$ kubectl -n kube-system describe pod/kube-dns-v20-596c55dc95-cc2p6
Name:           kube-dns-v20-596c55dc95-cc2p6
Namespace:      kube-system
Node:           jluzny-eastus2-coremaster1/10.3.6.4
Start Time:     Wed, 24 Oct 2018 11:04:05 +0200
Labels:         k8s-app=kube-dns
                kubernetes.io/cluster-service=true
                pod-template-hash=1527118751
                version=v20
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  ReplicaSet/kube-dns-v20-596c55dc95
Containers:
  kubedns:
    Container ID:  
    Image:         k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.10
    Image ID:      
    Ports:         10053/UDP, 10053/TCP
    Host Ports:    0/UDP, 0/TCP
    Args:
      --kubecfg-file=/var/lib/kubelet/kubeconfig
      --domain=cluster.local.
      --dns-port=10053
      --v=2
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/healthz-kubedns delay=60s timeout=1s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8081/readiness delay=30s timeout=5s period=10s #success=1 #failure=5
    Environment:  <none>
    Mounts:
      /var/lib/kubelet from kube-dns-kubelet (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-z7q42 (ro)
  dnsmasq:
    Container ID:  
    Image:         k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.10
    Image ID:      
    Ports:         53/UDP, 53/TCP
    Host Ports:    0/UDP, 0/TCP
    Args:
      -v=2
      -logtostderr
      -restartDnsmasq=true
      --
      -k
      --cache-size=1000
      --no-resolv
      --server=127.0.0.1#10053
      --server=/in-addr.arpa/127.0.0.1#10053
      --server=/ip6.arpa/127.0.0.1#10053
      --log-facility=-
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:8080/healthz-dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-z7q42 (ro)
  healthz:
    Container ID:  
    Image:         k8s.gcr.io/exechealthz-amd64:1.2
    Image ID:      
    Port:          8080/TCP
    Host Port:     0/TCP
    Args:
      --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
      --url=/healthz-dnsmasq
      --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
      --url=/healthz-kubedns
      --port=8080
      --quiet
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  50Mi
    Requests:
      cpu:        10m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-z7q42 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-dns-kubelet:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet
    HostPathType:  
  kube-dns-token-z7q42:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kube-dns-token-z7q42
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly
Events:
  Type     Reason                  Age                 From                                 Message
  ----     ------                  ----                ----                                 -------
  Normal   Scheduled               18m                 default-scheduler                    Successfully assigned kube-system/kube-dns-v20-596c55dc95-cc2p6 to jluzny-eastus2-coremaster1
  Warning  FailedCreatePodSandBox  17m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7dfc05ac05a8cc9630c5b8e5c0281fb4217d108688722fc9c75587ff05dadfa1" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  17m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "498844a3f8518788b194682712ae6433ebb432c13e81ab47bf35ade23ff4417d" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  17m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "99dea5511a6c6f3abdd4fd8d29e0994323aabae5386097735ad656e3513f60ca" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  17m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "b3cf12da07669e0b20bf9d7047e4a12e2de9fad37278148c98b67f17b8a3fb38" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  17m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8942e2677a0faddd91818189ac96bd0f4a4a441a3a1b3f723744ac3e415cce1a" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  17m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "01eddffffa9ae1fdd04ee3cdc47c9162f106aef58dc079a8512f8c9ca13bd8ef" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  17m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "24c880e4045cd0e7961cfa994badde7812b3a9097d8eb480f796eefe199b1d66" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  16m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7d2acd1c228b84fb5892779e36ea6b97e0e50c8814c5550c94568e4f9403ec33" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Warning  FailedCreatePodSandBox  16m                 kubelet, jluzny-eastus2-coremaster1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8c7e788fbd348f807dbf19693c8b23161275abe52de457955f74352dc8ff090e" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255
  Normal   SandboxChanged          16m (x12 over 17m)  kubelet, jluzny-eastus2-coremaster1  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  2m (x99 over 16m)   kubelet, jluzny-eastus2-coremaster1  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "867ea710e7493de0d507ca12d30952692a61b502ade3498c90842a1e818a7a15" network for pod "kube-dns-v20-596c55dc95-cc2p6": NetworkPlugin cni failed to set up pod "kube-dns-v20-596c55dc95-cc2p6_kube-system" network: Failed to create network: exit status 255

Looking at azure-vnet.log and azure-vnet-ipam.log, I observe the following:

azure-vnet.log

2018/10/24 09:12:47 [cni-net] Plugin azure-vnet version v1.0.12.
2018/10/24 09:12:47 [cni-net] Running on Linux version 4.14.67-coreos (jenkins@ip-10-7-32-103) (gcc version 7.3.0 (Gentoo Hardened 7.3.0 p1.0)) #1 SMP Mon Sep 10 23:14:26 UTC 2018
2018/10/24 09:12:47 [net] Network interface: {Index:1 MTU:65536 Name:lo HardwareAddr: Flags:up|loopback} with IP addresses: [127.0.0.1/8 ::1/128]
2018/10/24 09:12:47 [net] Network interface: {Index:2 MTU:1500 Name:eth0 HardwareAddr:00:0d:3a:0e:23:1a Flags:up|broadcast} with IP addresses: [10.3.6.4/23 fe80::20d:3aff:fe0e:231a/64]
2018/10/24 09:12:47 [net] Network interface: {Index:3 MTU:1500 Name:docker0 HardwareAddr:02:42:58:86:db:7f Flags:up|broadcast|multicast} with IP addresses: [172.17.0.1/16]
2018/10/24 09:12:47 [net] reboot time 2018-10-24 09:00:20 +0000 UTC store mod time 2018-10-24 09:12:44.74818463 +0000 UTC
2018/10/24 09:12:47 [net] Restored state, &{Version:v1.0.12 TimeStamp:2018-10-24 09:12:44.748485534 +0000 UTC ExternalInterfaces:map[eth0:0xc420142a80] store:0xc420122330 Mutex:{state:0 sema:0}}
2018/10/24 09:12:47 External Interface &{eth0 map[] [10.3.6.0/23]  00:0d:3a:0e:23:1a [] [] 0.0.0.0 ::}
2018/10/24 09:12:47 [cni-net] Plugin started.
2018/10/24 09:12:47 [cni-net] Processing ADD command with args {ContainerID:3460e2eb338d7d43c4dac2b410424f108431f0037ed6df4cce9c46ef2d527173 Netns:/proc/54317/ns/net IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=kube-system;K8S_POD_NAME=kube-dns-v20-596c55dc95-cc2p6;K8S_POD_INFRA_CONTAINER_ID=3460e2eb338d7d43c4dac2b410424f108431f0037ed6df4cce9c46ef2d527173 Path:/opt/cni/bin}.
2018/10/24 09:12:47 [cni-net] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: InfraVnetAddressSpace: PodNamespaceForDualNetwork:[] MultiTenancy:false EnableSnatOnHost:false EnableExactMatchForPodName:false Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet: Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/10/24 09:12:47 Result from multitenancy <nil>
2018/10/24 09:12:47 [cni-net] Creating network azure.
2018/10/24 09:12:47 [cni] Calling plugin azure-vnet-ipam ADD nwCfg:&{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: InfraVnetAddressSpace: PodNamespaceForDualNetwork:[] MultiTenancy:false EnableSnatOnHost:false EnableExactMatchForPodName:false Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet: Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/10/24 09:12:47 [cni] Plugin azure-vnet-ipam returned result:IP:[{Version:4 Interface:<nil> Address:{IP:10.3.6.8 Mask:fffffe00} Gateway:10.3.6.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.3.6.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]}, err:<nil>.
2018/10/24 09:12:47 [cni-net] Found master interface eth0.
2018/10/24 09:12:47 [net] Save succeeded.
2018/10/24 09:12:47 [net] Creating network &{Id:azure Mode:bridge Subnets:[{Family:2 Prefix:{IP:10.3.6.0 Mask:fffffe00} Gateway:10.3.6.1}] DNS:{Suffix:kube-system. Servers:[]} Policies:[] BridgeName:azure0 EnableSnatOnHost:false Options:map[]}.
2018/10/24 09:12:47 opt map[] options map[]
2018/10/24 09:12:47 create bridge
2018/10/24 09:12:47 [net] Connecting interface eth0.
2018/10/24 09:12:47 [net] Creating bridge azure0.
2018/10/24 09:12:47 [net] Deleting IP address 10.3.6.4/23 from interface eth0.
2018/10/24 09:12:47 [net] Saved interface IP configuration &{Name:eth0 Networks:map[] Subnets:[10.3.6.0/23] BridgeName: MacAddress:00:0d:3a:0e:23:1a IPAddresses:[10.3.6.4/23] Routes:[0xc4201da3f0] IPv4Gateway:10.3.6.1 IPv6Gateway:::}.
2018/10/24 09:12:47 [net] Setting link eth0 state down.
2018/10/24 09:12:47 [net] Setting link eth0 master azure0.
2018/10/24 09:12:47 [net] Setting link eth0 state up.
2018/10/24 09:12:47 [net] Setting link azure0 state up.
2018/10/24 09:12:47 [net] Adding SNAT rule for egress traffic on eth0.
2018/10/24 09:12:47 [net] Connecting interface eth0 completed with err:exit status 255.
2018/10/24 09:12:47 [net] Failed to create network azure, err:exit status 255.
2018/10/24 09:12:47 [azure-vnet] Failed to create network: exit status 255.
2018/10/24 09:12:47 [cni] Calling plugin azure-vnet-ipam DEL nwCfg:&{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: InfraVnetAddressSpace: PodNamespaceForDualNetwork:[] MultiTenancy:false EnableSnatOnHost:false EnableExactMatchForPodName:false Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.3.6.0/23 Address:10.3.6.8 QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/10/24 09:12:47 [cni] Plugin azure-vnet-ipam returned err:<nil>.
2018/10/24 09:12:47 [cni] Calling plugin azure-vnet-ipam DEL nwCfg:&{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: InfraVnetAddressSpace: PodNamespaceForDualNetwork:[] MultiTenancy:false EnableSnatOnHost:false EnableExactMatchForPodName:false Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.3.6.0/23 Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/10/24 09:12:47 [cni] Plugin azure-vnet-ipam returned err:<nil>.
2018/10/24 09:12:47 [cni-net] ADD command completed with result:Interfaces:[{Name:eth0 Mac: Sandbox:} {Name:eth0 Mac: Sandbox:}], IP:[{Version:4 Interface:<nil> Address:{IP:10.3.6.8 Mask:fffffe00} Gateway:10.3.6.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.3.6.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:Failed to create network: exit status 255.
2018/10/24 09:12:47 Failed to execute network plugin, err:Failed to create network: exit status 255.
2018/10/24 09:12:47 Report plugin error
2018/10/24 09:12:47 [cni-net] Plugin stopped.

azure-vnet-ipam.log

2018/10/24 09:20:13 [cni-ipam] Plugin azure-vnet-ipam version v1.0.12.
2018/10/24 09:20:13 [cni-ipam] Running on Linux version 4.14.67-coreos (jenkins@ip-10-7-32-103) (gcc version 7.3.0 (Gentoo Hardened 7.3.0 p1.0)) #1 SMP Mon Sep 10 23:14:26 UTC 2018
2018/10/24 09:20:13 [ipam] reboot time 2018-10-24 09:00:20 +0000 UTC store mod time 2018-10-24 09:20:11.528772448 +0000 UTC
2018/10/24 09:20:13 [ipam] Restored state, &{Version:v1.0.12 TimeStamp:2018-10-24 09:20:11.530211255 +0000 UTC AddrSpaces:map[local:0xc420089140] store:0xc420088e10 source:<nil> netApi:<nil> Mutex:{state:0 sema:0}}
2018/10/24 09:20:13 [cni-ipam] Plugin started.
2018/10/24 09:20:13 [cni-ipam] Processing ADD command with args {ContainerID:31427e6bc8e4063070bdc70d5547cea8a5422a7790f97cc1f6849d825054396a Netns:/proc/86353/ns/net IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=kube-system;K8S_POD_NAME=kube-dns-v20-596c55dc95-cc2p6;K8S_POD_INFRA_CONTAINER_ID=31427e6bc8e4063070bdc70d5547cea8a5422a7790f97cc1f6849d825054396a Path:/opt/cni/bin}.
2018/10/24 09:20:13 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: InfraVnetAddressSpace: PodNamespaceForDualNetwork:[] MultiTenancy:false EnableSnatOnHost:false EnableExactMatchForPodName:false Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet: Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/10/24 09:20:13 [ipam] Starting source azure.
2018/10/24 09:20:13 [ipam] Refreshing address source.
2018/10/24 09:20:13 [ipam] Save succeeded.
2018/10/24 09:20:13 [ipam] Requesting pool with poolId: options:map[azure.interface.name:] v6:false.
2018/10/24 09:20:13 [ipam] Checking pool 10.3.6.0/23.
2018/10/24 09:20:13 [ipam] Pool 10.3.6.0/23 matches requirements.
2018/10/24 09:20:13 [ipam] Pool request completed with pool:&{as:0xc420089140 Id:10.3.6.0/23 IfName:eth0 Subnet:{IP:10.3.6.0 Mask:fffffe00} Gateway:10.3.6.1 Addresses:map[10.3.6.6:0xc4200984c0 10.3.6.8:0xc420098540 10.3.6.12:0xc4200983c0 10.3.6.13:0xc420098400 10.3.6.14:0xc420098440 10.3.6.5:0xc420098480 10.3.6.10:0xc420098340 10.3.6.11:0xc420098380 10.3.6.7:0xc420098500 10.3.6.9:0xc420098580] addrsByID:map[] IsIPv6:false Priority:0 RefCount:1 epoch:1} err:<nil>.
2018/10/24 09:20:13 [ipam] Save succeeded.
2018/10/24 09:20:13 [cni-ipam] Allocated address poolID 10.3.6.0/23 with subnet 10.3.6.0/23.
2018/10/24 09:20:13 [ipam] Refreshing address source.
2018/10/24 09:20:13 [ipam] Requesting address with address: options:map[].
2018/10/24 09:20:13 [ipam] Address request completed with address:10.3.6.7/23 err:<nil>.
2018/10/24 09:20:13 [ipam] Save succeeded.
2018/10/24 09:20:13 [cni-ipam] Allocated address 10.3.6.7/23.
2018/10/24 09:20:13 [cni-ipam] ADD command completed with result:IP:[{Version:4 Interface:<nil> Address:{IP:10.3.6.7 Mask:fffffe00} Gateway:10.3.6.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.3.6.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:<nil>.
2018/10/24 09:20:13 [cni-ipam] Plugin stopped.
2018/10/24 09:20:13 [cni-ipam] Plugin azure-vnet-ipam version v1.0.12.
2018/10/24 09:20:13 [cni-ipam] Running on Linux version 4.14.67-coreos (jenkins@ip-10-7-32-103) (gcc version 7.3.0 (Gentoo Hardened 7.3.0 p1.0)) #1 SMP Mon Sep 10 23:14:26 UTC 2018
2018/10/24 09:20:13 [ipam] reboot time 2018-10-24 09:00:20 +0000 UTC store mod time 2018-10-24 09:20:13.799784871 +0000 UTC
2018/10/24 09:20:13 [ipam] Restored state, &{Version:v1.0.12 TimeStamp:2018-10-24 09:20:13.801290879 +0000 UTC AddrSpaces:map[local:0xc420089140] store:0xc420088e10 source:<nil> netApi:<nil> Mutex:{state:0 sema:0}}
2018/10/24 09:20:13 [cni-ipam] Plugin started.
2018/10/24 09:20:13 [cni-ipam] Processing DEL command with args {ContainerID:31427e6bc8e4063070bdc70d5547cea8a5422a7790f97cc1f6849d825054396a Netns:/proc/86353/ns/net IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=kube-system;K8S_POD_NAME=kube-dns-v20-596c55dc95-cc2p6;K8S_POD_INFRA_CONTAINER_ID=31427e6bc8e4063070bdc70d5547cea8a5422a7790f97cc1f6849d825054396a Path:/opt/cni/bin}.
2018/10/24 09:20:13 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: InfraVnetAddressSpace: PodNamespaceForDualNetwork:[] MultiTenancy:false EnableSnatOnHost:false EnableExactMatchForPodName:false Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.3.6.0/23 Address:10.3.6.7 QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/10/24 09:20:13 [ipam] Starting source azure.
2018/10/24 09:20:13 [ipam] Refreshing address source.
2018/10/24 09:20:13 [ipam] Source refresh failed, err:Get http://169.254.169.254/machine/plugins?comp=nmagent&type=getinterfaceinfov1: dial tcp 169.254.169.254:80: connect: network is unreachable.
2018/10/24 09:20:13 [ipam] Releasing address with address:10.3.6.7 options:map[].
2018/10/24 09:20:13 [ipam] Address release completed with address:10.3.6.7 err:<nil>.
2018/10/24 09:20:13 [ipam] Save succeeded.
2018/10/24 09:20:13 [cni-ipam] DEL command completed with err:<nil>.
2018/10/24 09:20:13 [cni-ipam] Plugin stopped.
2018/10/24 09:20:13 [cni-ipam] Plugin azure-vnet-ipam version v1.0.12.
2018/10/24 09:20:13 [cni-ipam] Running on Linux version 4.14.67-coreos (jenkins@ip-10-7-32-103) (gcc version 7.3.0 (Gentoo Hardened 7.3.0 p1.0)) #1 SMP Mon Sep 10 23:14:26 UTC 2018
2018/10/24 09:20:13 [ipam] reboot time 2018-10-24 09:00:20 +0000 UTC store mod time 2018-10-24 09:20:13.870785259 +0000 UTC
2018/10/24 09:20:13 [ipam] Restored state, &{Version:v1.0.12 TimeStamp:2018-10-24 09:20:13.871946765 +0000 UTC AddrSpaces:map[local:0xc420085140] store:0xc420084e10 source:<nil> netApi:<nil> Mutex:{state:0 sema:0}}
2018/10/24 09:20:13 [cni-ipam] Plugin started.
2018/10/24 09:20:13 [cni-ipam] Processing DEL command with args {ContainerID:31427e6bc8e4063070bdc70d5547cea8a5422a7790f97cc1f6849d825054396a Netns:/proc/86353/ns/net IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=kube-system;K8S_POD_NAME=kube-dns-v20-596c55dc95-cc2p6;K8S_POD_INFRA_CONTAINER_ID=31427e6bc8e4063070bdc70d5547cea8a5422a7790f97cc1f6849d825054396a Path:/opt/cni/bin}.
2018/10/24 09:20:13 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: InfraVnetAddressSpace: PodNamespaceForDualNetwork:[] MultiTenancy:false EnableSnatOnHost:false EnableExactMatchForPodName:false Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.3.6.0/23 Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/10/24 09:20:13 [ipam] Starting source azure.
2018/10/24 09:20:13 [ipam] Refreshing address source.
2018/10/24 09:20:13 [ipam] Source refresh failed, err:Get http://169.254.169.254/machine/plugins?comp=nmagent&type=getinterfaceinfov1: dial tcp 169.254.169.254:80: connect: network is unreachable.
2018/10/24 09:20:13 [ipam] Releasing pool with poolId:10.3.6.0/23.
2018/10/24 09:20:13 [ipam] Save succeeded.
2018/10/24 09:20:13 [cni-ipam] DEL command completed with err:<nil>.
2018/10/24 09:20:13 [cni-ipam] Plugin stopped.
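
For reference, since the plugin restores its state from disk on every invocation, it can also help to inspect that state and the half-created bridge on the failing node. This is only a sketch; the state file paths are assumptions based on the plugin's defaults:

ip addr show eth0
ip link show azure0
# the paths below are assumed default locations for the azure-vnet / azure-vnet-ipam state files
cat /var/run/azure-vnet.json
cat /var/run/azure-vnet-ipam.json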

Iptables rules are as follows:

Chain INPUT (policy ACCEPT 209K packets, 50M bytes)
 pkts bytes target     prot opt in     out     source               destination         
 209K   50M KUBE-FIREWALL  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 221K packets, 49M bytes)
 pkts bytes target     prot opt in     out     source               destination         
 223K   49M KUBE-FIREWALL  all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain KUBE-FIREWALL (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000

- NAT table

Chain INPUT (policy ACCEPT 5 packets, 256 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 6 packets, 360 bytes)
 pkts bytes target     prot opt in     out     source               destination
   58  3480 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 6 packets, 360 bytes)
 pkts bytes target     prot opt in     out     source               destination
 7413  446K KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  docker0 *      0.0.0.0/0            0.0.0.0/0

Chain KUBE-MARK-DROP (0 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x8000

Chain KUBE-MARK-MASQ (0 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000


- MANGLE table

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
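
For reference, the tables above can be re-collected on the node with the standard iptables listing commands, for example:

iptables -L -v -n
iptables -t nat -L -v -n
iptables -t mangle -L -v -n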



---

**What you expected to happen**:

Pods are created with an IP from the azure-vnet subnet (CIDR 10.3.6.0/23).

---

**How to reproduce it** (as minimally and precisely as possible):

Deploy a CoreOS machine in the Azure cloud in a VNet with subnet 10.3.6.0/23.
Use the systemd units provided above and make the necessary updates to the kubeconfig to match your environment.

---

**Anything else we need to know**:

---
sharmasushant commented 5 years ago

@xelor81 Can you please check whether you have ebtables installed on the VM? I am assuming this is a VM running in Azure.
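
For example, a quick check for its presence and for any existing rules could be:

which ebtables
ebtables -V
ebtables -t nat -L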

xelor81 commented 5 years ago

@sharmasushant I have checked and ebtables is present, but there are no rules defined (filter table):

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 0, policy: ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT


- ebtables NAT table

jluzny-eastus2-coremaster1 ~ # ebtables -t nat -L
Bridge table: nat

Bridge chain: PREROUTING, entries: 0, policy: ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

Bridge chain: POSTROUTING, entries: 0, policy: ACCEPT


Yes, the VM is located in the Azure cloud. CoreOS is taken from the available Azure image. Image details:

image_publisher = "CoreOS"
image_offer     = "CoreOS"
image_sku       = "Stable"
image_version   = "latest"


A word of comment: I also installed a bare Ubuntu 16.04 (using Resource Manager in Azure with default settings) and then installed Docker on it following this guide: https://docs.docker.com/cs-engine/1.13/
Later I created the systemd units and all related components, and it resulted in the same error 255.

I also poked around on an AKS node for comparison. On the AKS node (based on Ubuntu 16.04), ebtables is also present and rules are defined in the NAT table:

root@aks-es-85070152-0:~# ebtables -t nat -L
Bridge table: nat

Bridge chain: PREROUTING, entries: 15, policy: ACCEPT
-p ARP -i eth0 --arp-op Reply -j dnat --to-dst ff:ff:ff:ff:ff:ff --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.50 -j arpreply --arpreply-mac 62:fc:29:12:f5:94
-p IPv4 -i eth0 --ip-dst 10.3.6.50 -j dnat --to-dst 62:fc:29:12:f5:94 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.53 -j arpreply --arpreply-mac 9a:c2:9e:dd:94:f7
-p IPv4 -i eth0 --ip-dst 10.3.6.53 -j dnat --to-dst 9a:c2:9e:dd:94:f7 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.36 -j arpreply --arpreply-mac 36:20:3f:33:ab:45
-p IPv4 -i eth0 --ip-dst 10.3.6.36 -j dnat --to-dst 36:20:3f:33:ab:45 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.41 -j arpreply --arpreply-mac 7a:3e:61:c5:c2:13
-p IPv4 -i eth0 --ip-dst 10.3.6.41 -j dnat --to-dst 7a:3e:61:c5:c2:13 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.51 -j arpreply --arpreply-mac 36:9c:5c:d2:93:a7
-p IPv4 -i eth0 --ip-dst 10.3.6.51 -j dnat --to-dst 36:9c:5c:d2:93:a7 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.37 -j arpreply --arpreply-mac fa:b4:3a:26:25:96
-p IPv4 -i eth0 --ip-dst 10.3.6.37 -j dnat --to-dst fa:b4:3a:26:25:96 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.3.6.40 -j arpreply --arpreply-mac 76:2e:aa:8e:df:64
-p IPv4 -i eth0 --ip-dst 10.3.6.40 -j dnat --to-dst 76:2e:aa:8e:df:64 --dnat-target ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
-s Unicast -o eth0 -j snat --to-src 0:d:3a:3:4d:4a --snat-arp --snat-target ACCEPT
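
For comparison with the failing node: azure-vnet.log stops right at "Adding SNAT rule for egress traffic on eth0", which appears to be an ebtables operation (the POSTROUTING rule on the AKS node above looks like exactly that rule). One way to test whether ebtables itself can add such a rule on the failing host is to try it by hand. This is only a sketch; the MAC is taken from the eth0 entry in the logs above, so adjust it to your host:

# add a rule equivalent to the one the plugin tries to create, then check the exit code
ebtables -t nat -A POSTROUTING -s Unicast -o eth0 -j snat --to-src 00:0d:3a:0e:23:1a --snat-arp --snat-target ACCEPT
echo $?
ebtables -t nat -L
# clean up the test rule afterwards
ebtables -t nat -D POSTROUTING -s Unicast -o eth0 -j snat --to-src 00:0d:3a:0e:23:1a --snat-arp --snat-target ACCEPT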

xelor81 commented 5 years ago

I have also found some details in journalctl indicating that a persistent MAC address cannot be generated for the azure0 interface.

Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448798    1478 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "d8bcef579082154f8>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448842    1478 kuberuntime_sandbox.go:56] CreatePodSandbox for pod "kube-dns-v20-596c55dc95-dqdhp_kube-system(8285d63e-d605-11e8-81d5-000d3a0e231a)" failed: rpc error>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448857    1478 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-v20-596c55dc95-dqdhp_kube-system(8285d63e-d605-11e8-81d5-000d3a0e231a)" failed: rpc erro>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: E1023 12:52:38.448938    1478 pod_workers.go:186] Error syncing pod 8285d63e-d605-11e8-81d5-000d3a0e231a ("kube-dns-v20-596c55dc95-dqdhp_kube-system(8285d63e-d605-11e8-81d5-000d3a0e>
Oct 23 12:52:38 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:38.450187    1478 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "kube-dns-v20-596c55dc95-dqdhp_kub>
Oct 23 12:52:39 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:39.656223    1478 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "kube-dns-v20-596c55dc95-dqdhp_kub>
Oct 23 12:52:39 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:39.722284    1478 pod_container_deletor.go:75] Container "d8bcef579082154f8623fecd251c170def074d47045ba614159eb387314cd427" not found in pod's containers
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:40.024821    1478 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "d8bcef579082154f8623fecd251c170d>
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] Telemetry sent with status code 200
Oct 23 12:52:40 jluzny-eastus2-coremaster1 docker[1166]: 2018/10/23 12:52:40 [Telemetry] SetReportState succeeded
Oct 23 12:52:41 jluzny-eastus2-coremaster1 env[712]: time="2018-10-23T12:52:41Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/c41d026ded44808f1e94a5ab95a689f61b3ef305db66908ca20bd0bd76a6b476/shim.sock" debu>
Oct 23 12:52:41 jluzny-eastus2-coremaster1 env[712]: time="2018-10-23T12:52:41Z" level=debug msg="registering ttrpc server"
Oct 23 12:52:41 jluzny-eastus2-coremaster1 env[712]: time="2018-10-23T12:52:41Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 23 12:52:43 jluzny-eastus2-coremaster1 docker[1166]: W1023 12:52:43.765710    1478 pod_container_deletor.go:75] Container "c41d026ded44808f1e94a5ab95a689f61b3ef305db66908ca20bd0bd76a6b476" not found in pod's containers
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-udevd[41910]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-udevd[41910]: Could not generate persistent MAC address for azure0: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: Lost carrier
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: DHCP lease lost
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: IPv6: ADDRCONF(NETDEV_UP): azure0: link is not ready
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 23 12:52:43 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot configure proxy NDP for interface: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot configure IPv6 privacy extension for interface: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot disable kernel IPv6 accept_ra for interface: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: docker0: Link is not managed by us
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: azure0: Cannot enable IPv6 for interface azure0: No such file or directory
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: Gained carrier
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: Link readded
Oct 23 12:52:43 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.
Oct 23 12:52:44 jluzny-eastus2-coremaster1 systemd-networkd[611]: eth0: DHCPv4 address 10.3.6.4/23 via 10.3.6.1
Oct 23 12:52:44 jluzny-eastus2-coremaster1 systemd-timesyncd[682]: No network connectivity, watching for changes.

After searching for an answer to that journalctl error, I applied a work-around: https://github.com/naspersclassifieds-shared/coreos-kubernetes/commit/b18adcafabae0f20469c3530027a0f3aca09b5ba

With that in place, the persistent MAC address messages are no longer visible in the logs; however, other errors remain; see the journal excerpt below.
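
For reference, work-arounds of this kind generally tell systemd-udevd/networkd to stop applying a persistent MAC address policy to (or otherwise managing) the azure0 bridge that the CNI plugin creates. A minimal sketch of that idea, which may well differ from the linked commit, could look like:

# hypothetical file name and mechanism; the linked commit may use a different approach
sudo tee /etc/systemd/network/05-azure0.link >/dev/null <<'EOF'
[Match]
OriginalName=azure0

[Link]
MACAddressPolicy=none
EOF
sudo udevadm control --reload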

Oct 29 08:54:45 jluzny-eastus2-coremaster1 docker[1642]: 2018/10/29 08:54:45 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Oct 29 08:55:15 jluzny-eastus2-coremaster1 docker[1642]: 2018/10/29 08:55:15 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Oct 29 08:55:45 jluzny-eastus2-coremaster1 docker[1642]: 2018/10/29 08:55:45 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1083]: I1029 08:56:00.367225       1 controller.go:597] quota admission added evaluator for: {apps replicasets}
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1085]: I1029 08:56:00.369241       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"kube-system", Name:"kube-dns-v20", UID:"c1f07169-d76b-11e8-80c0-000d3a0e231a", API>
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1085]: I1029 08:56:00.398902       1 event.go:221] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"kube-system", Name:"kube-dns-v20-596c55dc95", UID:"c1f993fe-d76b-11e8-80c0-000d3a0>
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1138]: I1029 08:56:00.798363    1490 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-dns-kubelet" (UniqueName: "kubernetes.io/host-path/75280c1>
Oct 29 08:56:00 jluzny-eastus2-coremaster1 docker[1138]: I1029 08:56:00.798500    1490 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-dns-token-z7q42" (UniqueName: "kubernetes.io/secret/75280c>
Oct 29 08:56:02 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:02Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/ae53a77183a4845f0ef7ee0e0d8f632601647726d73e35d36a1116ee30d4bf90/shim.sock" debu>
Oct 29 08:56:02 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:02Z" level=debug msg="registering ttrpc server"
Oct 29 08:56:02 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:02Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] File not exist /var/run/AzureCNITelemetry.json
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 GetReport state file didn't exist. Setting flag to true
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] &{IsNewInstance:true CniSucceeded:false Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:02 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:02 [Telemetry] SetReportState succeeded
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-udevd[2300]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: Network configuration changed, trying to establish connection.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Lost carrier
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 29 08:56:02 jluzny-eastus2-coremaster1 kernel: IPv6: ADDRCONF(NETDEV_UP): azure0: link is not ready
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCP lease lost
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained carrier
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: docker0: Link is not managed by us
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: IPv6 successfully enabled
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:02 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 29 08:56:03 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-networkd[619]: docker0: Link is not managed by us
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCPv4 address 10.3.6.4/23 via 10.3.6.1
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] &{IsNewInstance:false CniSucceeded:false Name:CNI OSVersion:v1.0.12 ErrorMessage:Failed to create network: exit status 255 Context:AzureCNI SubContex>
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:03 [Telemetry] SetReportState succeeded
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:03.900107    1490 cni.go:260] Error adding network: Failed to create network: exit status 255
Oct 29 08:56:03 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:03.900135    1490 cni.go:228] Error while adding to cni network: Failed to create network: exit status 255
Oct 29 08:56:03 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained IPv6LL
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:04 [Telemetry] SetReportState succeeded
Oct 29 08:56:04 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:04Z" level=info msg="shim reaped" id=ae53a77183a4845f0ef7ee0e0d8f632601647726d73e35d36a1116ee30d4bf90
Oct 29 08:56:04 jluzny-eastus2-coremaster1 env[891]: time="2018-10-29T08:56:04.148924485Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280245    1490 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "ae53a77183a4845f0>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280316    1490 kuberuntime_sandbox.go:56] CreatePodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc error>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280341    1490 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc erro>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:04.280432    1490 pod_workers.go:186] Error syncing pod 75280c11-db58-11e8-a1bb-000d3a0e231a ("kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:04.543128    1490 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "kube-dns-v20-596c55dc95-6vgpx_kub>
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:04.543989    1490 pod_container_deletor.go:75] Container "ae53a77183a4845f0ef7ee0e0d8f632601647726d73e35d36a1116ee30d4bf90" not found in pod's containers
Oct 29 08:56:04 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:04.846602    1490 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "ae53a77183a4845f0ef7ee0e0d8f6326>
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:05 [Telemetry] SetReportState succeeded
Oct 29 08:56:05 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:05Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/2b69cdef7b72a0d3ca52cead58751fd832e4cc7ebc9bfd4e3193e06f86246f6b/shim.sock" debu>
Oct 29 08:56:05 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:05Z" level=debug msg="registering ttrpc server"
Oct 29 08:56:05 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:05Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 29 08:56:05 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:05.678641    1490 pod_container_deletor.go:75] Container "2b69cdef7b72a0d3ca52cead58751fd832e4cc7ebc9bfd4e3193e06f86246f6b" not found in pod's containers
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-udevd[2555]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: IPv6: ADDRCONF(NETDEV_UP): azure0: link is not ready
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 29 08:56:05 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Lost carrier
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCP lease lost
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot configure proxy NDP for interface: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot configure IPv6 privacy extension for interface: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot disable kernel IPv6 accept_ra for interface: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: docker0: Link is not managed by us
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: azure0: Cannot enable IPv6 for interface azure0: No such file or directory
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained carrier
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Link readded
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCPv4 address 10.3.6.4/23 via 10.3.6.1
Oct 29 08:56:05 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] &{IsNewInstance:false CniSucceeded:false Name:CNI OSVersion:v1.0.12 ErrorMessage:Failed to create network: exit status 255 Context:AzureCNI SubContex>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] SetReportState succeeded
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.064461    1490 cni.go:260] Error adding network: Failed to create network: exit status 255
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.064500    1490 cni.go:228] Error while adding to cni network: Failed to create network: exit status 255
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:06 [Telemetry] SetReportState succeeded
Oct 29 08:56:06 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:06Z" level=info msg="shim reaped" id=2b69cdef7b72a0d3ca52cead58751fd832e4cc7ebc9bfd4e3193e06f86246f6b
Oct 29 08:56:06 jluzny-eastus2-coremaster1 env[891]: time="2018-10-29T08:56:06.304017369Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517197    1490 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to set up sandbox container "2b69cdef7b72a0d3c>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517244    1490 kuberuntime_sandbox.go:56] CreatePodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc error>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517278    1490 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e231a)" failed: rpc erro>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: E1029 08:56:06.517549    1490 pod_workers.go:186] Error syncing pod 75280c11-db58-11e8-a1bb-000d3a0e231a ("kube-dns-v20-596c55dc95-6vgpx_kube-system(75280c11-db58-11e8-a1bb-000d3a0e>
Oct 29 08:56:06 jluzny-eastus2-coremaster1 docker[1138]: W1029 08:56:06.983542    1490 cni.go:243] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "2b69cdef7b72a0d3ca52cead58751fd8>
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] &{IsNewInstance:false CniSucceeded:true Name:CNI OSVersion:v1.0.12 ErrorMessage: Context:AzureCNI SubContext: VnetAddressSpace:[] OrchestratorDetails>
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] Telemetry sent with status code 200
Oct 29 08:56:07 jluzny-eastus2-coremaster1 docker[1138]: 2018/10/29 08:56:07 [Telemetry] SetReportState succeeded
Oct 29 08:56:07 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:07Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/52917de9105251c40619b8e9960f247e9e7aa27c23df37a6bd648347cc12bc20/shim.sock" debu>
Oct 29 08:56:07 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:07Z" level=debug msg="registering ttrpc server"
Oct 29 08:56:07 jluzny-eastus2-coremaster1 env[720]: time="2018-10-29T08:56:07Z" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Gained IPv6LL
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-udevd[2811]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: device eth0 entered promiscuous mode
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered blocking state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered forwarding state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: device eth0 left promiscuous mode
Oct 29 08:56:07 jluzny-eastus2-coremaster1 kernel: azure0: port 1(eth0) entered disabled state
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: Lost carrier
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-networkd[619]: eth0: DHCP lease lost
Oct 29 08:56:07 jluzny-eastus2-coremaster1 systemd-timesyncd[692]: No network connectivity, watching for changes.

Error 255 seems to be referenced from cni.go:260 and cni.go:228

tamilmani1989 commented 5 years ago

@xelor81 Can you give me access to the VM if I share my public key?

xelor81 commented 5 years ago

Yes, I think I can do that. Please send me a private message and we can arrange some time for you to investigate, but only for a very limited time.

xelor81 commented 5 years ago

Can anyone support me with this matter? Is there anything I am missing here? The issue looks closely related to the cni.go code at lines 260 and 228, but the only cni.go file in this project is only about 32 lines long. I am really stuck here.

tamilmani1989 commented 5 years ago

@xelor81 I already shared my public key with one of the support engineers. He said he would contact you about this, but I haven't heard back from him either.

xelor81 commented 5 years ago

@tamilmani1989 I have added your key to the host's authorized keys. Please use the public IP delivered through the support person to log in to the host. Basically, all details regarding the Kubernetes configuration are stored in the cloud-init file in /var/lib/waagent/CustomData.
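
To read it on the host, something like the following works (whether the extra base64 decode step is needed depends on how waagent stored the custom data; adjust accordingly):

sudo cat /var/lib/waagent/CustomData         # raw custom data as stored by waagent
sudo base64 -d /var/lib/waagent/CustomData   # decode first if it is stored base64-encoded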

tamilmani1989 commented 5 years ago

@xelor81 I got access to the VM, but I'm still not sure of the reason for the failure. Can I get access to the master VM, so that I can create a pod and check what's happening on the agent node?

xelor81 commented 5 years ago

This is the master VM. You can see that kube-apiserver, kube-scheduler, and kube-controller-manager are running there.

If you want, I can spawn a worker node and connect it to this master, but since it would use the same kubelet settings, I don't see how it would behave differently from this master.

In principle I will be running a few pods on the master too, as part of a DaemonSet or Deployment designed to help manage the cluster.

Any idea what the error 255 reported by the plugin means?

I have also done additional testing previously, and this very same error appears when I use Ubuntu 16.04 LTS instead of CoreOS (the same version as is used by the AKS service).

tamilmani1989 commented 5 years ago

I'm getting this error when I run this command. Did you hit the same error?

avid-user@jluzny-eastus2-coremaster1 ~ $ kubectl get pods
-bash: kubectl: command not found

xelor81 commented 5 years ago

That's because kubectl is not installed there. What I do is create an SSH tunnel to this host with port forwarding to the kube-apiserver port. Then, from my local box, where kubectl is installed and a proper kubeconfig is in place, I can manage the cluster.

For troubleshooting purposes I have installed kubectl in /opt/bin and included it in the PATH.
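
For reference, the tunnel and the kubectl invocation from my local box look roughly like this (the user name, host placeholder, and local port are just from my setup):

# forward a local port to the kube-apiserver port (6443) on the master
ssh -N -L 6443:127.0.0.1:6443 core@<master-public-ip>

# then, still on the local box, point kubectl at the forwarded port
kubectl --kubeconfig=./kubeconfig -s https://127.0.0.1:6443 get nodes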

tamilmani1989 commented 5 years ago

@xelor81

  1. Quick thought: have you installed ebtables on the agent nodes as well?
  2. On which address and port is the kube-apiserver running? I tried this:

kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?

Is it kubectl -s [address]:[port] get pods? Let me know the address and port.

xelor81 commented 5 years ago

@tamilmani1989

Most likely you haven't been using the root account. Please sudo to root; kubectl works for me on that account. Also, as mentioned in the very first post in this issue, the Kubernetes config is available in /var/lib/kubelet/kubeconfig. You can simply copy its contents to the ~/.kube/config file of the account you are using (this is already done for the root account). From that config you can see that the kube-apiserver is running on 6443, the default port.
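
In other words, roughly (a minimal sketch of the steps above; nothing here is specific beyond the paths already mentioned):

sudo -i                                          # switch to root, where kubectl already works
mkdir -p ~/.kube
cp /var/lib/kubelet/kubeconfig ~/.kube/config    # reuse the kubelet kubeconfig for kubectl
kubectl get pods -o wide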

As for ebtables on the agent nodes: this master is also an agent node, and ebtables is available. As you may read in my previous message, the only difference between the master and the agent nodes is that the master runs additional components such as kube-apiserver, kube-scheduler, and kube-controller-manager. You may observe that kubelet is present on this master, making it also an agent/worker node.

tamilmani1989 commented 5 years ago

@xelor81 My apologies for the delayed response. I got pulled into some other work.

I tried creating pods in your setup and was able to, without any issues. I deployed nginx pods and they are in the Running state. Let me know in which scenario you hit the issue.

avid-user@jluzny-eastus2-coremaster1 ~ $ sudo kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP          NODE                         NOMINATED NODE
my-nginx-59497d7745-4b5hh   1/1     Running   0          12s   10.3.6.12   jluzny-eastus2-coremaster1
my-nginx-59497d7745-92cbl   1/1     Running   0          12s   10.3.6.10   jluzny-eastus2-coremaster1

xelor81 commented 5 years ago

Please ignore the issue being closed - that was a misclick.

@tamilmani1989 Indeed, the pods seem to be working now. I will run some additional tests to see if this is fixed: I will deploy another Kubernetes master and a few nodes with the same settings and get back to you.

BTW, it is very strange that pods started working now even though nothing was changed on the VM. Any idea what the error 255 referenced in azure-vnet.log means?

I would appreciate any insight about this for future reference. Because of the security hole found in Kubernetes recently, I will now deploy k8s with the azure-vnet plugin using the following configuration:

k8s = 1.11.5
azure cni plugin = 1.0.14
cni plugin = 0.7.4

Should I be aware of any blockers with this configuration? I was not able to find any issues in the release notes of those components.
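
For context, the on-node layout I intend to use follows the usual CNI conventions (the binary and config file names below are how I lay things out and should be double-checked against the install docs and release artifacts):

ls /opt/cni/bin
#   azure-vnet  azure-vnet-ipam  loopback  ...    (azure-vnet* from the 1.0.14 release, loopback from CNI plugins 0.7.4)
ls /etc/cni/net.d
#   10-azure.conflist                             (network config that kubelet hands to the plugin)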

tamilmani1989 commented 5 years ago

> BTW, it is very strange that pods started working now even though nothing was changed on the VM. Any idea what the error 255 referenced in azure-vnet.log means?

I looked at the logs, and it seems to me that ebtables was not installed at that point.

> Should I be aware of any blockers with this configuration? I was not able to find any issues in the release notes of those components.

Nothing, as far as I know.

xelor81 commented 5 years ago

Strange: if ebtables was not installed, why did it respond to the ebtables -t nat -L command? Does the plugin use the full path to ebtables, or must the ebtables command be available in the PATH environment variable?
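
For what it's worth, this is how I checked that ebtables is reachable on the node (plain shell, nothing plugin-specific):

command -v ebtables       # prints the resolved path if ebtables is in PATH
sudo ebtables -t nat -L   # lists the nat table rules, if any are programmed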

tamilmani1989 commented 5 years ago

@xelor81 Did you run this command before the issue happened or after? Because I'm not sure it was installed before the pods were created.

xelor81 commented 5 years ago

ebtables is delivered/installed on CoreOS by default.

tamilmani1989 commented 5 years ago

Oh, OK. @xelor81 Is there any update? If the issue is no longer occurring, can we close this?

xelor81 commented 5 years ago

I am currently on sick leave. I ran an initial test with the latest 1.11.5 and the latest CoreOS image available from the Azure cloud. Exit 255 still occurs with the same CNI plugin and azure-vnet plugin as before. When I used CNI plugin 0.7.4 and azure-vnet plugin 1.0.14, I got a report from the azure-vnet plugin that an IP address cannot be delivered because I am out of my IP pool. I can share the exact log later this week, as soon as I feel better.

As this is a new VM, I presume I need to prepare access for you?

xelor81 commented 5 years ago

OK, I got this thing to work. Instead of running kubelet within a container, I run it as a binary on CoreOS. With this setup I am able to spawn pods with IPs from the VNET range, and the pods can communicate with all hosts in the VNET. Closing the issue.
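
For anyone who hits the same thing, the host kubelet invocation in my setup looks roughly like this (only the standard CNI-related kubelet flags are shown; the binary path and the remaining flags are specific to my install and trimmed here):

/opt/bin/kubelet \
  --kubeconfig=/var/lib/kubelet/kubeconfig \
  --network-plugin=cni \
  --cni-bin-dir=/opt/cni/bin \
  --cni-conf-dir=/etc/cni/net.d \
  ...   # remaining flags unchanged from the containerized setup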