projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

DNSConfig support on windows pods seems to not work? #4307

Open jayunit100 opened 3 years ago

jayunit100 commented 3 years ago

Expected Behavior

Custom dnsConfig options should work for Windows containers. This may or may not be a Calico bug, depending on where we assign responsibility for container DNS. Since the CNI spec actually defines DNS and there are issues hinting at this, we might

Current Behavior

On EKS with Calico, DNS records can't be injected through the dnsConfig field, as is done in normal Linux containers. I suspect this is simply a no-op currently in Calico. Interestingly, I found https://github.com/projectcalico/cni-plugin/issues/67, which seems to imply that at some point, DNS for Calico via the CNI provider... might be "a thing"? But there wasn't sufficient motivation to implement it.

Maybe, @caseydavenport ... now with windows support, the time has arrived :) ?

Excuse me if I'm missing some context though, as things might have changed a lot since that original issue was filed in 2016.

In any case, here is what ipconfig says inside a Calico EKS Windows pod that has a custom 1.1.1.1 rule injected (note that the rule is not in there). Also, when I checked the windows/system32/drivers/etc/hosts file, I don't see it there either.

I'm not sure yet whether the kubelet or the CNI should be responsible for DNSConfig on Windows nodes, so maybe the answer to that question will tell us whether this is a bug, or just a phenomenon that the kubelet itself should be able to manage (for example, maybe it needs to make containerd/docker DNS-aware via some other mechanism).

PS C:\> ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : e2e-dns-utils
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : dns-7718.svc.cluster.local
                                       svc.cluster.local
                                       cluster.local

Ethernet adapter vEthernet (cid-b6747ba75ef66e5928dff0d866c64001b7995d858c3616046ca88d1ba6ee5a38):

   Connection-specific DNS Suffix  . : dns-7718.svc.cluster.local
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #3
   Physical Address. . . . . . . . . : 00-15-5D-AC-81-6F
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::e8f6:b627:470c:6fb6%24(Preferred)
   IPv4 Address. . . . . . . . . . . : 192.168.23.154(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.224.0
   Default Gateway . . . . . . . . . : 192.168.0.1
   DNS Servers . . . . . . . . . . . : 10.100.0.10
   NetBIOS over Tcpip. . . . . . . . : Disabled
   Connection-specific DNS Suffix Search List :
                                       dns-7718.svc.cluster.local
                                       svc.cluster.local
                                       cluster.local

Possible Solution

Steps to Reproduce (for bugs)

  1. Set up a cluster on EKS with Calico according to https://aws.amazon.com/blogs/containers/open-source-calico-for-windows-containers-on-amazon-eks/
  2. Run the e2e test package. For Windows, I run wget https://storage.googleapis.com/jayunit100/content/e2e.test.exe, followed by C:\Users\jayun\Dropbox\SYNC\executables\content\e2e.test.exe --provider=skeleton --kubeconfig=$HOME/.kube/config --ginkgo.focus="should support configurable pod DNS server" --ginkgo.skip="Testpattern|Slow|plugin|SecurityContext" --dump-logs-on-failure=false --node-os-distro=windows
  3. NOTE: you must run the specific e2e.test.exe file above, because there was a bug in the upstream e2es on EKS which is fixed in that build, so that the containers schedule properly :). This should merge to the k8s master branch soon.

Your Environment

fasaxc commented 3 years ago

Feels wrong that this is handled by kubelet on Linux but gets passed down to the CNI plugin on Windows. What business do we have modifying files inside the container?

jayunit100 commented 3 years ago

So, evidently DNS is part of the CNI spec!

fasaxc commented 3 years ago

Can you link to the spec for dnsConfig? Can't seem to find it.

jayunit100 commented 3 years ago

https://github.com/containernetworking/cni/blob/master/SPEC.md#parameters

song-jiang commented 3 years ago

I think the issue is that the Calico CNI plugin uses the hcsshim v1 API to program network endpoints. The v1 API exposes DNSServerList and DNSSuffix as the only options for DNS config. https://github.com/microsoft/hcsshim/blob/master/internal/hns/hnsendpoint.go#L21

The Calico CNI plugin should switch to the v2 API so that all fields of DNSConfig can be programmed to network endpoints. https://github.com/microsoft/hcsshim/blob/fd21b8d1922c7fb8b4a50c76d048fe1a69b7e7dc/hcn/hcnnetwork.go#L45

I would say it is a bug.
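
To make the difference concrete, here is a minimal sketch of what gets lost when a pod's DNS settings are squeezed into the two v1 fields. The types below are local stand-ins written for illustration, not the real hcsshim types: only the names DNSServerList and DNSSuffix come from the comment above, the v2 shape is hypothetical, and joining lists with commas is an assumption about how the v1 string fields are filled.

```go
package main

import (
	"fmt"
	"strings"
)

// podDNS is a simplified stand-in for the DNS settings handed to the CNI plugin.
type podDNS struct {
	Nameservers []string
	Searches    []string
	Options     []string
}

// v1Endpoint has only the two DNS fields named in the thread.
// Assumption: both are plain strings holding comma-separated values.
type v1Endpoint struct {
	DNSServerList string
	DNSSuffix     string
}

// v2DNS is a hypothetical shape with room for every field of podDNS.
type v2DNS struct {
	ServerList []string
	Search     []string
	Options    []string
}

// toV1 keeps servers and search suffixes; resolver options have nowhere to go.
func toV1(d podDNS) v1Endpoint {
	return v1Endpoint{
		DNSServerList: strings.Join(d.Nameservers, ","),
		DNSSuffix:     strings.Join(d.Searches, ","),
	}
}

// toV2 carries every field across unchanged.
func toV2(d podDNS) v2DNS {
	return v2DNS{ServerList: d.Nameservers, Search: d.Searches, Options: d.Options}
}

func main() {
	d := podDNS{
		Nameservers: []string{"1.1.1.1"},
		Searches:    []string{"resolv.conf.local"},
		Options:     []string{"ndots:2"},
	}
	fmt.Printf("v1: %+v\n", toV1(d))
	fmt.Printf("v2: %+v\n", toV2(d))
}
```

In this sketch the options entry never reaches the v1 endpoint, which is the kind of gap a switch to the v2 API would be expected to close.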

jayunit100 commented 3 years ago

ahh ok ! thanks

jayunit100 commented 3 years ago

FWIW, this is what the records should look like:

PS C:\> ipconfig.exe /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : e2e-dns-utils
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : resolv.conf.local

Ethernet adapter vEthernet (f401cb5f-eth0):

   Connection-specific DNS Suffix  . : resolv.conf.local
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #4
   Physical Address. . . . . . . . . : 00-15-5D-71-09-8C
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::c50a:d836:4ce7:a0a3%28(Preferred)
   IPv4 Address. . . . . . . . . . . : 10.240.0.39(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.0.0
   Default Gateway . . . . . . . . . : 10.240.0.1
   DNS Servers . . . . . . . . . . . : 1.1.1.1
   NetBIOS over Tcpip. . . . . . . . : Disabled
   Connection-specific DNS Suffix Search List :
                                       resolv.conf.local

jayunit100 commented 3 years ago

I just finished updating https://github.com/kubernetes/kubernetes/pull/97987 to expose this directly. Just FYI, it might be useful as a verifier.
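
For a quick manual check outside the e2e suite, something like the following could be used to pull the DNS servers out of captured `ipconfig /all` text and look for the injected address. This is only a sketch and is not taken from the PR above: dnsServers and hasServer are hypothetical helpers, and the parsing assumes the English-language ipconfig layout shown in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// dnsServers collects every value listed under a "DNS Servers" label,
// including the indented continuation lines ipconfig prints for extra servers.
func dnsServers(ipconfigOutput string) []string {
	var servers []string
	inServers := false
	for _, line := range strings.Split(ipconfigOutput, "\n") {
		line = strings.TrimSpace(line) // also strips the trailing \r on Windows
		if line == "" {
			inServers = false
			continue
		}
		if i := strings.Index(line, " :"); i >= 0 {
			// Labelled line of the form "<label> . . . : <value>".
			inServers = strings.HasPrefix(line, "DNS Servers")
			if v := strings.TrimSpace(line[i+2:]); inServers && v != "" {
				servers = append(servers, v)
			}
			continue
		}
		if inServers {
			// Unlabelled line directly below "DNS Servers": another server.
			servers = append(servers, line)
		}
	}
	return servers
}

// hasServer reports whether want appears among the parsed DNS servers.
func hasServer(ipconfigOutput, want string) bool {
	for _, s := range dnsServers(ipconfigOutput) {
		if s == want {
			return true
		}
	}
	return false
}

func main() {
	out := "   DHCP Enabled. . . . . . . . . . . : No\r\n" +
		"   DNS Servers . . . . . . . . . . . : 10.100.0.10\r\n" +
		"   NetBIOS over Tcpip. . . . . . . . : Disabled\r\n"
	fmt.Println(dnsServers(out), hasServer(out, "1.1.1.1"))
}
```

Run against the output in the original report, hasServer would return false for 1.1.1.1, since the only server listed there is 10.100.0.10.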

song-jiang commented 3 years ago

Thanks @jayunit100 for more details. My current understanding is there are two issues.

I suggest we fix the first one with Calico v3.18 release. This would support most use cases and allow k8s Windows e2e to pass. @caseydavenport @fasaxc

type PodDNSConfig struct {
    // A list of DNS name server IP addresses.
    // This will be appended to the base nameservers generated from DNSPolicy.
    // Duplicated nameservers will be removed.
    // +optional
    Nameservers []string `json:"nameservers,omitempty" protobuf:"bytes,1,rep,name=nameservers"`
    // A list of DNS search domains for host-name lookup.
    // This will be appended to the base search paths generated from DNSPolicy.
    // Duplicated search paths will be removed.
    // +optional
    Searches []string `json:"searches,omitempty" protobuf:"bytes,2,rep,name=searches"`
    // A list of DNS resolver options.
    // This will be merged with the base options generated from DNSPolicy.
    // Duplicated entries will be removed. Resolution options given in Options
    // will override those that appear in the base DNSPolicy.
    // +optional
    Options []PodDNSConfigOption `json:"options,omitempty" protobuf:"bytes,3,rep,name=options"`
}
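
The field comments above describe an append-then-deduplicate merge with the base values generated from DNSPolicy. A minimal sketch of that behaviour follows, assuming that the first occurrence of a duplicate is the one kept. mergeUnique is a hypothetical helper for illustration, not the kubelet's actual code.

```go
package main

import "fmt"

// mergeUnique appends extra to base and drops duplicates,
// keeping the first occurrence of each entry so the base order is preserved.
func mergeUnique(base, extra []string) []string {
	seen := make(map[string]struct{}, len(base)+len(extra))
	merged := make([]string, 0, len(base)+len(extra))
	for _, list := range [][]string{base, extra} {
		for _, s := range list {
			if _, dup := seen[s]; dup {
				continue
			}
			seen[s] = struct{}{}
			merged = append(merged, s)
		}
	}
	return merged
}

func main() {
	// Base nameserver as seen in the failing pod, plus the custom entry from dnsConfig.
	base := []string{"10.100.0.10"}
	custom := []string{"1.1.1.1", "10.100.0.10"}
	fmt.Println(mergeUnique(base, custom))
}
```

With an empty base list the result is just the custom entries, which would match the expected output earlier in the thread where 1.1.1.1 is the only DNS server.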

song-jiang commented 3 years ago

@jayunit100 Are you running your cluster on EKS? If that is the case, are you using AWS CNI on Windows nodes?

jayunit100 commented 3 years ago

In my case, the EKS cluster is using Calico for the CNI...

orest-gulman commented 5 months ago

I see the same behaviour with AWS CNI: basically, dnsConfig is ignored for Windows pods. I'd appreciate any suggestions.