microsoft / SDN

This repo includes PowerShell scripts and VMM service templates for setting up the Microsoft Software Defined Networking (SDN) Stack using Windows Server 2016

DNS Server does not get passed down to the other containers from the same compartment #222

Open alinbalutoiu opened 6 years ago

alinbalutoiu commented 6 years ago

Environment details:

PS C:\> Get-HnsNetwork

ActivityId             : 5ecd337a-f474-4f4d-83be-799f9bff7208
AutomaticDNS           : True
CurrentEndpointCount   : 1
Extensions             : {@{Id=e7c3b2f0-f3c5-48df-af2b-10fed6d72e7a; IsEnabled=False; Name=Microsoft Windows Filtering Platform}, @{Id=e9b59cfa-2be1-4b21-828f-b6fbdbddc017;
                         IsEnabled=False; Name=Microsoft Azure VFP Switch Extension}, @{Id=ea24cd6c-d17a-4348-9190-09f0d5be83dd; IsEnabled=False; Name=Microsoft NDIS Capture}}
ID                     : f0c9ad1f-cfce-400f-8ebe-f10f50dac9b6
LayerResources         : @{AllocationOrder=1; Allocators=System.Object[]; ID=2ef39325-e296-4cef-8524-f7a0f985609d; PortOperationTime=0; State=1; SwitchOperationTime=0;
                         VfpOperationTime=0}
LayeredOn              : 0f47af72-912a-4623-8550-35e872d39c89
MacPools               : {@{EndMacAddress=00-15-5D-B9-3F-FF; StartMacAddress=00-15-5D-B9-30-00}}
MaxConcurrentEndpoints : 1
Name                   : nat
Policies               : {}
Resources              : @{AllocationOrder=2; Allocators=System.Object[]; ID=5ecd337a-f474-4f4d-83be-799f9bff7208; PortOperationTime=0; State=1; SwitchOperationTime=0;
                         VfpOperationTime=0; parentId=2ef39325-e296-4cef-8524-f7a0f985609d}
State                  : 1
Subnets                : {@{AddressPrefix=172.18.224.0/20; GatewayAddress=172.18.224.1}}
TotalEndpoints         : 2
Type                   : nat
Version                : 30064771074
PS C:\> docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
124b7266ecb5        nat                 nat                 local
93fba08259ff        none                null                local

PS C:\> docker version
Client:
 Version:      17.11.0-ce-rc3
 API version:  1.34
 Go version:   go1.8.4
 Git commit:   5b4af4f
 Built:        Wed Nov  8 03:03:30 2017
 OS/Arch:      windows/amd64

Server:
 Version:      17.11.0-ce-rc3
 API version:  1.34 (minimum version 1.24)
 Go version:   go1.8.5
 Git commit:   5b4af4f
 Built:        Wed Nov  8 03:13:48 2017
 OS/Arch:      windows/amd64
 Experimental: false
PS C:\> wmic os get buildnumber,version
BuildNumber  Version
17134        10.0.17134

Windows Server 1803 with all updates installed.

This happens with any Docker version; this test uses 17.11.0-ce-rc3.

Default Docker network, with nothing extra created or installed: a plain Windows Server 1803 with the Containers feature installed.

Steps to reproduce:

  1. Create the test_endpoint HNS endpoint with a DNS server set:

    $endpoint = @{
        VirtualNetwork = "f0c9ad1f-cfce-400f-8ebe-f10f50dac9b6";
        Policies       = @();
        Name           = "test_endpoint";
        IPAddress      = "172.18.224.10";
        GatewayAddress = "172.18.224.1";
        DNSServerList  = "8.8.8.8"
    }

    $EndpointData = ConvertTo-Json $endpoint
    Invoke-HNSRequest -Method POST -Type endpoints -Data $EndpointData

This uses the PowerShell module https://github.com/Microsoft/SDN/blob/master/Kubernetes/windows/hns.psm1

2. Start a container attached to the none network (the Kubernetes case for the infra container):

docker run --net=none -it microsoft/nanoserver:1803 cmd


3. Attach the endpoint to the container created above:

PS C:\> $cid="df7f369331b5633876f22f47ca604ec6d1d1ab912b20fd43c9bc8db262a3c9a7"
PS C:\> $eid="5a1b06c6-773b-4889-ba63-e56d6008df4b"
PS C:\> $cmp=2
PS C:\> Attach-HNSEndpoint -ContainerID $cid -CompartmentID $cmp -EndpointID $eid

Success. You can see in the container the DNS server is set:

C:\>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : df7f369331b5
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : localdomain

Ethernet adapter vEthernet (test_endpoint):

   Connection-specific DNS Suffix  . : localdomain
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #2
   Physical Address. . . . . . . . . : 00-15-5D-B9-31-6F
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::a44d:c463:592a:3de0%17(Preferred)
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . : 172.18.224.1
   DNS Servers . . . . . . . . . . . : 8.8.8.8
   NetBIOS over Tcpip. . . . . . . . : Disabled


4. Spin up a new container and attach it to the same network via docker:

PS C:\> docker ps
CONTAINER ID        IMAGE                       COMMAND             CREATED             STATUS              PORTS               NAMES
df7f369331b5        microsoft/nanoserver:1803   "cmd"               13 minutes ago      Up 13 minutes                           goofy_boyd
PS C:\> docker run --net=container:df7f369331b5 -it microsoft/nanoserver:1803 cmd

Works fine. Container gets created.

PS C:\> docker ps
CONTAINER ID        IMAGE                       COMMAND             CREATED             STATUS              PORTS               NAMES
9e19087e0049        microsoft/nanoserver:1803   "cmd"               12 minutes ago      Up 12 minutes                           trusting_carson
df7f369331b5        microsoft/nanoserver:1803   "cmd"               14 minutes ago      Up 14 minutes                           goofy_boyd
PS C:\> docker exec df7f369331b5 ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : df7f369331b5
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : localdomain

Ethernet adapter vEthernet (test_endpoint):

   Connection-specific DNS Suffix  . : localdomain
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #2
   Physical Address. . . . . . . . . : 00-15-5D-B9-31-6F
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::a44d:c463:592a:3de0%17(Preferred)
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . : 172.18.224.1
   DNS Servers . . . . . . . . . . . : 8.8.8.8
   NetBIOS over Tcpip. . . . . . . . : Disabled
PS C:\> docker exec 9e19087e0049 ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : df7f369331b5
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No

Ethernet adapter vEthernet (test_endpoint):

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #2
   Physical Address. . . . . . . . . : 00-15-5D-B9-31-6F
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::a44d:c463:592a:3de0%17(Preferred)
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . : 172.18.224.1
   NetBIOS over Tcpip. . . . . . . . : Disabled


As you can see, the first container has the DNS server set, while the second container created does not have the DNS server set, even though the rest is the same.
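The difference between the two outputs can also be checked mechanically. The following is a minimal Python sketch, not part of the original report, that extracts the `DNS Servers` entries from `ipconfig /all` text. The function name `dns_servers` and the two trimmed sample strings are illustrative assumptions; the sample values are copied from the outputs above.

```python
import re

def dns_servers(ipconfig_output):
    """Return the DNS server addresses listed in `ipconfig /all` text."""
    servers = []
    in_dns_block = False
    for line in ipconfig_output.splitlines():
        match = re.match(r"\s*DNS Servers[ .]*:\s*(\S+)", line)
        if match:
            # First address sits on the same line as the label.
            servers.append(match.group(1))
            in_dns_block = True
        elif in_dns_block and re.fullmatch(r"\s+[0-9A-Fa-f:.%]+\s*", line):
            # Additional addresses follow on indented lines of their own.
            servers.append(line.strip())
        else:
            in_dns_block = False
    return servers

# Trimmed samples taken from the two `docker exec ... ipconfig /all` outputs.
FIRST_CONTAINER = """
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Default Gateway . . . . . . . . . : 172.18.224.1
   DNS Servers . . . . . . . . . . . : 8.8.8.8
   NetBIOS over Tcpip. . . . . . . . : Disabled
"""

SECOND_CONTAINER = """
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Default Gateway . . . . . . . . . : 172.18.224.1
   NetBIOS over Tcpip. . . . . . . . : Disabled
"""

print(dns_servers(FIRST_CONTAINER))   # ['8.8.8.8']
print(dns_servers(SECOND_CONTAINER))  # []
```

Feeding it the captured output of `docker exec <id> ipconfig /all` for each container gives a quick pass/fail check for the bug.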

From the host, all looks fine for the network compartment 2, it has the DNS Server set:

PS C:\> ipconfig /allcompartments /all

Windows IP Configuration

==============================================================================
Network Information for Compartment 1 (ACTIVE)

==============================================================================
Network Information for Compartment 2
==============================================================================

   Host Name . . . . . . . . . . . . : WIN-DTNBHI787ER
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : localdomain

Ethernet adapter vEthernet (test_endpoint):

   Connection-specific DNS Suffix  . : localdomain
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #2
   Physical Address. . . . . . . . . : 00-15-5D-B9-31-6F
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::a44d:c463:592a:3de0%17(Preferred)
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . : 172.18.224.1
   DNS Servers . . . . . . . . . . . : 8.8.8.8
   NetBIOS over Tcpip. . . . . . . . : Disabled

Expected output: The second container should have the same DNS server as the first one.
feiskyer commented 6 years ago

By the way, the Docker EE preview works, while Docker CE does not.

The Docker EE preview can be installed via:

Install-Module DockerProvider
Install-Package -Name Docker -ProviderName DockerProvider -RequiredVersion preview
alinbalutoiu commented 6 years ago

@feiskyer have you tested it and confirmed that it works?

I have Windows Server 1803 with the latest updates, and it still doesn't work. Below is the Docker version, installed as you recommended:

PS C:\Users\Administrator> docker version
Client:
 Version:      17.10.0-ee-preview-3
 API version:  1.33
 Go version:   go1.8.4
 Git commit:   1649af8
 Built:        Fri Oct  6 17:52:28 2017
 OS/Arch:      windows/amd64

Server:
 Version:      17.10.0-ee-preview-3
 API version:  1.34 (minimum version 1.24)
 Go version:   go1.8.4
 Git commit:   b8571fd
 Built:        Fri Oct  6 18:01:48 2017
 OS/Arch:      windows/amd64
 Experimental: false
PS C:\Users\Administrator> docker exec 7399822ce02f ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : 7399822ce02f
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : localdomain

Ethernet adapter vEthernet (test_endpoint):

   Connection-specific DNS Suffix  . : localdomain
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #2
   Physical Address. . . . . . . . . : 00-15-5D-B9-31-DD
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::895:69fb:8dac:8cdf%17(Preferred)
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . : 172.18.224.1
   DNS Servers . . . . . . . . . . . : 8.8.8.8
   NetBIOS over Tcpip. . . . . . . . : Disabled
PS C:\Users\Administrator> docker exec 54d5d989df89 ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : 7399822ce02f
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No

Ethernet adapter vEthernet (test_endpoint):

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Hyper-V Virtual Ethernet Adapter #2
   Physical Address. . . . . . . . . : 00-15-5D-B9-31-DD
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::895:69fb:8dac:8cdf%17(Preferred)
   IPv4 Address. . . . . . . . . . . : 172.18.224.10(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . : 172.18.224.1
   NetBIOS over Tcpip. . . . . . . . : Disabled
PS C:\Users\Administrator> docker ps
CONTAINER ID        IMAGE                       COMMAND             CREATED             STATUS              PORTS               NAMES
54d5d989df89        microsoft/nanoserver:1803   "cmd"               56 seconds ago      Up 54 seconds                           vigorous_bohr
7399822ce02f        microsoft/nanoserver:1803   "cmd"               3 minutes ago       Up 3 minutes                            cocky_brown
PS C:\Users\Administrator>

The second container still doesn't have the DNS server set. Again, this is outside of Kubernetes; all the steps to reproduce are in the first message. Any other ideas? Thanks!

madhanrm commented 6 years ago

Call Attach-HNSEndpoint -ContainerID $cid2 -CompartmentID $cmp -EndpointID $eid again, this time with the new container ID, to complete the workflow required for sharing an endpoint across two containers.

When the network is managed by Docker (that is, when you do not use none), you can see a second HotAttachEndpoint (https://github.com/Microsoft/hcsshim/blob/master/hnsendpoint.go#L38) being invoked. For every attach call on an endpoint with a container ID, HNS replicates the necessary registry keys to the container, and these include the DNS server settings.
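The behavior described here can be pictured with a toy model. The Python sketch below is only an illustration of the explanation in this comment, not HNS code: the classes `Endpoint` and `Container`, the `attach` function, and the idea of representing the compartment and the registry as dictionaries are all assumptions made for the example. It shows why a container that joins an existing compartment sees the IP address and gateway immediately but has no DNS servers until it gets its own attach call.

```python
from dataclasses import dataclass, field

@dataclass
class Endpoint:
    ip: str
    gateway: str
    dns_servers: list

@dataclass
class Container:
    name: str
    compartment: dict  # network state shared by all containers in the compartment
    registry: dict = field(default_factory=dict)  # per-container settings

def attach(endpoint, container):
    """Model one attach call: plumb the endpoint, then copy DNS settings."""
    # Compartment-level state is visible to every container sharing it.
    container.compartment.update(ip=endpoint.ip, gateway=endpoint.gateway)
    # DNS settings are copied only into the container named in this call.
    container.registry["dns_servers"] = list(endpoint.dns_servers)

endpoint = Endpoint("172.18.224.10", "172.18.224.1", ["8.8.8.8"])
shared_compartment = {}
infra = Container("infra", shared_compartment)
workload = Container("workload", shared_compartment)  # --net=container:<infra>

attach(endpoint, infra)
print(workload.compartment["ip"])            # 172.18.224.10
print(workload.registry.get("dns_servers"))  # None

attach(endpoint, workload)                   # the second call
print(workload.registry.get("dns_servers"))  # ['8.8.8.8']
```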

This is a requirement if the network is managed outside of Docker or if CNI is used. You can have a look at the PR (https://github.com/containernetworking/plugins/pull/85).

Let me know if you have more questions.

alinbalutoiu commented 6 years ago

That is an issue then. This commit makes kubelet call the CNI only once, to get the pod IP, instead of once for every container: https://github.com/kubernetes/kubernetes/pull/64189 I also explicitly asked in this comment (https://github.com/kubernetes/kubernetes/pull/64189#issuecomment-391631585) why the call was made for every single container, but the conclusion seemed to be that it should not be needed.

I did not see a problem with it at the time, since someone from the azure-cni team was also complaining about multiple ADD/DEL requests: https://github.com/kubernetes/kubernetes/issues/57253 That PR fixes part of their issue, but apparently it creates others.

They also commented in the code that it is a temporary workaround: https://github.com/Azure/azure-container-networking/blob/master/cni/network/network_windows.go#L15-L19

In the end, in kubelet the call to "ADD" will be replaced with "GET" when the new CNI version gets released.

As for Docker, the link you pasted points to HotAttachEndpoint in hcsshim, not in Docker.

I don't really understand how everything else is shared (IP/Gateway/MAC etc.) but the DNS is not, and that sharing happens without a second call to HotAttachEndpoint. What do the "necessary registry keys" include besides DNS? Why does HNS require a second call to HotAttachEndpoint instead of doing this on the first one?

dineshgovindasamy commented 6 years ago

@alinbalutoiu @madhanrm

For 1709 and 1803 we did not have namespace support in Windows, which we are bringing in RS5. So we need this second call to replicate the DNS registry key to the workload container. This is a platform requirement, and removing it will break DNS for the workload containers. IP/Gateway/MAC are set separately from DNS.

In RS5, once we move to the namespace model, we will update CRI and CNI to add endpoints and add them to the namespace.

Can we revert the other PR? This will break Windows Container DNS.

daschott commented 6 years ago

@alinbalutoiu is this still an issue after the revert? If not, can we close it, or are you waiting for the new namespace support in RS5?

alinbalutoiu commented 6 years ago

@daschott This is still an issue on Windows. It was discovered after the commit that stopped kubelet from calling the CNI for every container. I haven't tested on RS5, but it should be fixed there if it is no longer required to explicitly call Attach-HNSEndpoint for new containers created in the same namespace as the original container.

Would it be possible to have a list with known issues for each Windows version?

michmike commented 6 years ago

We discussed this in sig-windows today and would love an update from @dineshgovindasamy or @daschott.

daschott commented 6 years ago

@alinbalutoiu @michmike @dineshgovindasamy If I understand correctly, the main concern is this: "I should not have to call Attach-HNSEndpoint for new containers in the same namespace because CNI spec states there should only be one call".

The response is here. Namespaces are not a first-class citizen on any Windows version prior to Windows Server 2019. We operate on an endpoint basis, and we need an additional CNI call to replicate DNS registry keys. If you remove this call, then it becomes an issue as the new container won't have any DNS registry keys set.

We are working right now to enable containerd support on Windows Server 2019, which would allow us to remove the multiple calls. However, as long as Windows Server version 1803 is supported on Kubernetes, we cannot remove the CNI calls without adding conditionals for Windows versions to the CNI workflows. According to our devs, keeping the current code is the lesser of two evils in terms of maintainability and code complexity.

Are there any scenarios or features which are broken as a result of the current code?

dantingl commented 5 years ago

I hit this problem on Windows 1809, and Attach-HNSEndpoint does not work.


Invoke-HNSRequest : @{Error=The parameter is incorrect. ; ErrorCode=2147942487; Success=False}
+     return Invoke-HNSRequest -Method POST -Type endpoints -Data (Conv ...
+            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Invoke-HNSRequest
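For reference, the decimal ErrorCode in this output can be decoded as a standard Windows HRESULT. The short Python sketch below is illustrative only, and the helper name `decode_hresult` is an assumption. It splits the value into its severity, facility, and code fields. The result, 0x80070057, is the HRESULT form of Win32 error 87 (ERROR_INVALID_PARAMETER), which matches the "The parameter is incorrect" text. It does not tell which field of the request HNS rejected.

```python
def decode_hresult(code):
    """Split an unsigned 32-bit HRESULT into its standard fields."""
    return {
        "hex": f"0x{code:08X}",
        "failure": bool((code >> 31) & 1),   # severity bit
        "facility": (code >> 16) & 0x1FFF,   # 7 is FACILITY_WIN32
        "code": code & 0xFFFF,               # underlying Win32 error number
    }

print(decode_hresult(2147942487))
# {'hex': '0x80070057', 'failure': True, 'facility': 7, 'code': 87}
```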