kubernetes-sigs / sig-windows-dev-tools

This is a batteries included local development environment for Kubernetes on Windows.
Apache License 2.0

(WIP) proxy: initial code to support kpng #184

Closed dougsland closed 2 years ago

dougsland commented 2 years ago

Before building, set `proxy: kpng` in sync/shared/variables.yaml, then run `make all`.

Signed-off-by: Douglas Schilling Landgraf <dlandgra@redhat.com>
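
The build step above could be scripted roughly as follows. This is only a sketch: it assumes that sync/shared/variables.yaml holds the setting on a single line of the form `proxy: <value>`, which this thread does not confirm, and `set_proxy` is a hypothetical helper name, not something from the repository.

```shell
# Hypothetical helper: rewrite the value of a `proxy:` line in a YAML file.
# Assumption: the key sits on its own line; the real file layout may differ.
set_proxy() {
  file=$1
  value=$2
  tmp=$(mktemp) || return 1
  # Keep the original indentation and replace everything after "proxy:".
  sed "s/^\([[:space:]]*proxy:\).*/\1 ${value}/" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return $rc
}

# Usage, from the repository root:
#   set_proxy sync/shared/variables.yaml kpng && make all
```

Writing through a temporary file avoids `sed -i`, whose syntax differs between GNU and BSD sed.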

k8s-ci-robot commented 2 years ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dougsland
To complete the pull request process, please assign jsturtevant after the PR has been reviewed.
You can assign the PR to them by writing /assign @jsturtevant in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
- **[OWNERS](https://github.com/kubernetes-sigs/sig-windows-dev-tools/blob/master/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

dougsland commented 2 years ago

As expected, kpng is still a work in progress on Windows nodes. However, it is working just fine on the controlplane for now.

vagrant@controlplane:~$ kubectl get pods -A -o wide | grep -i kpng
kube-system   kpng-7hrj8                                 0/2     ImagePullBackOff   0          3m8s   100.244.206.67   winw1          <none>           <none>
kube-system   kpng-k2xnt                                 2/2     Running            0          19m    10.20.30.10      controlplane   <none>           <none>
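
To spot pods like kpng-7hrj8 without reading the whole listing, the READY column can be checked mechanically. The helper below is a hypothetical sketch (the name `not_ready_pods` is not part of any tool); it assumes the column order that `kubectl get pods -A` prints, with the namespace first and READY third.

```shell
# Hypothetical helper: print namespace, pod, READY and STATUS for every pod
# whose ready-container count differs from its total container count.
# Reads `kubectl get pods -A -o wide` output on stdin.
not_ready_pods() {
  awk '$1 == "NAMESPACE" { next }      # skip the header row if present
       {
         split($3, r, "/")             # READY looks like "0/2"
         if (r[1] != r[2]) print $1, $2, $3, $4
       }'
}

# Usage:
#   kubectl get pods -A -o wide | not_ready_pods
```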

Some logs (calico CNI):

vagrant@controlplane:~$ kubectl describe pod -n kube-system kpng-7hrj8
Events:
  Type     Reason                  Age                    From               Message
  ----     ------                  ----                   ----               -------
  Normal   Scheduled               9m36s                  default-scheduler  Successfully assigned kube-system/kpng-7hrj8 to winw1
  Warning  FailedCreatePodSandBox  9m9s                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "ad6f0a925d78078871849d854809751e6976d77fac2c005f8230c967e7a56d47": cni plugin not initialized
  Warning  FailedCreatePodSandBox  8m58s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b73a1c079d96817b012f46c42b4d178fd70486c2ff0c877a5eb85578398fa336": cni plugin not initialized
  Warning  FailedCreatePodSandBox  8m45s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "0c179a6c272bfcbe04fe3c89b18efc2cb697bbb607d9ae7fadb12939ffa8605b": cni plugin not initialized
  Warning  FailedCreatePodSandBox  8m32s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "218bc9a668bace0b6b1e90c28beb174b02b75ff77b1325fcdce8e9853082f176": cni plugin not initialized
  Warning  FailedCreatePodSandBox  8m17s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "74d2aa32f84fabbc7abce5043d7f805cfcca2a573ae1ee83c78db992ba313db2": cni plugin not initialized
  Warning  FailedCreatePodSandBox  8m6s                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "30bc010bab106d7b9bbed4c957ef85c75633549e5225062a5cf18fade7dae9bd": cni plugin not initialized
  Warning  FailedCreatePodSandBox  7m54s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5308035b1e4d51bf2e3ae4ef95551a8c9ea02cd36d5faed8a9a384dd6678822b": CreateFile C:\CalicoWindows\libs\calico\..\..\nodename: The system cannot find the file specified.: check that the calico/node container is running and has mounted /var/lib/calico/
  Warning  FailedCreatePodSandBox  7m43s                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "1922e961028cf76189eae85497091599ad2270cd723ea0c94f189dce44b845f3": CreateFile C:\CalicoWindows\libs\calico\..\..\nodename: The system cannot find the file specified.: check that the calico/node container is running and has mounted /var/lib/calico/
  Normal   Pulling                 7m9s (x2 over 7m25s)   kubelet            Pulling image "kpng:test"
  Warning  Failed                  7m9s (x2 over 7m23s)   kubelet            Error: ErrImagePull
  Normal   BackOff                 6m57s (x4 over 7m23s)  kubelet            Back-off pulling image "kpng:test"
  Normal   BackOff                 4m6s (x12 over 7m23s)  kubelet            Back-off pulling image "kpng:test"
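
The event list above mixes two separate problems (the CNI plugin not being ready, then the image pull failing). Tallying the rows by type and reason can make that easier to see. This is a rough sketch with a hypothetical helper name, `count_event_reasons`; it counts rows in the table and ignores the "(x12 over ...)" repeat counts.

```shell
# Hypothetical helper: count event rows per "Type Reason" pair.
# Reads `kubectl describe pod` output on stdin.
count_event_reasons() {
  awk '$1 == "Normal" || $1 == "Warning" { n[$1 " " $2]++ }
       END { for (k in n) print n[k], k }' |
    LC_ALL=C sort -k2                  # stable order: by type, then reason
}

# Usage:
#   kubectl describe pod -n kube-system kpng-7hrj8 | count_event_reasons
```
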
dougsland commented 2 years ago

/cc @knabben

jayunit100 commented 2 years ago

Maybe a 3 line readme update would suffice if you want to make one:

 # Run a local kpng server process on the Linux node:
 wget https://storage.googleapis.com/jayunit100/kpng-2-21
 chmod +x kpng-2-21   # the downloaded file is not executable by default
 ./kpng-2-21 kube --kubeconfig=/home/vagrant/.kube/config to-api

 # Run a Windows kpng backend on the Windows node:
 wget https://storage.googleapis.com/jayunit100/kpng-windows-2-21.exe
 ./kpng-windows-2-21.exe local --api=tcp://10.20.30.11:12090 to-winkernel

dougsland commented 2 years ago

Maybe a 3 line readme update would suffice if you want to make one:

 # compile kpng for windows or 
 # wget https://storage.googleapis.com/jayunit100/kpng-windows-2-21.exe onto your windows node... 

 # Run a local kpng server process, i.e. 
 a.exe kube --kubeconfig=C:\etc\kubernetes\kubelet.conf to-api to-winkernel

 # in another terminal, do the same: 
 a.exe kube --kubeconfig=C:\etc\kubernetes\kubelet.conf to-api --listen=unix:///k8s/proxy.sock

Thanks @jayunit100, I'm going to update this PR now that we have the binary available.

jayunit100 commented 2 years ago

I see several bugs in this when I run kernelspace. Does it work for you?

 # Run a local kpng server process on the Linux node:
 wget https://storage.googleapis.com/jayunit100/kpng-2-21
 chmod +x kpng-2-21   # the downloaded file is not executable by default
 ./kpng-2-21 kube --kubeconfig=/home/vagrant/.kube/config to-api

 # Run a Windows kpng backend on the Windows node:
 wget https://storage.googleapis.com/jayunit100/kpng-windows-2-21.exe
 ./kpng-windows-2-21.exe local --api=tcp://10.20.30.11:12090 to-winkernel

dougsland commented 2 years ago

On top of that, I would also set these variables on the Windows side:

$env:KUBECONFIG="C:/etc/kubernetes/kubelet.conf"
$env:KUBE_NETWORK = "Calico"

It still crashes, because the source-vip flag is not set:

51708 sink.go:158] source-vip flag not set 
goroutine 1 [running]: 
k8s.io/klog/v2.stacks(0x1)
        k8s.io/klog/v2@v2.30.0/klog.go:1038 +0x8a
k8s.io/klog/v2.(*loggingT).output(0x28522a0, 0x3, 0x0, 0xc0000c6070, 0x1, {0x1ed8c2a, 0x10}, 0xc0001c2000, 0x0)
        k8s.io/klog/v2@v2.30.0/klog.go:987 +0x5fd
k8s.io/klog/v2.(*loggingT).printDepth(0x0, 0x0, 0x0, {0x0, 0x0}, 0x0, {0xc00008a6c0, 0x1, 0x1})
        k8s.io/klog/v2@v2.30.0/klog.go:735 +0x1ae
k8s.io/klog/v2.(*loggingT).print(...)

See also:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kube-proxy/app/init_windows.go#L41
https://github.com/kubernetes/kubernetes/issues/78338

jayunit100 commented 2 years ago

Hey! OK, thanks for testing it, Doug! Can you try to get these bugs fixed this week?

(I think it might take several weeks. I can look too, but I'm happy to give you time to look at them if you're on it.)

dougsland commented 2 years ago

Working on it.

dougsland commented 2 years ago

Another thing: Get-HNSNetwork was not returning any information.

PS C:\Users\vagrant> Get-HNSNetwork
PS C:\Users\vagrant>

After I ran these commands, it started working just fine:

PS> Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
PS> Install-Package -Name Docker -ProviderName DockerMsftProvider
PS> Restart-Computer -Force

After reboot:

PS C:\Users\vagrant> Get-HNSNetwork                                   

ActivityId             : A49F5229-3EEF-4F30-AE65-F7FC37EA9D14 
AdditionalParams       : 
CurrentEndpointCount   : 4
DNSServerCompartment   : 4
DrMacAddress           : 00-15-5D-C2-00-BF
Extensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering Platform}, 
                         @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=True; Name=Microsoft Azure VFP Switch Extension}, 
                         @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; Name=Microsoft NDIS Capture}}
Flags                  : 0
Health                 : @{LastErrorCode=0; LastUpdateTime=132904642884322490}
ID                     : 7A6FDEE8-46B9-43B2-AC28-C70FD920FFFB
IPv6                   : False
LayeredOn              : 5204E456-2E38-488C-859A-17273A84E3B9
MacPools               : {@{EndMacAddress=00-15-5D-BF-0F-FF; StartMacAddress=00-15-5D-BF-00-00}}
ManagementIP           : 10.20.30.11
MaxConcurrentEndpoints : 4
Name                   : Calico
Policies               : {@{Type=HostRoute}, @{DestinationPrefix=100.244.49.64/26; DistributedRouterMacAddress=66-a5-cc-c1-86-97; IsolationId=4096;  
                         ProviderAddress=10.20.30.10; Type=RemoteSubnetRoute}}
Resources              : @{AdditionalParams=; AllocationOrder=1; Allocators=System.Object[]; Health=; ID=A49F5229-3EEF-4F30-AE65-F7FC37EA9D14;       
                         PortOperationTime=0; State=1; SwitchOperationTime=0; VfpOperationTime=0; parentId=7B8A9743-3F3F-4B49-A3E8-5F0F2A3C716F}     
State                  : 1
Subnets                : {@{AdditionalParams=; AddressPrefix=100.244.206.64/26; GatewayAddress=100.244.206.65; Health=;  
                         ID=C79807A7-254A-46BA-BB44-63142BB70DA2; ObjectType=5; Policies=System.Object[]; State=0}}      
TotalEndpoints         : 4
Type                   : Overlay
Version                : 38654705669

ActivityId             : 0577F805-2244-4C26-9929-982A47AC6DD6
AdditionalParams       : 
CurrentEndpointCount   : 0
Extensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering Platform}, 
                         @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=False; Name=Microsoft Azure VFP Switch Extension}, 
                         @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; Name=Microsoft NDIS Capture}}
Flags                  : 0
Health                 : @{AddressNotificationMissedCount=0; AddressNotificationSequenceNumber=0; InterfaceNotificationMissedCount=0; 
                         InterfaceNotificationSequenceNumber=0; LastErrorCode=0; LastUpdateTime=132904642622773853; RouteNotificationMissedCount=0;  
                         RouteNotificationSequenceNumber=0}
ID                     : 9FB4AAFB-9601-40B1-B719-522FCEA5ED1D
IPv6                   : False
LayeredOn              : 792ABBD4-27DA-487A-8DF9-4065BF1F7A61
MacPools               : {@{EndMacAddress=00-15-5D-94-FF-FF; StartMacAddress=00-15-5D-94-F0-00}}
MaxConcurrentEndpoints : 0
Name                   : nat
NatName                : ICS5C13BF44-2EDD-4358-8140-5704AB53A264
Policies               : {}
Resources              : @{AdditionalParams=; AllocationOrder=2; Allocators=System.Object[]; Health=; ID=0577F805-2244-4C26-9929-982A47AC6DD6;   
                         PortOperationTime=0; State=1; SwitchOperationTime=0; VfpOperationTime=0; parentId=25DEBFF4-49D2-4708-AEE4-8913D644F7B6} 
State                  : 1
Subnets                : {@{AdditionalParams=; AddressPrefix=172.22.240.0/20; GatewayAddress=172.22.240.1; Health=; ID=5AD1BC2D-2C6B-42C9-A379-A37E8CB2704D;  
                         Policies=System.Object[]; State=0}}
TotalEndpoints         : 0
Type                   : nat
Version                : 38654705669

ActivityId             : 0366923C-A137-4259-A893-2CE19AE3858D
AdditionalParams       : 
CurrentEndpointCount   : 0
DNSServerCompartment   : 3
DrMacAddress           : 00-15-5D-C2-00-BF
Extensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering Platform},  
                         @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=True; Name=Microsoft Azure VFP Switch Extension},    
                         @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; Name=Microsoft NDIS Capture}}
Flags                  : 0
Health                 : @{LastErrorCode=0; LastUpdateTime=132904642672330036}
ID                     : 76B7183B-7B18-4620-9A71-F239ADEEB965
IPv6                   : False
LayeredOn              : 5204E456-2E38-488C-859A-17273A84E3B9
MacPools               : {@{EndMacAddress=00-15-5D-FD-BF-FF; StartMacAddress=00-15-5D-FD-B0-00}}
ManagementIP           : 10.20.30.11
MaxConcurrentEndpoints : 0
Name                   : External
NetworkAdapterName     : Ethernet
Policies               : {}
Resources              : @{AdditionalParams=; AllocationOrder=1; Allocators=System.Object[]; Health=; ID=0366923C-A137-4259-A893-2CE19AE3858D;
                         PortOperationTime=0; State=1; SwitchOperationTime=0; VfpOperationTime=0; parentId=7B8A9743-3F3F-4B49-A3E8-5F0F2A3C716F}
State                  : 1
Subnets                : {@{AdditionalParams=; AddressPrefix=192.168.255.0/30; GatewayAddress=192.168.255.1; Health=;
                         ID=9815DFE8-AF9A-4FE8-A278-98897F061C8A; ObjectType=5; Policies=System.Object[]; State=0}}
TotalEndpoints         : 0
Type                   : Overlay
Version                : 38654705669

See: https://github.com/Microsoft/SDN/issues/198

jayunit100 commented 2 years ago

Didn't Calico already install HNS for you?

jayunit100 commented 2 years ago

... I think
