Azure / acs-engine

WE HAVE MOVED: Please join us at Azure/aks-engine!
https://github.com/Azure/aks-engine
MIT License

Support Windows Server 2019 and make it default #4299

Closed PatrickLang closed 5 years ago

PatrickLang commented 5 years ago

What this PR does / why we need it:

Windows Server 2019 was released on Azure a few weeks ago. This PR adds support for Windows Server 2019 and makes it the default.

This also updates the tests to work on 1803 and 1809/2019.

codecov[bot] commented 5 years ago

Codecov Report

Merging #4299 into master will increase coverage by 0.02%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4299      +/-   ##
==========================================
+ Coverage   55.43%   55.45%   +0.02%     
==========================================
  Files         109      109              
  Lines       16050    16053       +3     
==========================================
+ Hits         8897     8902       +5     
+ Misses       6369     6368       -1     
+ Partials      784      783       -1

PatrickLang commented 5 years ago

Well, I didn't break Linux :)

PatrickLang commented 5 years ago

The clusters seem to deploy ok.

$ kubectl get nodes -o json
2018/11/28 00:51:50 NAME                        STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                    KERNEL-VERSION      CONTAINER-RUNTIME
4510k8s010                  Ready     agent     57s       v1.12.2   <none>        Windows Server Datacenter   10.0.17763.107      docker://18.9.0
4510k8s011                  Ready     agent     1m        v1.12.2   <none>        Windows Server Datacenter   10.0.17763.107      docker://18.9.0
k8s-linuxpool1-45102031-0   Ready     agent     4m        v1.12.2   <none>        Ubuntu 16.04.5 LTS          4.15.0-1030-azure   docker://3.0.1
k8s-master-45102031-0       Ready     master    4m        v1.12.2   <none>        Ubuntu 16.04.5 LTS          4.15.0-1030-azure   docker://3.0.1

Adding better error logging so I can fix the tests I broke.

PatrickLang commented 5 years ago

Pods start, services work

patrick@planglx1:~/win19$ kubectl get pod -o wide -w
NAME                        READY   STATUS              RESTARTS   AGE     IP            NODE         NOMINATED NODE
iis-2019-5d6f6569d7-7g4sg   0/1     ContainerCreating   0          81s     <none>        3801k8s001   <none>
iis-2019-5d6f6569d7-b4z9c   0/1     ContainerCreating   0          5s      <none>        3801k8s000   <none>
iis-2019-5d6f6569d7-7g4sg   1/1     Running             0          4m57s   10.240.0.41   3801k8s001   <none>
iis-2019-5d6f6569d7-b4z9c   1/1     Running             0          4m51s   10.240.0.19   3801k8s000   <none>

patrick@planglx1:~/win19$ kubectl exec -t iis-2019-5d6f6569d7-7g4sg curl http://10.240.0.19
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>IIS Windows Server</title>
...

patrick@planglx1:~/win19$ kubectl exec -t iis-2019-5d6f6569d7-b4z9c curl http://10.240.0.41
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>IIS Windows Server</title>
...

patrick@planglx1:~/win19$ kubectl get svc
NAME         TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)        AGE
iis          LoadBalancer   10.0.205.55   13.66.132.197   80:30453/TCP   11m
kubernetes   ClusterIP      10.0.0.1      <none>          443/TCP        59m
^C
patrick@planglx1:~/win19$ curl http://13.66.132.197
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>IIS Windows Server</title>
...

jsturtevant commented 5 years ago

I believe the Windows e2e tests use this file for cluster creation: https://github.com/Azure/acs-engine/blob/master/examples/e2e-tests/kubernetes/windows/hybrid/definition.json

That would explain why the tests are failing: the file doesn't specify a SKU. We'll want to make sure a default SKU is applied when the cluster definition JSON doesn't provide one.

PatrickLang commented 5 years ago

Thanks @jsturtevant - will look into that. It looks like the SKU is currently defined in the ARM template, where it affects the deployment, but the acs-engine tests don't know about it.
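
A minimal sketch of the defaulting idea discussed above, assuming a trimmed-down profile type. The WindowsProfile struct, the setWindowsDefaults function and the SKU strings are hypothetical stand-ins for illustration, not the actual acs-engine types or values:

```go
package main

import "fmt"

// WindowsProfile is a hypothetical, trimmed-down stand-in for the Windows
// section of a cluster definition. It is not the real acs-engine type.
type WindowsProfile struct {
	Sku string
}

// defaultWindowsSku is an assumed default, used here only for illustration.
const defaultWindowsSku = "Datacenter-Core-1809-with-Containers-smalldisk"

// setWindowsDefaults fills in the SKU only when the definition left it empty,
// so an explicit choice in the cluster definition is never overwritten.
func setWindowsDefaults(p *WindowsProfile) {
	if p == nil {
		return
	}
	if p.Sku == "" {
		p.Sku = defaultWindowsSku
	}
}

func main() {
	unset := &WindowsProfile{}
	pinned := &WindowsProfile{Sku: "Datacenter-Core-1803-with-Containers-smalldisk"}

	setWindowsDefaults(unset)
	setWindowsDefaults(pinned)

	fmt.Println(unset.Sku)  // the assumed default
	fmt.Println(pinned.Sku) // the explicitly pinned value, unchanged
}
```

Applying the default in one place means both the deployment and the tests can read the same resolved SKU instead of one of them guessing.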

PatrickLang commented 5 years ago

Looks like I'm on the right track. I still need to schedule a test pass on Windows Server version 1803 to make sure I didn't break those tests.

------------------------------
Azure Container Cluster using the Kubernetes Orchestrator with a windows agent pool 
  should be able to deploy an iis webserver
  /go/src/github.com/Azure/acs-engine/test/e2e/kubernetes/kubernetes_test.go:972
STEP: Creating a deployment with 1 pod running IIS
2018/11/28 22:12:28 $ kubectl run iis-kubernetes-southcentralus-28507-87241 -n default --image mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019 --port 80 --hostport -1 --overrides { "spec":{"template":{"spec": {"nodeSelector":{"beta.kubernetes.io/os":"windows"}}}}}
2018/11/28 22:12:29 #### $ kubectl run iis-kubernetes-southcentralus-28507-87241 -n default --image mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019 --port 80 --hostport -1 --overrides { "spec":{"template":{"spec": {"nodeSelector":{"beta.kubernetes.io/os":"windows"}}}}} completed in 340.204626ms

The --image mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019 argument shows that the test figured out the right OS version.
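
One way such a selection could work, sketched under assumptions: map the build number from the node's KERNEL-VERSION (10.0.17763 is Windows Server 2019 / 1809, 10.0.17134 is version 1803) to an image. The iisImageForKernel function is hypothetical and is not how the acs-engine tests necessarily do it; the ltsc2019 image comes from the log above, while the 1803 tag is an assumed placeholder:

```go
package main

import (
	"fmt"
	"strings"
)

// iisImageForKernel maps a Windows node's kernel version, as shown in the
// KERNEL-VERSION column (for example "10.0.17763.107"), to an IIS image.
// The third dotted field is the Windows build number.
func iisImageForKernel(kernelVersion string) (string, error) {
	parts := strings.Split(kernelVersion, ".")
	if len(parts) < 3 {
		return "", fmt.Errorf("unexpected kernel version %q", kernelVersion)
	}
	switch parts[2] {
	case "17763": // Windows Server 2019 / version 1809
		return "mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019", nil
	case "17134": // Windows Server version 1803; the tag below is an assumption
		return "mcr.microsoft.com/windows/servercore/iis:windowsservercore-1803", nil
	default:
		return "", fmt.Errorf("no IIS image known for build %s", parts[2])
	}
}

func main() {
	img, err := iisImageForKernel("10.0.17763.107")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(img)
}
```

Returning an error for unknown builds, rather than falling back to some image, makes a version mismatch fail early instead of leaving a pod stuck pulling or starting an incompatible container.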

jackfrancis commented 5 years ago

/lgtm

PatrickLang commented 5 years ago

This is a problem - looking into it

$ kubectl apply -f workloads/iis-azurefile.yaml
.
STEP: Checking that the pod can access volume

$ kubectl exec iis-azurefile -n default -- powershell mkdir -force mnt\azure\testdirectory
2018/11/28 22:46:53 Error trying to run 'kubectl exec':error: unable to upgrade connection: container not found ("iis-azurefile")

It fails with

• Failure [573.207 seconds]
Azure Container Cluster using the Kubernetes Orchestrator
/go/src/github.com/Azure/acs-engine/test/e2e/kubernetes/kubernetes_test.go:79
  with a windows agent pool
  /go/src/github.com/Azure/acs-engine/test/e2e/kubernetes/kubernetes_test.go:971
    should be able to attach azure file [It]
    /go/src/github.com/Azure/acs-engine/test/e2e/kubernetes/kubernetes_test.go:1245

    Expected
        <bool>: false
    to be true

    /go/src/github.com/Azure/acs-engine/test/e2e/kubernetes/kubernetes_test.go:1277

PatrickLang commented 5 years ago

🤦‍♂️ // BUG: this should support OS versioning

PatrickLang commented 5 years ago

Alright, think I found the last reference to Windows OS version
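
As an illustration of keeping the OS version out of workload files, here is a sketch that renders a pod manifest with the image passed in as a parameter. The template, the podParams type and renderPod are hypothetical; the real iis-azurefile workload also mounts a volume, which is omitted here, and this is not necessarily how the PR fixed it:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// podManifest is a hypothetical, minimal manifest. The image is a template
// parameter so no Windows OS version is hard-coded in the workload itself.
const podManifest = `apiVersion: v1
kind: Pod
metadata:
  name: {{.Name}}
spec:
  containers:
  - name: {{.Name}}
    image: {{.Image}}
  nodeSelector:
    beta.kubernetes.io/os: windows
`

// podParams holds the values substituted into podManifest.
type podParams struct {
	Name  string
	Image string
}

// renderPod fills the manifest template and returns the resulting YAML.
func renderPod(p podParams) (string, error) {
	tmpl, err := template.New("pod").Parse(podManifest)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderPod(podParams{
		Name:  "iis-azurefile",
		Image: "mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

The rendered text could then be written to a temporary file and passed to kubectl apply -f, so the same workload definition serves both 1803 and 2019 node pools.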

ci/circleci: k8s-windows-1.11-release-e2e — Your tests passed on CircleCI!

PatrickLang commented 5 years ago

/hold Still waiting on results from windows-1803-after-2019pr Jenkins job to make sure 1803 is still passing tests

PatrickLang commented 5 years ago

/remove hold

Tests seem ok on 1803!

Ran 22 of 32 Specs in 1479.677 seconds SUCCESS! -- 22 Passed | 0 Failed | 0 Pending | 10 Skipped

jackfrancis commented 5 years ago

@PatrickLang did I mess up the iis scale tests during rebase? they seem to be taking a long time...

jackfrancis commented 5 years ago

Never mind, these really do take a long time :)

Thanks for seeing this through @PatrickLang!