colemickens / azure-kubernetes-status

Status of Kubernetes on Azure (DEPRECATED! See https://github.com/Azure/ACS)

Question: how to upgrade a cluster in Azure ACS? #15

Open AlexGrs opened 7 years ago

AlexGrs commented 7 years ago

Now that the recommended way of deploying a cluster is by using ACS, is there a recommended way to upgrade an existing Kubernetes cluster?

Right now, all the clusters I deployed are on 1.4.6, but I would like to benefit from the great work you did with 1.5 on Azure.

colemickens commented 7 years ago

Right now, you would need to perform the upgrade manually. In the coming months we will be building managed updates in the ACS service.

If you wanted to upgrade manually, you'd want to SSH to each node, edit /etc/systemd/system/kubelet.service, and change the referenced hyperkube image version (and possibly upgrade kubectl, since it's installed on the nodes, or at least the master node). Then just reboot the node, possibly draining it first if you want to be diligent, etc.
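A minimal sketch of that per-node sequence (the node name, SSH user, and image tags below are placeholders, and sed is just one way to make the edit):

$ kubectl drain k8s-agentpool1-12345678-0        # optional but diligent: move pods off the node
$ ssh azureuser@k8s-agentpool1-12345678-0
$ sudo sed -i 's@hyperkube-amd64:v1.4.6@hyperkube-amd64:v1.5.1@g' /etc/systemd/system/kubelet.service
$ sudo reboot
$ kubectl uncordon k8s-agentpool1-12345678-0     # once the node is back, let it take pods again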

AlexGrs commented 7 years ago

Ok great. I will try this. Do you know if there are any plans to add 1.5 to ACS anytime soon to replace 1.4.6?

colemickens commented 7 years ago

We're in progress on it, but are holding off until after the New Year due to deployment "no-fly zones" for the holidays.

AlexGrs commented 7 years ago

Great. I made the update on my master node to 1.5.1 and it went flawlessly. I will automate it with a Fabric script for the time being. Thanks for the support!

AlexGrs commented 7 years ago

Hm. After updating /etc/systemd/system/kubelet.service to use 1.5.1 and rebooting the node, the server is still on 1.4.6:

kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.6", GitCommit:"e569a27d02001e343cb68086bc06d47804f62af6", GitTreeState:"clean", BuildDate:"2016-11-12T05:16:27Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
colemickens commented 7 years ago

@AlexGrs Oh, my mistake, you basically just upgraded kubelet on the master, but not the static pod manifests that kubelet runs (which include the apiserver).

You'll need to make the same 1.4.6 -> 1.5.1 replacement inside the files in /etc/kubernetes/manifests/. You'll likely want to check the static pod manifests in /etc/kubernetes/addons/ as well; kube-proxy, for example, ought to be bumped too.

kim0 commented 7 years ago

@AlexGrs .. I just went through a similar upgrade. After the systemd kubelet change, grep -R v1.4 /etc/kubernetes usually helps to see where changes are needed. Then sed is your friend, à la sed -i -e "s@v1.4.6@v1.5.1@g". Nodes should be drained first; masters should just be rebooted. Good luck!
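Spelled out, that sweep looks roughly like this on each node (verify the grep output before running the sed; the version strings are the ones from this thread):

$ grep -R v1.4 /etc/kubernetes                                        # find every file still pinned to 1.4.x
$ sudo grep -rl v1.4.6 /etc/kubernetes | xargs sudo sed -i -e "s@v1.4.6@v1.5.1@g"
$ # then drain + reboot agents, or simply reboot the master, as described above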

phimar commented 7 years ago

Just made my cluster work by using https://github.com/Azure/acs-engine and hacking version 1.5.1 into the generated template just before deployment. Finally, persistent volumes are working flawlessly.

@colemickens Any updates on Kubernetes 1.5.1 as the default version for deployments via Azure Container Service?

otaviosoares commented 7 years ago

@phimar I did the same.

> Finally, persistent volumes are working flawlessly.

Really? I haven't tested in 1.5.1 yet. I'm going to take a look again.

Cheers.

phimar commented 7 years ago

@otaviosoares Yep, it's working. It is as simple as configuring a storage class for azureDisk and a persistent volume claim for your deployment. The VHDs are created and mounted to the correct agent automatically.
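A minimal sketch of the two objects described above, using the 1.5-era azure-disk provisioner (the class name, SKU, location, and claim size are placeholders to adapt to your cluster):

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: azure-standard
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Standard_LRS        # or Premium_LRS for premium storage
  location: westeurope         # assumption: should match the cluster's region
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: data-claim
  annotations:
    volume.beta.kubernetes.io/storage-class: azure-standard
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

No diskURI is specified anywhere; the VHD is created and attached for you once the claim is bound.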

theobolo commented 7 years ago

@phimar Yep, but there's still that problem with disks over 30 GB.

colemickens commented 7 years ago

@phimar We're working on getting it rolled out. No firm ETA beyond end of January, but it should be sooner.

@theobolo what problem?

theobolo commented 7 years ago

@colemickens mkfs.ext4 takes hours when kubelet tries to format a new PersistentVolume.

https://github.com/kubernetes/kubernetes/pull/38865 https://github.com/kubernetes/kubernetes/issues/30752

If I want to mount a 500 GB PersistentVolume, it takes something like 1 hour.

AlexGrs commented 7 years ago

I'm planning to move our application to Kubernetes on Azure. Regarding the issue you mentioned @theobolo: does it mean that if I add a 128 GB persistent disk, my deployment will take forever to finish? At least the first time? Do you have a workaround for this?

colemickens commented 7 years ago

Preformat the disk and you'll avoid that issue. There are patches in flight to tweak the flags to mkfs to try to avoid the issue entirely.
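A rough sketch of that pre-format step, assuming the new VHD is attached to a utility VM and shows up as /dev/sdc (the device name is an assumption; check lsblk before formatting anything):

$ lsblk                        # identify the newly attached, empty data disk
$ sudo mkfs.ext4 /dev/sdc      # format it once, outside of kubelet

After detaching the disk, it can be referenced directly via an azureDisk volume (see the diskURI example further down in this thread).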

Also, if you just use the dynamic disk provisioning feature, you won't hit this issue, which is much easier to do than manually creating the VHD anyway.

theobolo commented 7 years ago

@colemickens I'm using the dynamic disk provisioning feature with a PVC and an "Azure" StorageClass backed by Premium Storage; I tried with a classic Pod and with a StatefulSet using PVC templates.

But still, if I claim a 500 GB volume it takes more than 1 hour (I'm currently validating this by deploying a Mongo StatefulSet with a 500 GB volume per instance using my Azure StorageClass).

It doesn't seem that using the k8s dynamic provisioning solves that issue, since kubelet still tries to format any newly provisioned empty disk; that's why I'm saying that.

kube-controller logs when mounting the first instance with the disk:

2017-01-09T10:48:20.820294292Z I0109 10:48:20.819950       1 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/azure-disk/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd" to node "k8s-agentpool1-35197013-3"
2017-01-09T10:48:20.938835753Z I0109 10:48:20.938470       1 operation_executor.go:620] AttachVolume.Attach succeeded for volume "kubernetes.io/azure-disk/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd" (spec.Name: "pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18") from node "k8s-agentpool1-35197013-3".
2017-01-09T10:48:21.030661411Z I0109 10:48:21.030141       1 node_status_updater.go:135] Updating status for node "k8s-agentpool1-35197013-3" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"1\",\"name\":\"kubernetes.io/azure-disk/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd\"}]}}" VolumesAttached: [{kubernetes.io/azure-disk/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd 1}]
2017-01-09T10:48:35.378882802Z I0109 10:48:35.378509       1 pet_set.go:324] Syncing StatefulSet default/mongo with 1 pods
2017-01-09T10:48:35.380575953Z I0109 10:48:35.380299       1 pet_set.go:332] StatefulSet mongo blocked from scaling on pod mongo-0

and the related kubelet logs after 1 hour:

E0109 11:51:47.493407    4592 mount_linux.go:391] Could not determine if disk "" is formatted (exit status 1)
E0109 11:51:47.493677    4592 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/azure-disk/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd\"" failed. No retries permitted until 2017-01-09 11:53:47.49364695 +0000 UTC (durationBeforeRetry 2m0s). Error: MountVolume.MountDevice failed for volume "kubernetes.io/azure-disk/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd" (spec.Name: "pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18") pod "9d10dd78-d658-11e6-a3e7-000d3ab4db18" (UID: "9d10dd78-d658-11e6-a3e7-000d3ab4db18") with: mount failed: exit status 1
Mounting command: mount
Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd ext4 [defaults]
Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/coursier-preprod-dynamic-pvc-9d1069a0-d658-11e6-a3e7-000d3ab4db18.vhd in /etc/fstab 

And the Kubernetes Dashboard waiting for the disk:

(screenshot)

And the PersistentVolumes:

(screenshot)

After 1 hour I'm still waiting for my 1st Mongo Pod... Am I missing something?

colemickens commented 7 years ago

Sorry, @theobolo you're correct, I got wires crossed. I pinged the relevant PR again last night to try to push it along. I'll escalate it in the next two days if it doesn't pick up momentum.

theobolo commented 7 years ago

@colemickens Thanks Cole ;)

AlexGrs commented 7 years ago

When your PR is merged, will it be available in the next Kubernetes release? I'll try to hack around acs-engine to deploy the cluster, so as to stay as up to date as possible with the latest improvements to Kubernetes on Azure.

colemickens commented 7 years ago

It's not my PR, but yes, that's generally how it works. You can always build your own release and deploy it with ACS-Engine. Since ACS-Engine does everything with the hyperkube image, you merely need to build it yourself. My dev cycle is usually:

  1. clone kubernetes
  2. export REGISTRY=docker.io/colemickens
  3. export VERSION=some-version
  4. ./hack/dev-push-hyperkube.sh

And then after it's done, I can use docker.io/colemickens/hyperkube-amd64:some-version as the hyperkubeSpec with the ACS-Engine output to run my custom build.

AlexGrs commented 7 years ago

Hey @colemickens: you mentioned using dynamic disk provisioning.

I tried to find the relevant documentation about it and found only this link.

In this example, the DiskURI is mandatory. But with a dynamic claim, it should be created automatically, right? Or am I missing something?

colemickens commented 7 years ago

There is dynamic disk provisioning just like in GCE or AWS. I think documentation is absent. https://github.com/kubernetes/kubernetes/pull/30091

rootfs commented 7 years ago

Should be available now via https://github.com/kubernetes/kubernetes.github.io/pull/2039

rootfs commented 7 years ago

https://kubernetes.io/docs/user-guide/persistent-volumes/#azure-disk

AlexGrs commented 7 years ago

I seem to hit a timeout issue when trying to mount a persistent volume:

(screenshot)

I checked and a VHD is indeed created in my storage account. I tried to check the logs in the controller, and the mount on the host seems to be working. Here are some logs from kube-controller:

2017-01-12T14:25:54.311786013Z I0112 14:25:54.311577       1 replication_controller.go:322] Observed updated replication controller postgresql. Desired pod count change: 1->1
2017-01-12T14:25:54.341387770Z I0112 14:25:54.341218       1 replication_controller.go:322] Observed updated replication controller postgresql. Desired pod count change: 1->1
2017-01-12T14:25:59.248709444Z I0112 14:25:59.248391       1 operation_executor.go:700] DetachVolume.Detach succeeded for volume "kubernetes.io/azure-disk/kapptivatekuber-dynamic-pvc-e005ea48-d8d1-11e6-a869-000d3a34f8f1.vhd" (spec.Name: "pvc-e005ea48-d8d1-11e6-a869-000d3a34f8f1") from node "k8s-agentpool-17601863-0".
2017-01-12T14:25:59.257773962Z I0112 14:25:59.257342       1 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/azure-disk/kapptivatekuber-dynamic-pvc-e005ea48-d8d1-11e6-a869-000d3a34f8f1.vhd" to node "k8s-agentpool-17601863-0"
2017-01-12T14:27:59.776711601Z I0112 14:27:59.776514       1 operation_executor.go:620] AttachVolume.Attach succeeded for volume "kubernetes.io/azure-disk/kapptivatekuber-dynamic-pvc-e005ea48-d8d1-11e6-a869-000d3a34f8f1.vhd" (spec.Name: "pvc-e005ea48-d8d1-11e6-a869-000d3a34f8f1") from node "k8s-agentpool-17601863-0".
2017-01-12T14:27:59.875333383Z I0112 14:27:59.875169       1 node_status_updater.go:135] Updating status for node "k8s-agentpool-17601863-0" succeeded. patchBytes: "{\"status\":{\"volumesAttached\":[{\"devicePath\":\"0\",\"name\":\"kubernetes.io/azure-disk/kapptivatekuber-dynamic-pvc-e005ea48-d8d1-11e6-a869-000d3a34f8f1.vhd\"}]}}" VolumesAttached: [{kubernetes.io/azure-disk/kapptivatekuber-dynamic-pvc-e005ea48-d8d1-11e6-a869-000d3a34f8f1.vhd 0}]
2017-01-12T14:32:15.637253775Z W0112 14:32:15.637047       1 reflector.go:319] pkg/controller/garbagecollector/garbagecollector.go:760: watch of <nil> ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [143998/143434]) [144997]
2017-01-12T14:33:44.029235618Z I0112 14:33:44.029030       1 replication_controller.go:541] Too few "default"/"postgresql" replicas, need 1, creating 1
2017-01-12T14:33:44.047561542Z I0112 14:33:44.047416       1 event.go:217] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"default", Name:"postgresql", UID:"06a7037c-d8d3-11e6-a869-000d3a34f8f1", APIVersion:"v1", ResourceVersion:"144311", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: postgresql-tj1z6
2017-01-12T14:33:44.081956887Z I0112 14:33:44.081868       1 replication_controller.go:322] Observed updated replication controller postgresql. Desired pod count change: 1->1
2017-01-12T14:33:44.111482326Z I0112 14:33:44.111365       1 replication_controller.go:322] Observed updated replication controller postgresql. Desired pod count change: 1->1
rootfs commented 7 years ago

@AlexGrs do you have kubelet log on that host?

AlexGrs commented 7 years ago

@rootfs: For the host where the pod is located? I will check how to SSH into it to check the logs.

rootfs commented 7 years ago

@AlexGrs yes, from the host where pod lands.

theobolo commented 7 years ago

@AlexGrs What is the size of the PersistentVolume that you wanted to deploy?

Because, as I said, if you try to provision a disk bigger than 30 GB it can take a long time before the Pod mounts it correctly.

Even if the kube-controller says that the volume is mounted, that doesn't mean it's formatted; that's why the error is still there and why your Pod is not available.

For example, if I deploy a 30 GB disk on Premium Storage for a Jenkins Pod, the first time it takes something like 15-20 min because kubelet needs to run mkfs.ext4 on that disk before the Pod starts. That's why you have that error.

Just wait a little bit or try with a smaller disk ;)

AlexGrs commented 7 years ago

I managed to connect to the agent running this pod. During this time (nearly 20 minutes), the pod finally managed to mount the disk. I deleted the RC and created it again, and it worked after 1 or 2 minutes.

The size of the disk is 30Gi. My guess is that it was performing some kind of formatting/verification on the volume the first time and then doesn't need to do it again afterwards. It may be related to @theobolo's issue: the more capacity the disk has, the more time it will take to mount in a pod.

theobolo commented 7 years ago

@AlexGrs That's exactly the point :

AlexGrs commented 7 years ago

Ah! That's why. I saw there is an ongoing PR for lazy verification. Can't wait to have this available as we have some huge databases. Is there any workaround for this while waiting for the PR?

theobolo commented 7 years ago

@AlexGrs the only workaround today is, as @colemickens said, formatting your disk manually (you can use his guide: https://github.com/colemickens/azure-kubernetes-demo).

I did that to format two 500 GB disks used by Jenkins and Nexus Repo; when the disks are preformatted, kubelet won't try to format them and will mount them in 2-3 min maximum.

That's the only way to use big persistent disks on Azure (and to reach P30 performance on Premium Storage, since disk performance is indexed on disk size in Azure).

Last thing: to mount your VHD once it is formatted, you can use this:

volumes:
- azureDisk:
    diskURI: https://diskurlwithvhd
    diskName: data-master
  name: some-disk
codablock commented 7 years ago

@AlexGrs A workaround/hack is to use this script: https://gist.github.com/codablock/9b8c3a09b6f725436143da575d23ca45

It is a wrapper script around mkfs.ext4 and removes all lazy init related flags from the mkfs.ext4 call.

to use it:

$ mv /usr/sbin/mkfs.ext4 /usr/sbin/mkfs.ext4.original
$ wget -O /usr/sbin/mkfs.ext4 https://gist.githubusercontent.com/codablock/9b8c3a09b6f725436143da575d23ca45/raw/ed6e604ec71c2230e889b625b85d2986d0e6eb18/mkfs.ext4%2520lazy%2520init%2520hack
$ chmod +x /usr/sbin/mkfs.ext4

I deploy this with Ansible (kargo) right now. It assumes that the host's mkfs.ext4 is used. I'm not sure how acs-engine deploys kubelet, but I'd expect it to be deployed as a regular service and not as a containerized kubelet. If this is not the case, the script would have to be put into the hyperkube image (making things complicated).

theobolo commented 7 years ago

@codablock unfortunately kubelet is deployed using the hyperkube image in acs-engine.

(screenshot)

AlexGrs commented 7 years ago

@theobolo: I think you did not finish your answer ;)

@codablock: kubelet is running in a container with acs-engine

codablock commented 7 years ago

Is kubelet run with nsenter on acs-engine?

codablock commented 7 years ago

I just saw the screenshot from theobolo and it looks like it is not run with nsenter. This would mean that you'd somehow have to modify the hyperkube image to make the wrapper work. If acs-engine supports specifying a custom hyperkube image, that could be done by extending from the original image, installing the script in it, pushing it to Docker Hub, and then using the custom/modified image.
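A rough sketch of that image build, assuming the wrapper script has been downloaded locally as mkfs.ext4 and that the image keeps mkfs.ext4 at the same path as in the host commands above (the base tag and registry name are placeholders):

$ cat > Dockerfile <<'EOF'
FROM gcr.io/google_containers/hyperkube-amd64:v1.5.2
# keep the original binary around and put the wrapper in front of it
RUN mv /usr/sbin/mkfs.ext4 /usr/sbin/mkfs.ext4.original
COPY mkfs.ext4 /usr/sbin/mkfs.ext4
RUN chmod +x /usr/sbin/mkfs.ext4
EOF
$ docker build -t docker.io/yourname/hyperkube-amd64:v1.5.2-mkfs-hack .
$ docker push docker.io/yourname/hyperkube-amd64:v1.5.2-mkfs-hack

The resulting image would then be referenced as the hyperkubeSpec in the acs-engine output, as described earlier in the thread.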

theobolo commented 7 years ago

@codablock It's possible to use a custom hyperkube image; it seems heavy, but it should work...

AlexGrs commented 7 years ago

It seems it's the approach @colemickens described in his previous post.

codablock commented 7 years ago

@AlexGrs What he describes is a complete rebuild of Kubernetes. This would only be required if you would like to make the changes directly in the k8s source tree, or if you would like to build and use the current master.

EDIT: If you want to do this, then it's better to create a branch based on 1.5.2 and merge in https://github.com/kubernetes/kubernetes/pull/38865 instead of using this hack.

AlexGrs commented 7 years ago

@codablock: I will maybe try this while waiting for your PR to be merged into the master branch.

I tried with a Premium_LRS volume now instead of a Standard_LRS one, but even after 30 minutes for a 30Gi disk, it fails to mount.

I0112 18:02:42.875630    4473 operation_executor.go:832] MountVolume.WaitForAttach succeeded for volume "kubernetes.io/azure-disk/xxx-dynamic-pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1.vhd" (spec.Name: "pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1") pod "2ed411fa-d8e9-11e6-a869-000d3a34f8f1" (UID: "2ed411fa-d8e9-11e6-a869-000d3a34f8f1").
E0112 18:02:42.880996    4473 mount_linux.go:119] Mount failed: exit status 1
Mounting command: mount
Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/xxx-dynamic-pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1.vhd ext4 [defaults]
Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/xxx-dynamic-pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1.vhd in /etc/fstab
E0112 18:02:42.883785    4473 mount_linux.go:391] Could not determine if disk "" is formatted (exit status 1)
E0112 18:02:42.884975    4473 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/azure-disk/xxx-dynamic-pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1.vhd\"" failed. No retries permitted until 2017-01-12 18:04:42.884230675 +0000 UTC (durationBeforeRetry 2m0s). Error: MountVolume.MountDevice failed for volume "kubernetes.io/azure-disk/xxx-dynamic-pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1.vhd" (spec.Name: "pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1") pod "2ed411fa-d8e9-11e6-a869-000d3a34f8f1" (UID: "2ed411fa-d8e9-11e6-a869-000d3a34f8f1") with: mount failed: exit status 1
Mounting command: mount
Mounting arguments:  /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/xxx-dynamic-pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1.vhd ext4 [defaults]
Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/xxx-dynamic-pvc-4e49581b-d8e5-11e6-a869-000d3a34f8f1.vhd in /etc/fstab
codablock commented 7 years ago

@AlexGrs The PR is merged now. I would however wait for https://github.com/kubernetes/kubernetes/pull/40066 to be merged as well.

kaskavalci commented 7 years ago

I created a Kubernetes cluster using the Azure portal and it installed 1.5.3 by default. There is no way to upgrade the cluster easily. In order to have up-to-date Kubernetes clusters, we need this feature.

mtbbiker commented 6 years ago

Upgrading in Azure from 1.5.3 to 1.5.7 is simple. Has anybody here successfully upgraded from 1.5.X to 1.6.X in Azure? I have a problem with the master not starting kubelet.service correctly if I try this approach. It seems that the --config=/etc/kubernetes/manifests parameter in the startup was removed:

[Unit]
Description=Kubelet
Requires=docker.service
After=docker.service

[Service]
Restart=always
ExecStartPre=/bin/mkdir -p /var/lib/kubelet
# Azure does not support two LoadBalancers(LB) sharing the same nic and backend port.
# As a workaround, the Internal LB(ILB) listens for apiserver traffic on port 4443 and the External LB(ELB) on port 443
# This IPTable rule then redirects ILB traffic to port 443 in the prerouting chain
ExecStartPre=/bin/bash -c "iptables -t nat -A PREROUTING -p tcp --dport 4443 -j REDIRECT --to-port 443"
ExecStartPre=/bin/sed -i "s|<kubernetesHyperkubeSpec>|gcr.io/google_containers/hyperkube-amd64:v1.6.2|g" "/etc/kubernetes/addons/kube-proxy-daemonset.yaml"
ExecStartPre=/bin/mount --bind /var/lib/kubelet /var/lib/kubelet
ExecStartPre=/bin/mount --make-shared /var/lib/kubelet
ExecStart=/usr/bin/docker run \
  --name=kubelet \
  --net=host \
  --pid=host \
  --privileged \
  --volume=/dev:/dev \
  --volume=/sys:/sys:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/var/lib/docker/:/var/lib/docker:rw \
  --volume=/var/lib/kubelet/:/var/lib/kubelet:shared \
  --volume=/var/log:/var/log:rw \
  --volume=/etc/kubernetes/:/etc/kubernetes:ro \
  --volume=/srv/kubernetes/:/srv/kubernetes:ro \
    gcr.io/google_containers/hyperkube-amd64:v1.5.7 \
      /hyperkube kubelet \
        --api-servers="https://10.240.255.5:443" \
        --kubeconfig=/var/lib/kubelet/kubeconfig \
        --address=0.0.0.0 \
        --allow-privileged=true \
        --enable-server \
        --enable-debugging-handlers \
        --config=/etc/kubernetes/manifests \
        --cluster-dns=10.0.0.10 \
        --cluster-domain=cluster.local \
        --register-schedulable=false \
        --cloud-provider=azure \
        --cloud-config=/etc/kubernetes/azure.json \
        --hairpin-mode=promiscuous-bridge \
        --network-plugin=kubenet \
        --azure-container-registry-config=/etc/kubernetes/azure.json \
        --v=2
ExecStop=/usr/bin/docker stop -t 10 kubelet
ExecStopPost=/usr/bin/docker rm -f kubelet

[Install]
WantedBy=multi-user.target

As the kubelet is started as a Docker container, if I remove the parameter the scheduler and apiserver don't start.
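For what it's worth, in 1.6 the static-pod manifest directory is passed to the kubelet via --pod-manifest-path rather than --config, so (assuming nothing else in the unit needs to change) the equivalent line in the ExecStart above would be:

        --pod-manifest-path=/etc/kubernetes/manifests \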