sorintlab / stolon

PostgreSQL cloud native High Availability and more.
https://talk.stolon.io
Apache License 2.0

`check function error` when `enableServiceLinks` option set in k8s-1.14.3 #678

Open icefed opened 5 years ago

icefed commented 5 years ago

Submission type

Environment

Kubernetes 1.14.3

Stolon version

v0.13.0-pg9.6

Additional environment information if useful to understand the bug

After adding the `enableServiceLinks: false` option to the stolon-* PodSpecs, the Keeper and Proxy print errors and cannot work.

Expected behaviour you didn't see

Unexpected behaviour you saw

Stolon-Proxy error output

2019-06-26T08:00:26.228Z INFO cmd/proxy.go:302 check function error {"error": "failed to update proxyInfo: update failed: Pod \"stolon-proxy-5f5f8cb9cd-82cmr\" is invalid: spec: Forbidden: pod updates may not change fields other than spec.containers[*].image, spec.initContainers[*].image, spec.activeDeadlineSeconds or spec.tolerations (only additions to existing tolerations)\n{\"Volumes\":[{\"Name\":\"default-token-hs6cs\",\"HostPath\":null,\"EmptyDir\":null,\"GCEPersistentDisk\":null,\"AWSElasticBlockStore\":null,\"GitRepo\":null,\"Secret\":{\"SecretName\":\"default-token-hs6cs\",\"Items\":null,\"DefaultMode\":420,\"Optional\":null},\"NFS\":null,\"ISCSI\":null,\"Glusterfs\":null,\"PersistentVolumeClaim\":null,\"RBD\":null,\"Quobyte\":null,\"FlexVolume\":null,\"Cinder\":null,\"CephFS\":null,\"Flocker\":null,\"DownwardAPI\":null,\"FC\":null,\"AzureFile\":null,\"ConfigMap\":null,\"VsphereVolume\":null,\"AzureDisk\":null,\"PhotonPersistentDisk\":null,\"Projected\":null,\"PortworxVolume\":null,\"ScaleIO\":null,\"StorageOS\":null,\"CSI\":null}],\"InitContainers\":null,\"Containers\":[{\"Name\":\"stolon-proxy\",\"Image\":\"sorintlab/stolon:v0.13.0-pg9.6\",\"Command\":[\"/bin/bash\",\"-ec\",\"exec gosu stolon 
stolon-proxy\n\"],\"Args\":null,\"WorkingDir\":\"\",\"Ports\":[{\"Name\":\"\",\"HostPort\":0,\"ContainerPort\":5432,\"Protocol\":\"TCP\",\"HostIP\":\"\"},{\"Name\":\"\",\"HostPort\":0,\"ContainerPort\":8080,\"Protocol\":\"TCP\",\"HostIP\":\"\"}],\"EnvFrom\":null,\"Env\":[{\"Name\":\"POD_NAME\",\"Value\":\"\",\"ValueFrom\":{\"FieldRef\":{\"APIVersion\":\"v1\",\"FieldPath\":\"metadata.name\"},\"ResourceFieldRef\":null,\"ConfigMapKeyRef\":null,\"SecretKeyRef\":null}},{\"Name\":\"STPROXY_CLUSTER_NAME\",\"Value\":\"\",\"ValueFrom\":{\"FieldRef\":{\"APIVersion\":\"v1\",\"FieldPath\":\"metadata.labels['stolon-cluster']\"},\"ResourceFieldRef\":null,\"ConfigMapKeyRef\":null,\"SecretKeyRef\":null}},{\"Name\":\"STPROXY_STORE_BACKEND\",\"Value\":\"kubernetes\",\"ValueFrom\":null},{\"Name\":\"STPROXY_KUBE_RESOURCE_KIND\",\"Value\":\"configmap\",\"ValueFrom\":null},{\"Name\":\"STPROXY_LISTEN_ADDRESS\",\"Value\":\"0.0.0.0\",\"ValueFrom\":null},{\"Name\":\"STPROXY_METRICS_LISTEN_ADDRESS\",\"Value\":\"0.0.0.0:8080\",\"ValueFrom\":null},{\"Name\":\"STPROXY_TCP_KEEPALIVE_IDLE\",\"Value\":\"600\",\"ValueFrom\":null},{\"Name\":\"STPROXY_TCP_KEEPALIVE_COUNT\",\"Value\":\"8\",\"ValueFrom\":null},{\"Name\":\"STPROXY_TCP_KEEPALIVE_INTERVAL\",\"Value\":\"75\",\"ValueFrom\":null}],\"Resources\":{\"Limits\":null,\"Requests\":null},\"VolumeMounts\":[{\"Name\":\"default-token-hs6cs\",\"ReadOnly\":true,\"MountPath\":\"/var/run/secrets/kubernetes.io/serviceaccount\",\"SubPath\":\"\",\"MountPropagation\":null,\"SubPathExpr\":\"\"}],\"VolumeDevices\":null,\"LivenessProbe\":null,\"ReadinessProbe\":{\"Exec\":null,\"HTTPGet\":null,\"TCPSocket\":{\"Port\":5432,\"Host\":\"\"},\"InitialDelaySeconds\":10,\"TimeoutSeconds\":5,\"PeriodSeconds\":10,\"SuccessThreshold\":1,\"FailureThreshold\":3},\"Lifecycle\":null,\"TerminationMessagePath\":\"/dev/termination-log\",\"TerminationMessagePolicy\":\"File\",\"ImagePullPolicy\":\"IfNotPresent\",\"SecurityContext\":null,\"Stdin\":false,\"StdinOnce\":false,\"TTY\":fal
se}],\"RestartPolicy\":\"Always\",\"TerminationGracePeriodSeconds\":30,\"ActiveDeadlineSeconds\":null,\"DNSPolicy\":\"ClusterFirst\",\"NodeSelector\":{\"node-role.kubernetes.io/master\":\"\"},\"ServiceAccountName\":\"default\",\"AutomountServiceAccountToken\":null,\"NodeName\":\"n227\",\"SecurityContext\":{\"HostNetwork\":false,\"HostPID\":false,\"HostIPC\":false,\"ShareProcessNamespace\":null,\"SELinuxOptions\":null,\"RunAsUser\":null,\"RunAsGroup\":null,\"RunAsNonRoot\":null,\"SupplementalGroups\":null,\"FSGroup\":null,\"Sysctls\":null},\"ImagePullSecrets\":null,\"Hostname\":\"\",\"Subdomain\":\"\",\"Affinity\":null,\"SchedulerName\":\"default-scheduler\",\"Tolerations\":[{\"Key\":\"node.kubernetes.io/not-ready\",\"Operator\":\"Exists\",\"Value\":\"\",\"Effect\":\"NoExecute\",\"TolerationSeconds\":300},{\"Key\":\"node.kubernetes.io/unreachable\",\"Operator\":\"Exists\",\"Value\":\"\",\"Effect\":\"NoExecute\",\"TolerationSeconds\":300}],\"HostAliases\":null,\"PriorityClassName\":\"\",\"Priority\":0,\"DNSConfig\":null,\"ReadinessGates\":null,\"RuntimeClassName\":null,\"EnableServiceLinks\":\n\nA: true}\n\nB: false}\n\n"}

Steps to reproduce the problem

harmjanblok commented 5 years ago

Any update on this issue? We seem to have a similar, or possibly the same, problem.

2019-09-20T14:03:36.637Z        ERROR   cmd/sentinel.go:1857    cannot update sentinel info     {"error": "update failed: Pod \"stolon-sentinel-5d58cc4ffc-hr66w\" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)

On Kubernetes 1.12.8 with stolon v0.11 we didn't see this error.

sgotti commented 5 years ago

@icefed I haven't tried with enableServiceLinks set to false. I'm not sure how this is related. Perhaps an issue in the k8s client version we are using.

@harmjanblok Are you also disabling enableServiceLinks?
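To make the client-version hypothesis concrete, here is a minimal sketch of how a read-modify-write update could silently drop a field. It assumes, without confirmation from this thread, that the client's Pod types predate `enableServiceLinks`. The `oldPodSpec` struct and `roundTrip` helper are hypothetical stand-ins written for illustration; they are not stolon or client-go code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldPodSpec is a hypothetical, heavily trimmed stand-in for a PodSpec type
// compiled before the enableServiceLinks field existed.
type oldPodSpec struct {
	NodeName string `json:"nodeName,omitempty"`
}

// roundTrip decodes the JSON stored on the server into the old struct and
// encodes it again, as a full-object update would before sending it back.
// encoding/json ignores unknown fields, so they vanish from the output.
func roundTrip(serverJSON string) (string, error) {
	var spec oldPodSpec
	if err := json.Unmarshal([]byte(serverJSON), &spec); err != nil {
		return "", err
	}
	out, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	stored := `{"nodeName":"n227","enableServiceLinks":false}`
	sent, err := roundTrip(stored)
	if err != nil {
		panic(err)
	}
	fmt.Println("stored:", stored)
	fmt.Println("sent:  ", sent)
}
```

If this is what happens, the server would see an update whose spec no longer carries `enableServiceLinks: false`, apply the default of `true`, and reject the request as a spec change. That would be consistent with the `A: true` / `B: false` difference at the end of the proxy log, but it remains an assumption until verified against the client version in use.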

icefed commented 5 years ago

After removing enableServiceLinks, deploying stolon works fine. Maybe we can update the k8s client version and try again.

Also, enableServiceLinks was added in Kubernetes 1.13: https://github.com/kubernetes/kubernetes/pull/68754
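Besides updating the client, another possible mitigation is to avoid sending the pod spec at all and patch only the metadata that changes. The sketch below builds a JSON merge patch restricted to `metadata.annotations`. It is an illustration under assumptions, not how stolon is implemented: the `annotationPatch` helper and the `example.com/proxy-info` key are hypothetical, and the resulting bytes would still need to be submitted through the client's patch call with the merge-patch type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// annotationPatch builds a JSON merge patch that touches only
// metadata.annotations, so no spec field is part of the request.
// The helper name and the annotation key used below are hypothetical.
func annotationPatch(key, value string) ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"annotations": map[string]string{key: value},
		},
	}
	return json.Marshal(patch)
}

func main() {
	p, err := annotationPatch("example.com/proxy-info", "generation-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(p))
}
```

Because the patch body contains no `spec` key, fields the client does not know about cannot be dropped or reset by the request, whatever the client library version.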

harmjanblok commented 5 years ago

No, I don't specify anything for enableServiceLinks; the pods are assigned the default (enableServiceLinks: true).

harmjanblok commented 5 years ago

In my setup it seems to fail because it wants to set ProcMount: null, while the pod currently uses the default ProcMount: "Default" (ProcMount is inside the pod SecurityContext). I'm still working on a reproducible setup that I can share.

icefed commented 5 years ago

ProcMount was added to SecurityContext in Kubernetes 1.12: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.12.md#new-features

sgotti commented 5 years ago

@harmjanblok If you haven't changed enableServiceLinks to false, your issue is different from the one originally reported. Can you open a new issue with the steps to reproduce it in a simple k8s environment like minikube?

harmjanblok commented 5 years ago

@icefed thanks for the link, I'll investigate further. My issue seems to be caused by a bug introduced with this change: https://github.com/kubernetes/kubernetes/pull/78881 @sgotti yes, it seems to be a different issue. I originally thought both might be related to the client-go libraries in use, hence my comment in this thread.

harmjanblok commented 5 years ago

In https://github.com/harmjanblok/stolon/commit/fbfaf9265535ee99d992509a6c806067888a0499 I've pushed the test setup I was using to reproduce the issue. I've been able to reproduce both issues (enableServiceLinks and procMount), although the procMount one was caused by a Kubernetes bug.

Steps to reproduce:

cd test-kubernetes
kind create cluster --config kind-config.yml
export KUBECONFIG="$(kind get kubeconfig-path --name="kind")"
kubectl apply -f k8s/
kubectl exec -it deploy/stolon-sentinel -- stolonctl init --cluster-name kube-stolon --store-backend kubernetes --kube-resource-kind configmap
kubectl logs deploy/stolon-sentinel -f
...
kind delete cluster