kubernetes-retired / federation

[EOL] Cluster Federation
Apache License 2.0

Federated deployments cannot be deleted #215

Closed irfanurrehman closed 6 years ago

irfanurrehman commented 6 years ago

Issue by jsgoller1 Sunday Oct 08, 2017 at 13:20 GMT
Originally opened as https://github.com/kubernetes/kubernetes/issues/53566


Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug /sig cli /sig federation @kubernetes/sig-federation

What happened: I have a federated cluster composed of one host cluster and one joined cluster running in AWS. They were created following the steps in the documentation, and I am able to see via the kubernetes dashboard that all components are running correctly.

To test the federated setup, I used a simple nginx deployment also from the docs, stored in deployment.proxy.yml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: proxy
spec:
  replicas: 5 # tells deployment to run 5 pods matching the template
  template: # create pods using pod definition in this template
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

I then execute kubectl create --context=federation -f deployment.proxy.yml. Assuming I don't change apiVersion, the deployment runs fine and almost immediately reaches the desired count (if I use apiVersion: apps/v1beta2 or any other API version than extensions/ with my federated setup, I get errors and it won't start). I then go to tear down the deployment by executing kubectl delete deployment --context=federation proxy.

However, kubectl reports error: timed out waiting for the condition. The deployment instances spin down to zero, but the deployment won't disappear:

NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
proxy     0         0         0            0           1h

I can delete them via the dashboard, or by switching to the joined cluster's context and using kubectl delete deployment proxy, but even after kubectl reports they are deleted successfully, they reappear and still cannot be deleted from the federation context.

What you expected to happen: The deployment to be deleted. This works fine if I'm not using my federated context.

How to reproduce it (as minimally and precisely as possible): Create a federated setup via the docs, create the above deployment, and then try to delete it.

Environment:

irfanurrehman commented 6 years ago

Comment by ljmatkins Thursday Oct 26, 2017 at 22:47 GMT


FWIW, I am also seeing this issue when running on a GKE Kubernetes 1.8.1 federation.

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.6", GitCommit:"4bc5e7f9a6c25dc4c03d4d656f2cefd21540e28c", GitTreeState:"clean", BuildDate:"2017-09-14T06:55:55Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.1", GitCommit:"f38e43b221d08850172a9a4ea785a86a3ffa3b3a", GitTreeState:"clean", BuildDate:"2017-10-11T23:16:41Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

irfanurrehman commented 6 years ago

Comment by jianhuiz Monday Oct 30, 2017 at 19:56 GMT


This happens because, when the federation controllers were moved to the sync controller, the move failed to keep Status.ObservedGeneration updated (e.g. https://github.com/kubernetes/kubernetes/blob/release-1.8/federation/pkg/federatedtypes/deployment.go#L55). kubectl relies on that field before it proceeds to the next step of the deletion. This bug applies to ReplicaSet, too.
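To check whether a particular federation is affected, one option is to compare metadata.generation with status.observedGeneration on the stuck deployment. The sketch below is not from the thread. It assumes a POSIX shell, that kubectl is on PATH (overridable through a hypothetical KUBECTL variable), and that the federation API server returns both fields through -o jsonpath. The function name generation_caught_up is made up for illustration.

```shell
# Hypothetical helper, not part of kubectl or the federation tooling.
# KUBECTL may point at another binary; it defaults to plain kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# generation_caught_up NAME CONTEXT
# Prints both values and returns 0 only if the controller has observed
# the latest generation of the deployment.
generation_caught_up() {
  name="$1"
  ctx="$2"
  # Generation of the desired spec, incremented by the API server on spec changes.
  gen="$("$KUBECTL" get deployment "$name" --context="$ctx" -o jsonpath='{.metadata.generation}')"
  # Generation the controller reports having processed; may be empty if never set.
  obs="$("$KUBECTL" get deployment "$name" --context="$ctx" -o jsonpath='{.status.observedGeneration}')"
  echo "generation=${gen:-unset} observedGeneration=${obs:-unset}"
  [ -n "$gen" ] && [ -n "$obs" ] && [ "$obs" -ge "$gen" ]
}
```

For example, generation_caught_up proxy federation prints both values and returns non-zero while the status lags behind the spec, which would be consistent with the explanation above.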

irfanurrehman commented 6 years ago

Comment by YaweiWu Wednesday Nov 01, 2017 at 02:22 GMT


+1

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.1", GitCommit:"f38e43b221d08850172a9a4ea785a86a3ffa3b3a", GitTreeState:"clean", BuildDate:"2017-10-11T23:27:35Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:38:10Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

CentOS Linux release 7.4.1708 (Core)

irfanurrehman commented 6 years ago

Comment by arthur0 Wednesday Nov 01, 2017 at 02:29 GMT


+1 Same for me, using a hybrid environment (OpenStack + GKE). The only way to delete deployments is to create them in a namespace other than the default and then delete the namespace itself.

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:48:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:38:10Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

irfanurrehman commented 6 years ago

Comment by blairccx Thursday Nov 02, 2017 at 20:15 GMT


+1

Happening for me in AWS.



Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.4", GitCommit:"793658f2d7ca7f064d2bdf606519f9fe1229c381", GitTreeState:"clean", BuildDate:"2017-08-17T08:30:51Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

irfanurrehman commented 6 years ago

Comment by djay87 Monday Nov 13, 2017 at 13:12 GMT


+1

irfanurrehman commented 6 years ago

Comment by glindste Wednesday Nov 15, 2017 at 10:11 GMT


+1 in GKE with version 1.8.2-gke.0

irfanurrehman commented 6 years ago

Comment by clsacramento Thursday Nov 16, 2017 at 12:04 GMT


+1 GKE host cluster federated with azure. Also +1 with baremetal host + Azure.

irfanurrehman commented 6 years ago

Comment by stephenbm Monday Nov 20, 2017 at 11:08 GMT


+1 on Openstack 1.8.3

Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.3", GitCommit:"f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd", GitTreeState:"clean", BuildDate:"2017-11-08T18:27:48Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

irfanurrehman commented 6 years ago

Comment by dims Wednesday Nov 22, 2017 at 12:33 GMT


/sig federation

irfanurrehman commented 6 years ago

Comment by dims Wednesday Nov 22, 2017 at 12:34 GMT


/sig multicluster

irfanurrehman commented 6 years ago

Comment by Blasterdick Thursday Nov 30, 2017 at 14:26 GMT


+1 on federated 1.8.3-gke.0 (Google Container Engine)

irfanurrehman commented 6 years ago

Comment by nikhiljindal Friday Dec 01, 2017 at 08:13 GMT


Removing the sig/cli label since, as Jianhuiz pointed out, this is due to a bug in the federation deployment controller and does not require any code change to kubectl.

cc @irfanurrehman Should this be moved to the new kubernetes/federation repo?

irfanurrehman commented 6 years ago

Comment by tommyknows Friday Jan 05, 2018 at 08:57 GMT


Bug still persists on k8s 1.9.0, On-Premise. I am able to delete the deployment when executing the following command(s):

kubectl delete deploy [name] --cascade=false --context federation
kubectl delete deploy [name] --context [cluster]

Deleting the deployment only on the [cluster] does not work, because the federation controller automatically recreates it. That's why it works when you first delete the deployment on the federation with --cascade=false...
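The two-step workaround above can be wrapped in a small function so that the order is always the same: federation object first, member clusters second. This is only a sketch under assumptions: a POSIX shell, the same --cascade=false flag quoted in the comment above, and context names supplied by the caller. The function name force_delete_federated and the KUBECTL override variable are hypothetical.

```shell
# Hypothetical wrapper around the workaround; not an official tool.
KUBECTL="${KUBECTL:-kubectl}"

# force_delete_federated NAME FEDERATION_CONTEXT CLUSTER_CONTEXT...
force_delete_federated() {
  name="$1"
  fed_ctx="$2"
  shift 2
  # Step 1: remove only the federation-level object, without cascading,
  # so the federation controller stops recreating the deployment.
  "$KUBECTL" delete deploy "$name" --cascade=false --context "$fed_ctx" || return 1
  # Step 2: delete the now-orphaned deployment in each member cluster.
  for ctx in "$@"; do
    "$KUBECTL" delete deploy "$name" --context "$ctx" || return 1
  done
}
```

A call such as force_delete_federated proxy federation my-cluster would run the two commands from the comment above in order, where my-cluster stands in for the joined cluster's context name.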

irfanurrehman commented 6 years ago

Comment by Blasterdick Friday Jan 12, 2018 at 09:33 GMT


@tommyknows great thanks, mate, but sure it has to be solved by devs, as it's such a duct tape fix for a project like k8s :)

irfanurrehman commented 6 years ago

cc @jsgoller1

pkutishch commented 6 years ago

+1

kubectl delete deploy nginx
error: timed out waiting for the condition

rarescosma commented 6 years ago

+1

lonefreak commented 6 years ago

+1

ChrisOnToure commented 6 years ago

+1

sara4dev commented 6 years ago

+1

yutongp commented 5 years ago

+1

yutongp commented 5 years ago

still having this issue on current master