openshift / origin

Conformance test suite for OpenShift
http://www.openshift.org
Apache License 2.0

oc run --rm leaks resources (error: timed out waiting for condition) #13276

Closed: mattf closed this issue 5 years ago

mattf commented 7 years ago

The dc, rc, and po remain after `oc run --rm` exits with an error.

Version
```
oc v1.5.0-alpha.1+71d3fa9
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://api.preview.openshift.com:443
openshift v3.4.1.8
kubernetes v1.4.0+776c994
```
Steps To Reproduce
```
$ oc run -it --rm dev-shell --image=radanalyticsio/openshift-spark -- pyspark
error: timed out waiting for the condition
```
Current Result
```
$ oc get all -lrun=dev-shell
NAME           REVISION   DESIRED   CURRENT   TRIGGERED BY
dc/dev-shell   1          1         1         config

NAME             DESIRED   CURRENT   READY     AGE
rc/dev-shell-1   1         1         1         2m

NAME                   READY     STATUS    RESTARTS   AGE
po/dev-shell-1-whmmr   1/1       Running   0          42s
```
Expected Result

`oc run --rm` is no longer running, so no resources should remain.

Additional Information

Some events on the deployer pod:

```
Events:
  FirstSeen  LastSeen  Count  From                                   SubobjectPath                 Type     Reason      Message
  ---------  --------  -----  ----                                   -------------                 ----     ------      -------
  1m         1m        1      {default-scheduler }                                                 Normal   Scheduled   Successfully assigned dev-shell-1-deploy to ip-172-31-2-86.ec2.internal
  40s        40s       1      {kubelet ip-172-31-2-86.ec2.internal}  spec.containers{deployment}   Warning  Failed      Failed to pull image "registry.ops.openshift.com/openshift3/ose-deployer:v3.4.1.8": image pull failed for registry.ops.openshift.com/openshift3/ose-deployer:v3.4.1.8, this may be because there are no credentials on this request. details: (net/http: request canceled)
  40s        40s       1      {kubelet ip-172-31-2-86.ec2.internal}                                Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "deployment" with ErrImagePull: "image pull failed for registry.ops.openshift.com/openshift3/ose-deployer:v3.4.1.8, this may be because there are no credentials on this request. details: (net/http: request canceled)"
  1m         15s       2      {kubelet ip-172-31-2-86.ec2.internal}  spec.containers{deployment}   Normal   Pulling     pulling image "registry.ops.openshift.com/openshift3/ose-deployer:v3.4.1.8"
  12s        12s       1      {kubelet ip-172-31-2-86.ec2.internal}  spec.containers{deployment}   Normal   Pulled      Successfully pulled image "registry.ops.openshift.com/openshift3/ose-deployer:v3.4.1.8"
  6s         6s        1      {kubelet ip-172-31-2-86.ec2.internal}  spec.containers{deployment}   Normal   Created     Created container with docker id 1835f1505056; Security:[seccomp=unconfined]
  3s         3s        1      {kubelet ip-172-31-2-86.ec2.internal}  spec.containers{deployment}   Normal   Started     Started container with docker id 1835f1505056
```
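As a workaround until this is fixed, the leaked objects can be cleaned up manually by label, e.g. `oc delete all -l run=dev-shell` (assuming the default `run=<name>` label that `oc run` applies, as seen in the `oc get all` output above).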
openshift-bot commented 6 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting `/remove-lifecycle stale`. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting `/lifecycle frozen`.

If this issue is safe to close now, please do so with `/close`.

/lifecycle stale

openshift-bot commented 6 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting `/remove-lifecycle rotten`. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting `/lifecycle frozen`.

If this issue is safe to close now, please do so with `/close`.

/lifecycle rotten
/remove-lifecycle stale

juanvallejo commented 6 years ago

@soltysh my best guess is that the error "timed out waiting for condition" is occurring while waiting for the pod to be created: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/run.go#L341

We could probably move the "remove" logic into a separate function, then defer a call to it here: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/run.go#L299, once we have a list of created objects to delete. That way we ensure that every object created beyond this point is deleted should an error happen. Thoughts?
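For illustration, here is a minimal, self-contained Go sketch of that defer-based cleanup pattern. All names in it (`object`, `deleteObjects`, `run`) are hypothetical stand-ins for the real kubectl types and helpers, not the actual implementation:

```go
package main

import (
	"errors"
	"fmt"
)

// object stands in for a created API object (dc, rc, pod).
type object struct{ kind, name string }

// deleteObjects removes the created objects in reverse creation order.
func deleteObjects(objs []object) {
	for i := len(objs) - 1; i >= 0; i-- {
		fmt.Printf("deleting %s/%s\n", objs[i].kind, objs[i].name)
	}
}

// run mimics `oc run --rm`: the cleanup is registered via defer before
// anything is created, so it fires on every exit path, including the
// "timed out waiting for the condition" error.
func run(remove bool) (err error) {
	var created []object
	if remove {
		defer func() { deleteObjects(created) }()
	}

	created = append(created, object{"dc", "dev-shell"})
	created = append(created, object{"rc", "dev-shell-1"})
	created = append(created, object{"po", "dev-shell-1-whmmr"})

	// Simulate the attach/wait step failing after the objects exist.
	return errors.New("timed out waiting for the condition")
}

func main() {
	if err := run(true); err != nil {
		fmt.Println("error:", err)
	}
}
```

The deferred closure sees everything appended to `created` after it is registered, so a mid-flight error no longer leaks the dc, rc, or pod.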

soltysh commented 6 years ago

Yeah, that makes sense, go for it.

juanvallejo commented 6 years ago

/remove-lifecycle rotten

juanvallejo commented 6 years ago

Upstream PR: https://github.com/kubernetes/kubernetes/pull/62482

juanvallejo commented 6 years ago

The upstream PR has merged; the fix will land in Origin with the next rebase.

openshift-bot commented 6 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting `/remove-lifecycle stale`. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting `/lifecycle frozen`.

If this issue is safe to close now, please do so with `/close`.

/lifecycle stale

openshift-bot commented 6 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

openshift-bot commented 5 years ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting `/reopen`. Mark the issue as fresh by commenting `/remove-lifecycle rotten`. Exclude this issue from closing again by commenting `/lifecycle frozen`.

/close

openshift-ci-robot commented 5 years ago

@openshift-bot: Closing this issue.

In response to [this](https://github.com/openshift/origin/issues/13276#issuecomment-421220801):

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue by commenting `/reopen`.
> Mark the issue as fresh by commenting `/remove-lifecycle rotten`.
> Exclude this issue from closing again by commenting `/lifecycle frozen`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.