GoogleCloudPlatform / marketplace-k8s-app-example

Apache License 2.0
23 stars 19 forks source link

make app/verify fails for nginx when deletion happens too fast #83

Open moxious opened 6 years ago

moxious commented 6 years ago

In deploy.sh, it calls stop.sh (which affects the actual delete of the application) and then goes into a timeout loop waiting on wait_for_deletion.sh

When I run this locally, it just loops around never deleting until the timeout expires, and then app/verify fails. This looks to be because the delete is too quick. :)

Notice the key line: application "nginx-1" deleted which occurs before the timeout script gets started.

I believe this is harmless, because the intent of the script is still satisfied (stuff got deleted) but to the user it looks like failure and it takes some log digging to figure out what happened.

Log:

INFO Tester 'Pod/nginx-1-tester' succeeded
INFO Skip 'ConfigMap/nginx-1-test'
+ clean_iam_resources.sh
+ [[ -z nginx-1 ]]
+ [[ -z apptest-d1e24a01-b34d-4873-92c1-9acbc7c42c8a ]]
+ kubectl delete --namespace=apptest-d1e24a01-b34d-4873-92c1-9acbc7c42c8a --filename=-
serviceaccount "nginx-1-deployer-sa" deleted
rolebinding "nginx-1-deployer-rb" deleted
INFO Stop the application
application "nginx-1" deleted
INFO Wait for the applications to be deleted
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
ERROR Some applications where not deleted
INFO Collecting events for namespace "apptest-d1e24a01-b34d-4873-92c1-9acbc7c42c8a"
LAST SEEN   FIRST SEEN   COUNT     NAME                                        KIND         SUBOBJECT                   TYPE      REASON                   SOURCE                                        MESSAGE
4m          4m           1         nginx-1-5999cf6dd8-vbznb.15395086600c13a4   Pod                                      Normal    Scheduled                default-scheduler                             Successfully assigned nginx-1-5999cf6dd8-vbznb to gke-lab-default-pool-91a532ec-1bwh
4m          4m           1         nginx-1-5999cf6dd8-vbznb.153950866b7d8350   Pod                                      Normal    SuccessfulMountVolume    kubelet, gke-lab-default-pool-91a532ec-1bwh   MountVolume.SetUp succeeded for volume "default-token-z6fhh" 
4m          4m           1         nginx-1-5999cf6dd8-vbznb.15395086898d1d97   Pod          spec.containers{nginx}      Normal    Pulling                  kubelet, gke-lab-default-pool-91a532ec-1bwh   pulling image "gcr.io/neo4j-k8s-marketplace-public/nginx:latest"
3m          3m           1         nginx-1-5999cf6dd8-vbznb.15395088c6273d18   Pod          spec.containers{nginx}      Normal    Pulled                   kubelet, gke-lab-default-pool-91a532ec-1bwh   Successfully pulled image "gcr.io/neo4j-k8s-marketplace-public/nginx:latest"
3m          3m           1         nginx-1-5999cf6dd8-vbznb.15395088c9bd3382   Pod          spec.containers{nginx}      Normal    Created                  kubelet, gke-lab-default-pool-91a532ec-1bwh   Created container
3m          3m           1         nginx-1-5999cf6dd8-vbznb.15395088d43f5bcb   Pod          spec.containers{nginx}      Normal    Started                  kubelet, gke-lab-default-pool-91a532ec-1bwh   Started container
3m          3m           1         nginx-1-5999cf6dd8-vbznb.15395094ed003950   Pod          spec.containers{nginx}      Normal    Killing                  kubelet, gke-lab-default-pool-91a532ec-1bwh   Killing container with id docker://nginx:Need to kill Pod
4m          4m           1         nginx-1-5999cf6dd8.153950865e49d45a         ReplicaSet                               Normal    SuccessfulCreate         replicaset-controller                         Created pod: nginx-1-5999cf6dd8-vbznb
4m          4m           1         nginx-1-deployer-xmp5m.153950851b15572e     Pod                                      Normal    Scheduled                default-scheduler                             Successfully assigned nginx-1-deployer-xmp5m to gke-lab-default-pool-91a532ec-5293
4m          4m           1         nginx-1-deployer-xmp5m.153950852a21f132     Pod                                      Normal    SuccessfulMountVolume    kubelet, gke-lab-default-pool-91a532ec-5293   MountVolume.SetUp succeeded for volume "config-volume" 
4m          4m           1         nginx-1-deployer-xmp5m.153950852c374a48     Pod                                      Normal    SuccessfulMountVolume    kubelet, gke-lab-default-pool-91a532ec-5293   MountVolume.SetUp succeeded for volume "nginx-1-deployer-sa-token-p4v4j" 
4m          4m           1         nginx-1-deployer-xmp5m.153950854178f3b1     Pod          spec.containers{deployer}   Normal    Pulling                  kubelet, gke-lab-default-pool-91a532ec-5293   pulling image "gcr.io/neo4j-k8s-marketplace-public/nginx/deployer:latest"
4m          4m           1         nginx-1-deployer-xmp5m.15395085b5146178     Pod          spec.containers{deployer}   Normal    Pulled                   kubelet, gke-lab-default-pool-91a532ec-5293   Successfully pulled image "gcr.io/neo4j-k8s-marketplace-public/nginx/deployer:latest"
4m          4m           1         nginx-1-deployer-xmp5m.15395085b8207bfb     Pod          spec.containers{deployer}   Normal    Created                  kubelet, gke-lab-default-pool-91a532ec-5293   Created container
4m          4m           1         nginx-1-deployer-xmp5m.15395085c2077d8a     Pod          spec.containers{deployer}   Normal    Started                  kubelet, gke-lab-default-pool-91a532ec-5293   Started container
4m          4m           1         nginx-1-deployer.153950851a1c8d6b           Job                                      Normal    SuccessfulCreate         job-controller                                Created pod: nginx-1-deployer-xmp5m
3m          3m           1         nginx-1-tester.15395092baf07185             Pod                                      Normal    Scheduled                default-scheduler                             Successfully assigned nginx-1-tester to gke-lab-default-pool-91a532ec-5293
3m          3m           1         nginx-1-tester.15395092c2d69dba             Pod                                      Normal    SuccessfulMountVolume    kubelet, gke-lab-default-pool-91a532ec-5293   MountVolume.SetUp succeeded for volume "config-volume" 
3m          3m           1         nginx-1-tester.15395092c53c460c             Pod                                      Normal    SuccessfulMountVolume    kubelet, gke-lab-default-pool-91a532ec-5293   MountVolume.SetUp succeeded for volume "default-token-z6fhh" 
3m          3m           1         nginx-1-tester.15395092e35f5562             Pod          spec.containers{tester}     Normal    Pulling                  kubelet, gke-lab-default-pool-91a532ec-5293   pulling image "gcr.io/neo4j-k8s-marketplace-public/nginx/tester:latest"
3m          3m           1         nginx-1-tester.15395092ed11dbfd             Pod          spec.containers{tester}     Normal    Pulled                   kubelet, gke-lab-default-pool-91a532ec-5293   Successfully pulled image "gcr.io/neo4j-k8s-marketplace-public/nginx/tester:latest"
3m          3m           1         nginx-1-tester.15395092f020925a             Pod          spec.containers{tester}     Normal    Created                  kubelet, gke-lab-default-pool-91a532ec-5293   Created container
3m          3m           1         nginx-1-tester.15395092f8e961ab             Pod          spec.containers{tester}     Normal    Started                  kubelet, gke-lab-default-pool-91a532ec-5293   Started container
3m          3m           1         nginx-1-tester.153950933823bbb8             Pod                                      Normal    SandboxChanged           kubelet, gke-lab-default-pool-91a532ec-5293   Pod sandbox changed, it will be killed and re-created.
3m          3m           1         nginx-1-tester.15395093516a7a86             Pod                                      Warning   FailedCreatePodSandBox   kubelet, gke-lab-default-pool-91a532ec-5293   Failed create pod sandbox.
3m          3m           1         nginx-1-tester.1539509351760cbb             Pod                                      Warning   FailedSync               kubelet, gke-lab-default-pool-91a532ec-5293   Error syncing pod
4m          4m           1         nginx-1.153950865d606c45                    Deployment                               Normal    ScalingReplicaSet        deployment-controller                         Scaled up replica set nginx-1-5999cf6dd8 to 1
INFO Deleting namespace "apptest-d1e24a01-b34d-4873-92c1-9acbc7c42c8a"
namespace "apptest-d1e24a01-b34d-4873-92c1-9acbc7c42c8a" deleted
make: *** [app/verify] Error 1
vcanaa commented 6 years ago

It is interesting how the zero is far from colon. Looks like $res_count is " 0" instead of "0".

Does this repro consistently for you?

moxious commented 6 years ago

3 for 3.

moxious commented 6 years ago

In case something is different here's today's repro

INFO Tester 'Pod/nginx-1-tester' succeeded
INFO Skip 'ConfigMap/nginx-1-test'
+ clean_iam_resources.sh
+ [[ -z nginx-1 ]]
+ [[ -z apptest-307b3cc6-1df8-4b50-8aca-abc25a00a24e ]]
+ kubectl delete --namespace=apptest-307b3cc6-1df8-4b50-8aca-abc25a00a24e --filename=-
serviceaccount "nginx-1-deployer-sa" deleted
rolebinding "nginx-1-deployer-rb" deleted
INFO Stop the application
application "nginx-1" deleted
INFO Wait for the applications to be deleted
INFO Checking if applications were deleted
INFO Remaining:        0
INFO 
INFO Checking if applications were deleted
INFO Remaining:        0
gmflau commented 6 years ago

I witnessed the same problem with v0.2 when using "make app/verify" on my Mac OSX.

moxious commented 6 years ago

In my own testing container, when tests succeed I get this same "fail to detect things were deleted" timeout error. Given other OSX compat issues, that seems like a reasonable place to look for trouble.

moxious commented 6 years ago

Update -- @vcanaa this local patch fixes the issue for me:

$ git diff marketplace/driver/wait_for_deletion.sh
diff --git a/marketplace/driver/wait_for_deletion.sh b/marketplace/driver/wait_for_deletion.sh
index 1255ad0..be55f1b 100755
--- a/marketplace/driver/wait_for_deletion.sh
+++ b/marketplace/driver/wait_for_deletion.sh
@@ -31,7 +31,7 @@ while [[ "$deleted" = "false" ]]; do

   res_count=$(echo $resources | wc -w)

-  if [[ "$res_count" = "0" ]]; then
+  if [ $res_count -eq 0 ]; then
     deleted=true
   else
     # Ignore service account default
vcanaa commented 6 years ago

Thanks moxious, that seems to solve. I am not a bash expert, but this seems to confirm my theory. This execution [[ " 0" -eq 0 ]] && echo hi outputs hi

I will send a fix for that.