Open moxious opened 6 years ago
It is interesting how far the zero is from the colon. It looks like $res_count is " 0" instead of "0".
Does this repro consistently for you?
3 for 3.
In case something is different, here's today's repro:
INFO Tester 'Pod/nginx-1-tester' succeeded
INFO Skip 'ConfigMap/nginx-1-test'
+ clean_iam_resources.sh
+ [[ -z nginx-1 ]]
+ [[ -z apptest-307b3cc6-1df8-4b50-8aca-abc25a00a24e ]]
+ kubectl delete --namespace=apptest-307b3cc6-1df8-4b50-8aca-abc25a00a24e --filename=-
serviceaccount "nginx-1-deployer-sa" deleted
rolebinding "nginx-1-deployer-rb" deleted
INFO Stop the application
application "nginx-1" deleted
INFO Wait for the applications to be deleted
INFO Checking if applications were deleted
INFO Remaining: 0
INFO
INFO Checking if applications were deleted
INFO Remaining: 0
I witnessed the same problem with v0.2 when using "make app/verify" on my Mac (OS X).
In my own testing container, when tests succeed I get this same "fail to detect things were deleted" timeout error. Given other OSX compat issues, that seems like a reasonable place to look for trouble.
Update -- @vcanaa this local patch fixes the issue for me:
$ git diff marketplace/driver/wait_for_deletion.sh
diff --git a/marketplace/driver/wait_for_deletion.sh b/marketplace/driver/wait_for_deletion.sh
index 1255ad0..be55f1b 100755
--- a/marketplace/driver/wait_for_deletion.sh
+++ b/marketplace/driver/wait_for_deletion.sh
@@ -31,7 +31,7 @@ while [[ "$deleted" = "false" ]]; do
res_count=$(echo $resources | wc -w)
- if [[ "$res_count" = "0" ]]; then
+ if [ $res_count -eq 0 ]; then
deleted=true
else
# Ignore service account default
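An alternative to changing the comparison would be to normalize the count itself. The sketch below is only an illustration, not code from the repository: count_words is a hypothetical helper, and it assumes the padding consists of plain whitespace around the number printed by wc.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of wait_for_deletion.sh): count the
# whitespace-separated words in a string and strip any padding that
# `wc -w` may print around the number.
count_words() {
  printf '%s\n' "$1" | wc -w | tr -d '[:space:]'
}

# Sample inputs, made up for illustration.
empty_count=$(count_words "")
two_count=$(count_words "serviceaccount/default rolebinding/example")

echo "empty: '$empty_count'"
echo "two:   '$two_count'"
```

With the count normalized this way, a string comparison such as [[ "$res_count" = "0" ]] would behave the same whether or not wc pads its output.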
Thanks moxious, that seems to solve it. I am not a bash expert, but this seems to confirm my theory.
This execution
[[ " 0" -eq 0 ]] && echo hi
outputs
hi
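To make the difference concrete, here is a small sketch that compares a padded count both ways. The padded value is hard-coded as an assumption standing in for what wc -w appears to print on OS X, so the snippet behaves the same on any platform.

```shell
#!/usr/bin/env bash
# Assumed stand-in for a padded `wc -w` result (leading spaces, then 0).
res_count="       0"

# String comparison: the padding is part of the string, so it is not "0".
if [[ "$res_count" = "0" ]]; then
  string_cmp="match"
else
  string_cmp="no match"
fi

# Numeric comparison: the operand is evaluated as an integer, padding ignored.
if [[ "$res_count" -eq 0 ]]; then
  numeric_cmp="match"
else
  numeric_cmp="no match"
fi

echo "string:  $string_cmp"    # prints "string:  no match"
echo "numeric: $numeric_cmp"   # prints "numeric: match"
```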
I will send a fix for that.
In deploy.sh, it calls stop.sh (which performs the actual deletion of the application) and then goes into a timeout loop waiting on wait_for_deletion.sh.
When I run this locally, it just loops, never detecting the deletion, until the timeout expires, and then app/verify fails. This looks to be because the delete is too quick. :)
Notice the key line:
application "nginx-1" deleted
which occurs before the timeout script gets started. I believe this is harmless, because the intent of the script is still satisfied (stuff got deleted), but to the user it looks like a failure and it takes some log digging to figure out what happened.
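For reference, the stop-then-wait flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: poll_until and nothing_remaining are hypothetical names, not functions from deploy.sh or wait_for_deletion.sh. The check runs before any sleep, so a deletion that has already finished counts as success on the first pass.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a check command repeatedly until it succeeds
# or the timeout (in seconds) has elapsed.
# Usage: poll_until <timeout_seconds> <interval_seconds> <command...>
poll_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( SECONDS + timeout ))
  while true; do
    # Check first, so an already-completed deletion succeeds immediately.
    if "$@"; then
      return 0
    fi
    # Give up once the deadline has passed.
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep "$interval"
  done
}

# Stand-in for "no application resources remain"; always true in this sketch.
nothing_remaining() { return 0; }

if poll_until 5 1 nothing_remaining; then
  poll_result="deleted"
else
  poll_result="timed out"
fi
echo "$poll_result"
```

In a real script the stand-in check would be replaced by whatever command reports the remaining resources.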
Log: