deis / builder

Git server and application builder for Deis Workflow
https://deis.com
MIT License

feat(race): waitforpod errors out only for timeout #198

Closed smothiki closed 8 years ago

smothiki commented 8 years ago

This no longer errors out if the pod is not present yet. The only error condition is a timeout.

smothiki commented 8 years ago

This is a short-term fix and needs testing to confirm that it actually works.

smothiki commented 8 years ago

@mboersma @slack you can use the smothiki/builder:v2.1 image to test.

smothiki commented 8 years ago

Because the race condition is intermittent, manual testing doesn't help much. We somehow have to observe whether the error occurs over a period of time.

jchauncey commented 8 years ago

So far so good on this fix

slack commented 8 years ago

Testing now!

slack commented 8 years ago

Hit timeout:

[master 707444d] Bump
Counting objects: 523, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (518/518), done.
Writing objects: 100% (523/523), 54.00 KiB | 0 bytes/s, done.
Total 523 (delta 476), reused 0 (delta 0)
remote: ---> 2016/02/23 22:17:35 Error running git receive hook [watching events for builder pod startup (timed out waiting for the condition)]
Starting build... but first, coffee!
To ssh://git@deis.beef.slack.io:2222/earthy-instinct.git
 * [new branch]      master -> master

real    5m1.447s
user    0m0.050s
sys     0m0.052s

slack commented 8 years ago

Continuing to deploy, we'll see if this crops up again.

smothiki commented 8 years ago

@slack you can increase the timeout with the BUILDER_POD_TICK_DURATION env variable; the default is 100. Let me know if you think 100 is too aggressive and I can set the default to a higher value.

smothiki commented 8 years ago

fixes https://github.com/deis/builder/issues/172

smothiki commented 8 years ago

Increased the default timeout to 300 seconds to aid tests

technosophos commented 8 years ago

I'd vote to merge this now so we can get people unblocked. I reviewed the code, and it looks good. We can continue to tweak timeouts on a subsequent PR if necessary.