Closed grosser closed 9 years ago
Happy to add the exception details, but the Ruby Timeout lib isn't recommended: https://github.com/mperham/sidekiq/wiki/Problems-and-Troubleshooting#writing-thread-safe-code
The proper writeup is: http://www.mikeperham.com/2015/05/08/timeout-rubys-most-dangerous-api
The Timeout bug discussed there only affects persistent connections to datastores.
We are using timeout for a large sidekiq + celluloid installation that runs 10+k jobs per minute and have not seen anything like that and saw 2 different runs-forever cases in this installation (processor too slow to finish an inbox that kept getting refilled + hanging on writing a file) + 1 in the new project I just started (bad code in a loop).
So I think not mentioning timeouts would let more people run into trouble then mentioning them. Updated readme to explain a bit more / link to the article.
I try to keep the documentation for Sucker Punch inline with Sidekiq. If the use of Timeout
isn't recommended in Sidekiq, I'd like to keep the same in Sucker Punch.
Ping @mperham for his thoughts
BTW, the link I posted above links to the same article you linked :) Just easier access from the Wiki since there's other great pointers in there too.
I added the link to the article into the wiki too ;)
On Mon, Aug 10, 2015 at 10:00 AM, Brandon Hilkert notifications@github.com wrote:
BTW, the link I posted above links to the same article you linked :) Just easier access from the Wiki since there's other great pointers in there too.
— Reply to this email directly or view it on GitHub https://github.com/brandonhilkert/sucker_punch/pull/121#issuecomment-129526829 .
@mperham do you think this readme change is good / how would you suggest timing out code that could potentially run forever ?
@grosser You are an exception, an expert Rubyist. My choice was: I can try to educate hundreds or thousands of users on the subtleties and complexities of timeout usage or I can simply proclaim it unsafe for use. I decided on the latter because the former leads to unlimited support costs.
So yeah we can make the default be 'do not use/rely on Timeout', but there is still the problem of having jobs just run forever because of bad code, isn't that equally bad ? An alternative solution could be to crash the whole process on Timeout, so connections get closed / people do not use it for 'normal' workflows / do not open issues about mysterious bugs :trollface:
Here's what Google does in Go to handle Timeouts. tl;dr It's not easy and requires major changes to your API. There's no easy way to handle timeouts in a complex block of code with lots of I/O.
So you'd prefer to advise against timeouts and have people run into endless jobs ?
Said another way, would you prefer mysterious random corruption or explicit problems?
I'd prefer Timeout -> explosion -> I know I have a problem vs Infinite job -> this is slow/queue is stuck/idk what's going on.
Something like:
Using Timeout causes persistent connections to randomly get corrupted, do not use it as control flow, use it as last resort to know something went very wrong. Ideally restart the worker process after every timeout.
I see more reports of timeout corruption than endless jobs but YMMV. Both options are miserable. ¯_(ツ)_/¯
Yeah ... no nice solution :D
I'd like to at least mention it in some way, since it's not very intuitive ... maybe also for the readme of sidekiq ... any of these options look good to you @brandonhilkert ?
Option A:
## Timeouts
Using `Timeout` causes persistent connections to [randomly get corrupted](http://www.mikeperham.com/2015/05/08/timeout-rubys-most-dangerous-api),
do not use it as control flow, use builtin connection timeouts. If you decide to use Timeout,
only use them as last resort to know something went very wrong and
ideally restart the worker process after every timeout.
Option B:
## Timeouts
Do not use `Timeout`, it causes persistent connections to [randomly get corrupted](http://www.mikeperham.com/2015/05/08/timeout-rubys-most-dangerous-api).
Option C:
## Timeouts
Do not use `Timeout` [it's evil](http://www.mikeperham.com/2015/05/08/timeout-rubys-most-dangerous-api).
I like the explanation of A.
_Build a Ruby Gem is available! http://brandonhilkert.com/books/build-a-ruby-gem/?utm_source=gmail-sig&utm_medium=email&utm_campaign=gmail_
On Tue, Aug 11, 2015 at 12:38 AM, Michael Grosser notifications@github.com wrote:
Yeah ... no nice solution :D
I'd like to at least mention it in some way, since it's not very intuitive ... maybe also for the readme of sidekiq ... any of these options look good to you @brandonhilkert https://github.com/brandonhilkert ?
Option A:
Timeouts
Using
Timeout
causes persistent connections to randomly get corrupted, do not use it as control flow, use builtin connection timeouts. If you decide to use Timeout, only use them as last resort to know something went very wrong and ideally restart the worker process after every timeout.Option B:
Timeouts
Do not use
Timeout
, it causes persistent connections to randomly get corrupted.Option C:
Timeouts
Do not use
Timeout
it's evil.— Reply to this email directly or view it on GitHub https://github.com/brandonhilkert/sucker_punch/pull/121#issuecomment-129700910 .
updated the readme
Thanks!
@brandonhilkert