Closed jjb closed 4 months ago
/cc @mohamedhafez @mperham @headius
I showed this to @bensheldon and he had this additional idea
loop do
break if Timeout.time_remaining < 1
do_work
end
my thought, this could be supported either in standard mode or a no-raise mode
Timeout.timeout(5) do # or, timeout_without_raise, or something
loop do
break if Timeout.time_remaining < 1
do_work
end
end
Which version of timeout
are you using?
In the latest version an exception is used to "interrupt" the block, and not throw
anymore.
The exception is either:
https://github.com/ruby/timeout/blob/a7e91a5666c09854e7d9b5e066debeb00a56b344/lib/timeout.rb#L29-L30
or the one passed to Timeout.timeout
.
As mentioned, Timeout::ExitException
is kind of internal and an implementation detail.
The idea is if you want to catch the timeout exception inside the block, then use Timeout.timeout(n, MyException) { ... }
.
Taking your example, why is this not enough?
require 'timeout'
class ShutdownRequest < StandardError; end
def do_work
1000.times do |n|
puts n
sleep 1
rescue ShutdownRequest => e
puts "shuting down gracefully..."
raise e
end
ensure
puts "cleaning up resources..."
sleep 2
puts "...done"
end
Timeout.timeout(3, ShutdownRequest) do
do_work
rescue ShutdownRequest
puts "nice."
end
puts
puts
begin
Timeout.timeout(3, ShutdownRequest) do
do_work
end
rescue ShutdownRequest
puts "nice."
end
The output is:
0
1
2
shuting down gracefully...
cleaning up resources...
...done
nice.
0
1
2
shuting down gracefully...
cleaning up resources...
...done
nice.
This is often not practical because the code we want to time out is many layers of libraries we don't control, and can't accommodate our special shutdown method. If this pattern is indeed safe, maybe the timeout gem could formalize it, by: a. defining its own Shutdown class
Is the issue to do extra logic on Timeout and a far caller does Timeout.timeout(n) { ... }
(so no exception class)?
If so would documenting Timeout::ExitException
and make it non-internal be a solution here?
I think that can be considered.
If it's about having a common exception for both Timeout.timeout(n) { ... }
and Timeout.timeout(n, MyException) { ... }
that seems unfortunately very difficult for compatibility, because I suspect some code might already depend on the second one raising a MyException
inside the block and not just outside.
version: 0.4.1
why is this not enough?
in your new code it looks like you removed ShutdownHandledGracefully
- i did a bad job naming this, it's not central to the proposal, it's a stand-in for something an application might do - a better name might have been Rollback
- however, an even more realistic/better example is what you did with raise e
- the application code raises the "reason" for the interruption, which is indeed the shutdown/exit request. So, bringing that all together to this simplified example, leveraging Timeout::ExitException
:
require 'timeout'
def do_work
1000.times do |n|
puts n
sleep 1
rescue Timeout::ExitException => e
puts "shuting down gracefully..."
raise e
end
ensure
puts "cleaning up resources..."
sleep 2
puts "...done"
end
Timeout.timeout(3) do
do_work
rescue Timeout::Error
puts "process is not in a healthy state, need to kill the process so there aren't broken resources..."
rescue Exception # this is what an error reporter would do
puts "work was not completed, but process is in a healthy state"
end
puts
puts
# give it 1 second to try to shutdown gracefully, then force it
begin
Timeout.timeout(4) do
Timeout.timeout(3) do
do_work
end
end
rescue Timeout::Error
puts "process is not in a healthy state, need to kill the process so there aren't broken resources..."
rescue Exception # this is what an error reporter would do
puts "work was not completed, but process is in a healthy state"
end
So, this is nice because:
a. it's more simple b. leverages existing timeout behavior c. has the overall behavior i want
however, here is some application code that doesn't have the behavior i want - what if other layers of application or library code catch Timeout::ExitException
- now the second one gets caught, and the "hard" limit is not enforced. (i put the new behavior inside the ensure
in order to make the diff between the examples smaller - more realistic would be multiple loops, each with their own rescue and cleanup code which takes time).
require 'timeout'
def do_work
1000.times do |n|
puts n
sleep 1
rescue Timeout::ExitException => e
puts "shuting down gracefully..."
raise e
end
ensure
begin
puts "cleaning up resources..."
sleep 2
puts "...done"
rescue Timeout::ExitException
puts "rescued inside the ensure. phew! now to clean up, for 9000 seconds..."
sleep 9000
end
end
Timeout.timeout(3) do
do_work
rescue Timeout::Error
puts "process is not in a healthy state, need to kill the process so there aren't broken resources..."
rescue Exception # this is what an error reporter would do
puts "work was not completed, but process is in a healthy state"
end
puts
puts
# give it 1 second to try to shutdown gracefully, then force it
begin
Timeout.timeout(4) do
Timeout.timeout(3) do
do_work
end
end
rescue Timeout::Error
puts "process is not in a healthy state, need to kill the process so there aren't broken resources..."
rescue Exception # this is what an error reporter would do
puts "work was not completed, but process is in a healthy state"
end
would documenting Timeout::ExitException and make it non-internal be a solution here? I think that can be considered.
that sounds pretty cool and it's neat to discover i could make the more simple example above- but as shown in the 2nd example above, i think the semantics of what's going on are not sufficient. we need a standalone "please try to shutdown gracefully" signal.
If it's about having a common exception for both Timeout.timeout(n) { ... } and Timeout.timeout(n, MyException) { ... } that seems unfortunately very difficult for compatibility, because I suspect some code might already depend on the second one raising a MyException inside the block and not just outside.
IIUC, i believe your concern is not relevant to my proposal. my proposal would probably require a very different and new style of invocation. (e.g. Timeout.timeout_with_grace_period(soft: 3, hard: 5){ }
from my OP). Libraries could choose to support it, just in case the outer code is using it. This new invocation would send the special Shutdown exception after soft
seconds, then have behavior identical to current Timeout after hard
seconds. The same api features of current Timeout could be supported with keyword arguments.
what if other layers of application or library code catch Timeout::ExitException
Then it's a clear bug of that application/library code. They should never catch Timeout::ExitException.
Just like doing something blocking for a while in ensure
is bad practice and can cause issues/hangs/second Ctrl+C/etc.
I read your whole post but I am still not entirely sure why you want a soft exception and a hard exception.
If the hard exception is caught, that's a bug and there is nothing we can do (well, timeout used to use throw
but that has other problems).
If the soft exception is caught, I think that's also a bug.
I guess the soft/hard is about handling code in rescue
/ensure
which misbehaves and blocks for a while.
But then it's not clear the hard exception wouldn't cause the same problem (i.e. the next rescue/ensure might misbehave too).
And even if it did help, it causes another problem, code in some ensure
/rescue
was interrupted so now we have doubly-inconsistent state (and the second is worse, it means some code in ensure
/rescue
wasn't run).
It feels to me if you want/need a hard timeout, then SIGKILL is the way to go (SIGTERM wouldn't help, it's treated as a SignalException).
(there is also exit
and exit!
but they also depend on rescue
/ensure
not misbehaving, so not helping)
Regarding https://github.com/ruby/timeout/issues/48#issuecomment-2095435224 that looks like a safe way to interrupt (which might of course not work if not enough checks) but it feels something that can be easily handled without Timeout
, i.e. just comparing some Process.clock_gettime(Process::CLOCK_MONOTONIC)
values.
Maybe something for documentation rather?
I read your whole post but I am still not entirely sure why you want a soft exception and a hard exception.
Thanks for reading and your response. Wow, I'm really not being clear 😅. Below I'll explain my goal, why I'm now realizing my idea is probably not tenable, and why maybe Ben's idea is.
I am trying to enable a concept of "safe shutdown" across the ruby ecosystem. Examples of this:
So, a loop checks on each iteration "should i keep going?".
If there were a "universal shutdown exception class" in ruby, loops could rescue it and halt work. Timeout could raise this exception in the thread, and check if the exception bubbled all the way back up. If it does, then Timeout will assume all the inner code is behaving well. As I'm writing this I'm realizing this isn't tenable for the same reason as @headius wrote about 15 years ago, the exception can be raised anywhere including inside an ensure
. To support my model (or some other solution), ruby-core would need to encode special behavior in ensure
, probably for ruby 4. (which is probably possible and I should start proposing that elsewhere).
Meanwhile, Ben's idea seems pretty awesome (i'll add the "soft" concept)
loop do
break unless Timeout.soft_time_remaining > 0
do_work
end
it feels something that can be easily handled without Timeout, i.e. just comparing some Process.clock_gettime(Process::CLOCK_MONOTONIC) values.
But what I'm trying to do is make a standard way to support a concept of "external signal to shut down" - so activerecord find_each could implement this, and then rails users could pick whatever timeout value they like from the outside.
what if other layers of application or library code catch Timeout::ExitException
Then it's a clear bug of that application/library code. They should never catch Timeout::ExitException.
I only proposed that because I thought you were suggested it 😆 when you said "would documenting Timeout::ExitException and make it non-internal be a solution here?". I agree that code should not catch this internal exception.
To support my model (or some other solution), ruby-core would need to encode special behavior in
ensure
, probably for ruby 4. (which is probably possible and I should start proposing that elsewhere).
There is already a ruby-core issue about this which you might find interesting: https://bugs.ruby-lang.org/issues/17849
I only proposed that because I thought you were suggested it 😆 when you said "would documenting Timeout::ExitException and make it non-internal be a solution here?". I agree that code should not catch this internal exception.
To be clear, by "catch Timeout::ExitException" I meant rescue it and not re-raise it.
I think it's totally fine to rescue it if it is re-raise it after, if one needs special cleanup on timeout and that code can't just be in an ensure
(well except that's it's an internal name that might change, but unlikely and we could stop documenting it as if it's going to change).
Re Timeout.soft_time_remaining > 0
, how would this work if there are nested Timeout.timeout usages? That sounds rather tricky to implement (but I have not tried it).
I think that functionality should be a predicate for simplicity, like Timeout.soft_timed_out?
, but of course same issue regarding nesting (although then it's just a boolean, so could maybe be stored as a fiber-local and saved/restored).
Wow, thanks for pointing me to that ruby core discussion!
how would this work if there are nested Timeout.timeout usages
good point, invocations would maybe be aware of one another and do some appropriate math
@mperham in a slack discussion mentioned go Contexts
considering all of the above, i think my idea doesn't need to and probably shouldn't live in Timeout - it can stand on its own. I'll make a little POC and see what folks think.
Thanks for the discussion!
To time out gracefully, the inner code needs to be told stop doing work. graphql-ruby has such a timeout
This can already be done with
Timeout
using 2 blocks, with the inner one sending a special exception, as shown below. This doesn't solve things like "what if the timeout exception is raised inside anelse
block", but I think the venn diagram is such that ifShutdownRequest
is caught and the code comes back, it's safe.This is often not practical because the code we want to time out is many layers of libraries we don't control, and can't accommodate our special shutdown method.
If this pattern is indeed safe, maybe the timeout gem could formalize it, by:
a. defining its own Shutdown class b. allowing specification of both "shutdown signal time" and "strict time", which would implement the double layer pattern shown above
Then, libraries and application code could support this concept. For example, rails
ActiveRecord::Relation#find_each
could rescueShutdownRequest
, then raiseShutdownHandledGracefully
. if the outer code doesn't catch that, that's good - it will roll back a transaction and work through any chain ofensures
so things are cleaned up. if it all happens within the soft time, then the state is clean. If not, then the process should be killed. (as with pitchfork or rack-timeout term_on_timeout.If there is some reluctance to code the hard/soft functionality, but the pattern is sound, maybe as a first step, the class could simply be defined.
Timeout::Shutdown
- this gives implementers at all levels the opportunity to start supporting the pattern, referencing the same class.