Closed colinsurprenant closed 9 years ago
I'll get you some feedback on this soon.
For this I suggest that the plugins exposes only 3 methods: stop, stop? and close.
I might be misreading a bit, but if we consider the public interfaces, is 'stop' the only public interface of these three? Seems like 'stop?' would only be invoked inside a plugin to inquire if it should stop, and 'close' would only be invoked by the plugin as part of of the 'run' method (for inputs) completing.
What we have today is a kind of in-band signalling (the LogStash::ShutdownEvent), and you propose an out-of-band signal (plugin.stop). I think this is good, but I wonder if in-band is even needed anymore once we have a way to interrupt with persistence of in-flight state. Thoughts?
The public interfaces would be stop
and close
. close
(the current teardown
) should be called by the inputworker in the pipeline in the ensure
block after run
:
def inputworker
plugin.run
rescue ..
# ..
ensure
plugin.close #bookkeeping
end
Only stop?
would be private because it's used by the plugin itself.
One more detail: since only inputs need an out of band signal (they don't receive events, they generate them), filters and outputs can still use the shutdown event. This is good to ensure data safety, changing filters and outputs to use stop
/stop?
would potentially bypass a non-empty queue and lose data.
[EDIT] I now think stop? should also be public, since the pipeline needs it to know when to ignore exceptions when plugins are shutting down
I've created PR #3812 (now #3895) that implements @colinsurprenant original idea with a couple of modifications:
stop
, stop?
, close
refactor to the LogStash::Inputs::Base
class: this only concerns the shutting down of inputs. filters and outputs still use the ShutdownEvent;close
back to teardown
to reuse the current semantics of teardown
: "perform additional bookkeeping at shutdown time". This reduces the refactor work.This PR then changes the Plugin API to:
stop
(public), stop?
(private) to LogStash::Inputs::Baseshutdown
, finished
. finished?
, running?
, terminating?
from LogStash::PluginI've opened a meta issue to track which plugins need to be refactored: https://github.com/elastic/logstash/issues/3813
I've also added a PR against logstash-devutils to add a test most input plugins can use to test that stop
properly exits run
https://github.com/elastic/logstash-devutils/pull/32
Just asking, is there any reason not to have a running?
method which is basically this:
def running?; !stop?; end
it make writing code in plugin a bit more positive..
while running?
end
instead of
while !stop?
end
+2 on being positive :-)
On Tue, 1 Sep 2015 18:42 Pier-Hugues Pellerin notifications@github.com wrote:
Just asking, is there any reason not to have a running? method which is basically this:
def running?; !stop?; end
it make writing code in plugin a bit more positive..
while running?end
instead of
while !stop?end
— Reply to this email directly or view it on GitHub https://github.com/elastic/logstash/issues/3210#issuecomment-136787538.
Note: another common pattern will be break if stop?
Nothing against running?
, though I'd prefer to reduce the api to just 3 calls: .teardown
, .stop
and 1 other method to check if stop was called. Not sure if it's worth to widen the api. We can just replace stop?
with running?
I prefer positive predicates, but in this case break unless running?
feels ... weird. We have to have a negative to determine when to stop (break if stop?
vs break unless running?
) so we can't avoid it, I don't think. The alternative is something wierder while running? ... end
that makes it maybe less explicit about when we can break. I dunno. I'm neutral on stop?
vs running?
partially because stop
is either a noun or a command, where running?
is more a question or an active verb. Questions feel more appropriate for predicate methods running?
or should_stop?
or osmething. I don't know. Neutral or leaning towards positive naming :)
since only inputs need an out of band signal
When persistence lands, won't we want out-of-band signalling to terminate a filter/output?
Yes they'll need something similar, but I'm not sure if writing that code in advance is the "right thing". Whenever persistence is introduced, this api could move the LogStash::Plugin class, it shouldn't require another huge refactoring since it's just adding new stuff (stop, stop?)
Votes? :)
# stop?
break if stop?
next unless stop?
while !stop?
# ..
end
# stopping?
break if stopping?
next unless stopping?
while !stopping?
# ..
end
# stop_called?
break if stop_called?
next unless stop_called?
while !stop_called?
# ..
end
# stop_requested?
break if stop_requested?
next unless stop_requested?
while !stop_requested?
# ..
end
# should_stop?
break if should_stop?
next unless should_stop?
while !should_stop?
# ..
end
# terminating?
# potentially also renaming plugin.stop to plugin.terminate
break if terminating?
next unless terminating?
while !terminating?
# ..
end
# running?
break unless running?
next if running?
while running?
# ..
end
[EDIT] added running?
From your examples @jsvd I would avoid the should option, I feels strange to me. On the other side, what is the problem of having only stop?
, if need more context I like the idea of having things like requested_to_stop?
... but I think this are very personal things.
I am okay with stopping?
and running?
since they both are positive predicate and I don't see an issue of having them both. Their concept and usage is similar to #present?
and #empty?
yeah! @ph like being positive! and one can be just the negation of the other.
I don't like stopping?
since it implies that it is in the state of performing a stop but in reality it only reports that the stop
method was called. In that respect stop_called?
, stop_requested?
and should_stop?
are better but I think stop?
is just as well self explanatory and shorter and cleaner :P so my vote is stop?
.
I don't dislike double non-positives ;) running?
could be just fine too but since the api is to call stop
, then having stop?
is more obviously related IMO.
since we are introducing braking API changes, I would bite the bullet and include the OOB signalling for filters and outputs too. if we postpone we'll have to do another major version bump. also, we'll be ready for persistence when it lands without necessarily doing a major version bump.
as I commented in #3812 - if we are to break the API and must go through all plugins, any reason to keep teardown
around? IMO close
is more appropriate and commonly used for this?
yep we had this discussion before, I've provided the example of test suites as use of the teardown
metaphor.
At this point I don't mind whatever we call it, I'll just need to rescan all plugins and open new tasks for refactoring if we decide to rename it.
As for including the OOB signalling for filters/outputs, it will not break the existing API, just increase it (unless we rename teardown
to close
), so no reason to do it now. The only thing necessary to do NOW in the filters/outputs is to remove the "useless" methods like finished?
Input plugins are now shutdown by an external call to plugin.stop
instead of catching LogStash::ShutdownSignal exception.
Unless overridden, stop
will simply makes stop?
return true
, thus allowing run
to poll this and return after seeing the change.
In some plugins extra work must be done in stop
to instruct run
that it's time to return. For example, in the logstash-input-udp
it's necessary to call @socket.close
to make the blocking read on the socket raise an exception, thus breaking out the loop.
So, different input plugins will require different stop
strategies.
Refactoring an input plugin involves:
ShutdownSignal
exceptionrun
loop: is it a while
? a consumer.subscribe {|event| }
? is there a blocking operation on a socket/fd?stop?
and/or override stop
to make run returnclose
(currently done in teardown
)shutdown
, finished
, finished?
, running?
or terminating?
Then for testing you can use the shared example provided in https://github.com/elastic/logstash-devutils/pull/32 in the following manner:
describe LogStash::Inputs::Http do
let(:port) { rand(5000) + 1025 }
it_behaves_like "an interruptible input plugin" do
let(:config) { { "port" => port } }
end
end
Both will still shutdown using the ShutdownEvent sent by the pipeline so no major changes necessary. The work in these plugins is:
shutdown
, finished
, finished?
, running?
or terminating?
teardown
to close
We need to refactor the shutdown sequence involving the input plugins and establish a clear shutdown semantic.
Current Design
Currently the base plugin class exposes the following methods related to shutdown:
shutdown
finished
teardown
finished?
running?
terminating?
Current input plugin shutdown sequence description: There are basically 3 situations:
1- all input plugins have exited normally, they have somehow reached the end of their input:
run
method, the plugin thread will exitteardown
method2- a
SIGINT
orSIGTERM
is received and the pipeline needs to terminate the input plugins and the pipeline:b)
Pipeline#shutdown
is called from the signal handler:thread.raise(LogStash::ShutdownSignal)
thread.wakeup
teardown
methods are called in sequencerun
method and threads at which point the pipeline thread will again call the inputteardown
methods3- an uncatched exception in the input plugin
teardown
method is called from the pipeline threadrun
method again in abegin/retry
Problems
LogStash::ShutdownSignal
exception can happen virtually anywhere, event in theteardown
methodteardown
is called twice on the input plugins in aSIGINT
/SIGTERM
situationSuggested Refactor
Here's my suggestion for a first iteration cleanup. Once cleaned up, we will be able to more easily build upon it and start thinking about decoupling further the plugin execution/lifecycle from the pipeline in further iterations.
We basically need 2 things for the input plugins:
1- signal the plugin it need to stop 2- ask the plugin to do its bookkeeping once it is stopped
For this I suggest that the plugins exposes only 3 methods:
stop
,stop?
andclose
.stop
is called from outside the plugin thread (from the pipeline thread) and is to ask the plugin to stop. this method must be thread-safe bacause it will be called from outside the plugin thread.stop?
returns true if the stop method was called and can be used within the plugin to verify if a stop was requested.close
is called once when the plugin is stopped to perform any final bookkeeping.The way to know a plugin is stopped is simply by waiting the return of the plugin
run
method. When the run method exits and the plugin thread exits, this is only when theclose
method should be called, once.The same semantic can be applied to the filter and output plugins which would support 2 shutdown modes:
1- the current mode of inserting a
SHUTDOWN
event in the queue so that all inflight events will be processed before the filter and output plugins are closed 2- a new mode using thestop
method for an immediate shutdown in which case the inflight events will be lost when not using queue persistence, in other words, with queue persistence, we will be able to do an immediate shutdown without loosing inflight events.