Closed colinsurprenant closed 5 years ago
I think it could be useful to switch the implementation to use JRuby's Timeout, but I would prefer to modify the abstraction of a TimeoutEnforcer instead of ripping it out entirely.
I've done so on a branch here: https://github.com/yaauie/logstash-filter-kv/commit/08f6945ce3fd922a5eba674d0fa4a03f4710fd4f
If we do choose to rip out this implementation entirely, the initialize_timeout_enforcer method and the TimeoutException class would be orphaned and would need to be removed as well.
@yaauie I see your point, but I am not sure we need to abstract a simple timeout here. There is not really any complexity to abstract: it is already abstracted in the Timeout class, and frankly the chance that this timeout implementation changes is rather small. If anything, potential bug fixes or improvements will be made in the Timeout class itself. Also, I think the efficiency of creating and calling a new closure versus choosing a code path with a simple if is arguable and marginal at best. I actually like the simplicity and explicitness of the new construct, but this is more style than functionality.
Good catch on the leftover initialize_timeout_enforcer; I will remove it. TimeoutException, however, is still used: it is passed to Timeout.timeout(@timeout_seconds, TimeoutException), which allows keeping all the rescue clauses intact. We should definitely keep that.
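For readers following along, here is a minimal sketch of the construct being discussed. It is not the plugin's actual code: the with_timeout helper and the standalone TimeoutException class are hypothetical stand-ins, and only Timeout.timeout(seconds, exception_class) comes from Ruby's standard library. It shows the simple if for the no-timeout path and how passing a custom exception class lets existing rescue clauses keep working.

```ruby
require "timeout"

# Hypothetical stand-in for the plugin's own exception class.
class TimeoutException < RuntimeError; end

# Hypothetical helper: runs the block under a timeout.
# A nil or non-positive value takes the plain code path with no timeout.
def with_timeout(timeout_seconds)
  if timeout_seconds.nil? || timeout_seconds <= 0
    yield
  else
    # The second argument is the exception class raised on expiry,
    # so callers can keep rescuing TimeoutException.
    Timeout.timeout(timeout_seconds, TimeoutException) { yield }
  end
end

begin
  with_timeout(0.05) { sleep 1 }
rescue TimeoutException => e
  puts "timed out: #{e.class}"
end
```

With a zero timeout, as in with_timeout(0) { ... }, the block simply runs to completion and its value is returned.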
Thanks @yaauie for the review.
Note that before merging we should do some more manual sanity tests (and bump version) and report back here. /CC @jsvd
Similar fix in logstash-plugins/logstash-filter-grok#147
These are quick smoke performance tests where EPS (events per second) is visually approximated using per-second metrics stats.
This PR, using the Timeout class, yields between 80k and 84k EPS with timeout_millis => 0 and between 78k and 82k EPS with timeout_millis => 30000.
v4.3.3 yields between 78k and 82k EPS with timeout_millis => 0 and also between 78k and 82k EPS with timeout_millis => 30000.
We can see that the performance signatures are very similar, so there should not be any noticeable impact, especially since overall performance will be limited by the real regex parsing cost. (For these tests we used extremely short strings to try to factor out the actual regex parsing cost.)
The following pipeline was used for testing performance (where timeout_millis was changed between runs).
bin/logstash -e 'input{generator{lines => ["aa", "bb", "cc", "dd"] count => 1000000}} filter{kv{timeout_millis => 0 }} output{stdout{codec => dots{}}}'
v4.4.0 published.
Per elastic/logstash#10976, the usage of native thread interruption for controlling timed execution seems to be creating problematic side effects.
This PR replaces the original timeout enforcer logic with the usage of the Timeout class, which should behave a lot better.