Closed andrewvc closed 6 years ago
Left some minor notes, the thread interruption handling logic seems good.
nit: I noticed that @cancel_mutex
is useless now, we could remove it.
@andrewvc @jordansissel following up on our conversation about ConcurrentHashMap
forEach
behaviour in this situation:
@threads_to_start_time.forEach do |thread, start_time|
@threads_to_start_time.compute(thread) do |thread, start_time|
...
end
end
From what I can read in https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/package-summary.html#Weakly
Most concurrent Collection implementations (including most Queues) also differ from the usual java.util conventions in that their Iterators and Spliterators provide weakly consistent rather than fast-fail traversal:
they may proceed concurrently with other operations
they will never throw ConcurrentModificationException
they are guaranteed to traverse elements as they existed upon construction exactly once, and may (but are not guaranteed to) reflect any modifications subsequent to construction.
So I am not sure about the behaviour of forEach
but I would suspect it is also weakly consistent so I think instead of using @threads_to_start_time.compute
we could simply use @threads_to_start_time.computeIfPresent
and that would account for the possibility of having the thread removed from the collection while in the forEach
loop.
@colinsurprenant makes sense to move to computeIfPresent
, I'll make that improvement.
I changed this in a few ways:
start_thread_grokking
method for simplicity (it was called in one spot)compute
to computeIfPresent
thread.interrupted
before we start grokking. This is unnecessary@andrewvc nit: just noticed that @cancel_mutex
is still defined but unused https://github.com/logstash-plugins/logstash-filter-grok/blob/eb88860b88beb999e4c2ab329fbe89bb8026e020/lib/logstash/filters/grok/timeout_enforcer.rb#L12
@colinsurprenant just removed that extra line, good catch!
Another observation: this is not part of the change set but might be a good idea to modify: the @running
bool is used across threads to control termination, I'd suggest we make it an AtomicBoolean
to make it explicitly threadsafe. https://github.com/logstash-plugins/logstash-filter-grok/blob/eb88860b88beb999e4c2ab329fbe89bb8026e020/lib/logstash/filters/grok/timeout_enforcer.rb#L31
@andrewvc I leave it to you to decide for @running
since I believe this will not have practical impact.
LGTM!
Really good job on this!!
@colinsurprenant moved @running
to an atomic boolean, good catch. Apparently it didn't cause a problem before, but it definitely wasn't right.
Andrew Cholakian merged this into the following branches!
Branch | Commits |
---|---|
master | 82ec77961a497b0e1066af08fadbe6b6f84146c7 |
@andrewvc , Is this fix available in LS 5.5.2?
Seems like I have only 3.4.4, and If I understand correctly, this fix is in 4.0.1 ?
$ bin/logstash-plugin update logstash-filter-grok
Updating logstash-filter-grok
Updated logstash-filter-grok 3.4.2 to 3.4.4
Just confirmed I was able to upgrade in LS 5.6.3
$ bin/logstash-plugin update logstash-filter-grok
Updating logstash-filter-grok
Updated logstash-filter-grok 4.0.0 to 4.0.1
This should keep perf close enough. It's a little slower, but worth it for the consistency.
Old perf as measured with
time
: 54.57 real 216.58 user 3.51 sys New perf as measured withtime
: 59.58 real 238.70 user 4.21 sysTest config: