original-brownbear closed this 6 years ago
I think you can also get an easy performance boost by using a thread pool here (or an executor service) instead of creating a new thread every time.
@jakelandis but I'm only creating that thread once when starting Grok? I'm not creating any new threads periodically as far as I can see.
Ahh, never mind, I was misunderstanding how this worked. (Also why I shouldn't do the review for this change :) )
@andrewvc how about this one https://github.com/logstash-plugins/logstash-filter-grok/pull/125/commits/1601d93a5493af4928193ed61ca1d5c0f315d62c ?
This actually seems to be sufficient to make things safe (enough) without locking in non-timeout operation:

By using the return value of `stop_thread_groking` to determine whether or not there was actually a grokking in progress when I called the method (this is safe because `remove` is atomic on the concurrent hashmap), I avoid accidentally killing a thread that is currently doing some non-grok filter/output action. The worker does not leave the `grok_till_timeout` call unless I know that the timeout enforcer has finished its `interrupt` call (note that it will only go into the branch containing the `interrupt` if `stop_thread_groking(thread)` was called in the worker thread before).

Let me know what you think :) This change seems to actually make things faster (in theory; benchmarks didn't change for me) since we save some redundant `java.lang.Thread.interrupted` and accidental `thread.interrupt()` calls.
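To illustrate the "atomic remove decides who wins" pattern described above, here is a minimal sketch with hypothetical names. The real plugin uses a `java.util.concurrent.ConcurrentHashMap` under JRuby, whose `remove` is atomic; a Mutex-guarded Hash stands in here so the example runs on plain Ruby:

```ruby
# Sketch of the timeout-enforcer bookkeeping. Class and method names are
# illustrative; only the shape of the pattern matches the discussion above.
class TimeoutEnforcerSketch
  def initialize
    @threads_to_start_time = {} # thread => start time of its current grok
    @lock = Mutex.new           # stand-in for ConcurrentHashMap's atomicity
  end

  def start_thread_groking(thread)
    @lock.synchronize { @threads_to_start_time[thread] = Time.now }
  end

  # Returns true only if this call actually removed the entry, i.e. a grok
  # was still in flight. Because removal is atomic, exactly one of the worker
  # and the enforcer can "win" for a given grok invocation.
  def stop_thread_groking(thread)
    @lock.synchronize { !@threads_to_start_time.delete(thread).nil? }
  end

  def cancel_if_timed_out(thread)
    # Only interrupt when we were the ones to remove the entry; otherwise the
    # worker already finished grokking and may be doing unrelated work.
    if stop_thread_groking(thread)
      :interrupted # real code would call thread.interrupt() here
    else
      :skipped
    end
  end
end
```

A second `cancel_if_timed_out` on the same grok finds the entry already gone and becomes a no-op, which is exactly the "never attempt to cancel this thread twice" property discussed below.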
@original-brownbear not to derail, but I'm open to having a timeout feature in the Grok library itself if you think that is more appropriate than having Logstash wrap Grok with this interruption feature.
@jordansissel hmm I don't know enough about the Grok build to answer that I fear :( but :) ...
assuming we can get some Java code compiled into the Grok build, we could probably build a somewhat faster and much, much cleaner solution by simply wrapping the joni regex library on the Java end, so that we get:

def execute(text)
  @regexp.match(text, timeout)
end

which has the handling of the enforcer thread and the interrupting all in Java (should be a visible memory-wise gain to save at least one Ruby thread, plus we can potentially implement some of that interrupting logic a little faster in Java).
More importantly, it would make what @andrewvc suggested (making the timeout logic not run at all when configured with a timeout of 0) much cleaner as well (imo).
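The timeout-0 bypass suggested above can be sketched in a few lines. This is a hypothetical illustration (method names, the counter, and the regex stand-in are all invented for the example), showing only the control flow: with a timeout of 0 the hot path never touches the enforcer bookkeeping at all.

```ruby
# Illustrative sketch of skipping the timeout enforcer when timeout == 0.
# `enforcer_calls` counts the start/stop bookkeeping so the bypass is visible.
class GrokSketch
  attr_reader :enforcer_calls

  def initialize(timeout_millis)
    @timeout_millis = timeout_millis
    @enforcer_calls = 0
  end

  def match(text)
    !!(text =~ /\d+/) # stand-in for the real joni-backed match
  end

  def grok_till_timeout(text)
    # timeout of 0: no enforcer bookkeeping on the hot path at all
    return match(text) if @timeout_millis.zero?

    @enforcer_calls += 1 # stand-in for start_thread_groking
    begin
      match(text)
    ensure
      @enforcer_calls += 1 # stand-in for stop_thread_groking
    end
  end
end
```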
=> I'm all for it, not sure we want to do that in this iteration though, this seemed urgent?
we could probably build a somewhat faster and much, much cleaner solution like by simply wrapping the joni regex library on the Java end
Yep!
I'm all for it, not sure we want to do that in this iteration though
Agreed. We don't necessarily need to do this now.
Build failure is a preexisting condition btw: https://travis-ci.org/logstash-plugins/logstash-filter-grok/jobs/296953878#L918
I re-ran the benchmarks for this new patch: https://github.com/elastic/logstash/issues/8560#issuecomment-344582851
I'm ok with this if we add an explanatory comment.
On Nov 22, 2017 6:12 AM, "Armin Braun" notifications@github.com wrote:
@original-brownbear commented on this pull request.
In lib/logstash/filters/grok/timeout_enforcer.rb https://github.com/logstash-plugins/logstash-filter-grok/pull/125#discussion_r152545854 :

- now = java.lang.System.nanoTime # save ourselves some nanotime calls
- elapsed = java.lang.System.nanoTime - start_time
- if elapsed > @timeout_nanos
- elapsed_millis = elapsed / 1000
- thread.interrupt()

Ensure that we never attempt to cancel this thread twice in the event of weird races:

- stop_thread_groking(thread)
- now = java.lang.System.nanoTime # save ourselves some nanotime calls
- @threads_to_start_time.entry_set.each do |entry|
- start_time = entry.get_value
- if start_time < now && now - start_time > @timeout_nanos
- thread = entry.get_key

Ensure that we never attempt to cancel this thread twice in the event of weird races:

- if stop_thread_groking(thread)
@andrewvc https://github.com/andrewvc you're not, but I think this is a risk we can cleanly take, because with the latest changes we will never interrupt a thread that isn't grokking (sorry for failing to comment on this more explicitly). Think about the orders of magnitude here:
- The cancel thread will do an iteration over a very small number of entries, some numeric comparisons and (if it indeed does have to interrupt some stuff) a volatile write.
In order to kill the wrong grok action, the filter + output + poll from queue + start of a new grokking action in the worker would have to be faster than a numeric comparison + ConcurrentHashMap#delete here. I think that's close to theoretically impossible and practically impossible. I can see how you could make an argument for this happening by virtue of some freak scheduling decision in the kernel, but practically it's impossible imo. Even if there is a theoretical chance (and this is easily less than 1:1M imo, looking at the number of cycles the worker has to take for this to happen vs. the number of cycles the enforcer thread uses up), the effect is at worst wrongfully cancelling a grok action.
I can add a sequence number to this to make it 100% safe (that was my initial plan I explained above), but imo it wasn't worth it performance-wise (small hit, but a hit) + complexity-wise (another ~10 lines).
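The sequence-number variant mentioned above could look roughly like this. This is a hedged sketch, not the PR's code: all names are invented, and a Mutex-guarded Hash stands in for the JRuby ConcurrentHashMap. The idea is that each grok invocation gets a fresh sequence number, and the enforcer only interrupts when the number it measured is still the active one, so a late interrupt can never hit a newer grok (or non-grok work).

```ruby
# Hypothetical sketch of the "sequence number makes cancellation 100% safe"
# idea. A cancel carrying a stale sequence number becomes a no-op.
class SeqEnforcerSketch
  def initialize
    @state = {}        # thread => [seq, start_time]
    @lock = Mutex.new
    @seq = 0
  end

  # Register a new grok invocation and hand back its sequence number.
  def start_groking(thread)
    @lock.synchronize do
      @seq += 1
      @state[thread] = [@seq, Time.now]
      @seq
    end
  end

  def stop_groking(thread)
    @lock.synchronize { @state.delete(thread) }
  end

  # Interrupt only if `seq` is still the active grok on `thread`.
  def cancel(thread, seq)
    @lock.synchronize do
      current = @state[thread]
      if current && current[0] == seq
        @state.delete(thread)
        :interrupted # real code would call thread.interrupt()
      else
        :stale       # the grok we timed already finished; do nothing
      end
    end
  end
end
```

This is the extra ~10 lines of complexity the comment refers to: the per-invocation counter plus the equality check on cancel.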
@andrewvc added some documentation on it here https://github.com/logstash-plugins/logstash-filter-grok/pull/125/commits/fcf77705d32a69a6a6ffeefee720737bd7c6b128 :)
@andrewvc Thanks! The same test failures happen on `master` as well. Will look into them later today (it's my job this week anyhow :))
Armin Braun merged this into the following branches!
Branch | Commits
---|---
master | 02121f77ce25fa5ebb53dd3dbc6f2dd539f4d723, dfde844cc0650d4779485776942cfbb19871a4c7, cff80356f05794457474dfc8ffd76f5ed55070c8, ae6e64746d4f7d2441e658f532e9da19d5bde924
@andrewvc Is 3.4.5 published to rubygems.org ?
Yes, our internal workflow for doing so hit a snag. @original-brownbear can you look at it when you have a chance?
On Nov 23, 2017 6:33 AM, "IrlJidel" notifications@github.com wrote:
Is 3.4.5 published to rubygems.org ?
@andrewvc it looks like this is a job for @jordansissel, not sure what I can do here. He's already been pinged, but I'll keep an eye on this so we won't forget to release this once things are repaired.
Published as 3.4.5
As a correction, this was released as 4.0.0; version 3.4.5 was yanked from rubygems as it broke BwC with Logstash 5.5.
This is more than a 100% speedup for me relative to `master` when testing against the test configuration posted here https://github.com/elastic/logstash/issues/8560#issue-270112018. I know Travis will probably go red on this one, but it's not broken functionally :) see below:

- (5.6.2 — but have to check later) If we don't want to break BwC I could work around things (just using a per-thread counter for the metrics and incrementing the metrics every now and then works as well, just more code)
- Removed the `yield` (/dynamic block creation) from the timeout grok method and just used an outright method call there (mostly saving memory here, not much throughput gain in my tests)
- Saved a redundant `java.lang.System.nanoTime` call :), `now` was unused
- Removed `success` (trivial, but in JRuby this actually saves memory :D)