elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash
Other
72 stars 3.5k forks source link

Can we Flatten `RubyEvent` and `Event` into one? #7895

Open original-brownbear opened 7 years ago

original-brownbear commented 7 years ago

RubyEvent doesn't hold any state that isn't also in Event, just from the performance perspective it's only a burden. I can see about 1.5 * workers * batch_size Event not being GCed at any given time for small batch sizes (up to 256 and a much larger factor than 1.5 for larger batch sizes). Given that these RubyEvents prevent a ton of HashMap from being GCed, I expect a visible improvement in memory use from flattening this out.

Is there any reason not to simply move all the RubyEvent methods over to Event? (if we want to cleanly separate the Java interface we could still extract the current public methods from Event to an interface that new merged class implements?)

original-brownbear commented 7 years ago

@jordansissel @suyograo this would def. be worthwhile, tried it out and got about an ~8% throughput increase for input { generator{}} output {stdout {codec => "dots"}} which will likely be even higher for some other cases with heavy get and set action on the Event. Unfortunately, it's not really possible to make this change in a manner binary compatible with the date filter (user-agent filter also needed some hacky inheritance). Still a pretty much mandatory fix for 7.0 imo. We shouldn't have a redundant level of indirection on literally every single operation we do on an Event. => fix this in master?

jordansissel commented 6 years ago

@original-brownbear I think RubyEvent at one point was required to provide [] and []= methods to Ruby. Since we now use .set and .get methods instead in both Java and Ruby, maybe we don't need this anymore.

I did a brief glance at e5fb1d2977 where RubyEvent was introduced, and most methods delegate to the underlying org.logstash.Event except for: [], []=, to_hash, and initialize

+1 for trying to remove RubyEvent especially if we can do so without breaking Ruby plugins. Java plugins like geoip, date, and dissect filters all reference RubyEvent directly in the java code.

original-brownbear commented 6 years ago

@jordansissel

Java plugins like geoip, date, and dissect filters all reference RubyEvent directly in the java code.

Yea that's the issue here :( Even if we make Event and interface containing all public methods of the current Event class and keep everything at the same spot (package and classname) we're still not binary compatible (only API compatible => they still need a recompile) :( => if we want to do this, it's gonna be a breaking API change, it would be nice to have this though since it would enable some cleaner removal of the public references the Ruby runtime (RubyEvent has one anyways by being a RubyObject, but if we want to move away from the global constant for testability it would require keeping a redundant reference in Event).