Is your feature request related to a problem? Please describe.
Follow-up issue for https://github.com/fluent/fluentd/issues/317
The default Ruby interpreter, MRI, uses a global interpreter lock (GIL), which allows only one thread to execute Ruby code at a time. In high-load scenarios, e.g. when Fluentd is used as a log aggregator, this effectively single-threaded execution becomes a bottleneck. To work around this, one can either run multiple Fluentd instances or use Fluentd's multi-worker feature. Running multiple instances is often desirable anyway for horizontal scalability, but it requires an effective load-balancing solution and results in higher costs, since each instance must be managed. The multi-worker feature avoids these issues but has problems of its own: it has been reported that the workload is unevenly distributed between workers, resulting in poor resource utilization (https://github.com/fluent/fluentd/issues/3346). In addition, the multi-worker feature relies on separate processes and inter-process communication, which is much slower than using multiple kernel threads within the same process.
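To make the GIL effect concrete, here is a minimal sketch that runs the same CPU-bound work serially and in threads. `busy_sum`, the job count, and the iteration count are hypothetical stand-ins for record processing, not Fluentd code. On MRI the two timings are typically close to each other, while an interpreter without a GIL can run the threads in parallel.

```ruby
require "benchmark"

# Hypothetical CPU-bound task standing in for parsing/filtering log records.
def busy_sum(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

iterations = 300_000 # assumed workload size, chosen only for illustration
jobs = 4             # assumed number of concurrent tasks

# Run all jobs one after another on the main thread.
serial_results = nil
serial_time = Benchmark.realtime do
  serial_results = Array.new(jobs) { busy_sum(iterations) }
end

# Run each job in its own thread; Thread#value joins and returns the result.
threaded_results = nil
threaded_time = Benchmark.realtime do
  threads = Array.new(jobs) { Thread.new { busy_sum(iterations) } }
  threaded_results = threads.map(&:value)
end

puts format("serial:   %.3fs", serial_time)
puts format("threaded: %.3fs", threaded_time)
```

Running the same script under different interpreters is a quick way to see whether threads actually execute Ruby code in parallel there.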
Therefore, it would be great to use an interpreter without a GIL, to boost performance and lower cost. Unfortunately, JRuby and TruffleRuby are currently not able to run Fluentd. In the case of JRuby, the Fluentd dependency yajl-ruby is not supported, since it doesn't provide a native Java extension. TruffleRuby, on the other hand, is missing some functions (e.g. IO#send_io, IO#recv_io).
Describe the solution you'd like
Adapt Fluentd to allow usage of JRuby/TruffleRuby or contribute the missing functions to the other projects.
Describe alternatives you've considered
Implement Ractor-based parallelism in serverengine (https://github.com/treasure-data/serverengine/issues/107), which should also improve performance.
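For reference, a rough sketch of what Ractor-based parallelism looks like in plain Ruby. This is not serverengine code; `busy_sum` and the worker count are hypothetical. Ractor is experimental on MRI, and the way to fetch a Ractor's result differs between Ruby versions, so the sketch assumes Ruby 3.0 or later and checks which method is available.

```ruby
# Hypothetical CPU-bound task standing in for per-worker record processing.
def busy_sum(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

iterations = 300_000 # assumed workload size, chosen only for illustration

# Each Ractor has its own lock, so the blocks can run in parallel on MRI.
# Arguments are passed explicitly because a Ractor block cannot capture
# local variables from the enclosing scope.
workers = Array.new(4) do
  Ractor.new(iterations) { |n| busy_sum(n) }
end

# Newer Ruby versions expose Ractor#value; older 3.x releases use Ractor#take.
ractor_results = workers.map do |r|
  r.respond_to?(:value) ? r.value : r.take
end

puts ractor_results.inspect
```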
Additional context
No response