Closed: @chikinchoi closed this issue 3 years ago
@chikinchoi Can you set the following environment variable and see if the load balancing improves?
$ export SERVERENGINE_USE_SOCKET_REUSEPORT=1
$ fluentd -c your-config.conf
Here is some background note:
The "multi process workers" feature is not working. ... For example, worker0 is 98% and worker1 is 0%.
This is actually a common issue among server products on Linux. Nginx has the exact same issue:
https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/
The core problem is that Fluentd itself has no load-balancing mechanism.
It just prepares a bunch of worker processes, each listening on a shared socket.
When a request arrives, every worker wakes up and races to accept(); whoever gets there first wins and takes the task.
This model works poorly on Linux, because Linux often wakes the busiest process first. So there is no load balancing: a single worker wins the race again and again, leaving the other workers idle.
The SERVERENGINE_USE_SOCKET_REUSEPORT option mentioned above was introduced in https://github.com/treasure-data/serverengine/pull/103 specifically to resolve this issue.
This is experimental and not well documented, but it's worth a try if the above issue is bugging you.
Hi @fujimotos ,
Thank you for your quick reply! Does this mean this uneven behavior is expected for the multi-worker feature? For the SERVERENGINE_USE_SOCKET_REUSEPORT parameter, is it OK to add it to the Dockerfile? Below is my Dockerfile:
FROM fluent/fluentd:v1.11.1-1.0
# Use root account to use apk
USER root
# below RUN includes plugin as examples elasticsearch is not required
# you may customize including plugins as you wish
RUN apk add --no-cache --update --virtual .build-deps \
        sudo build-base ruby-dev \
 && sudo gem install fluent-plugin-elasticsearch -v 4.2.2 \
 && sudo gem install fluent-plugin-prometheus \
 && sudo gem sources --clear-all \
 && sudo gem install elasticsearch-xpack \
 && sudo gem install fluent-plugin-record-modifier \
 && sudo gem install fluent-plugin-concat \
 && sudo gem install typhoeus \
 && sudo gem install fluent-plugin-string-scrub \
 && apk add curl \
 && apk del .build-deps \
 && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem
COPY fluent.conf /fluentd/etc/
RUN mkdir /var/log/fluent
RUN chmod -R 777 /var/log/fluent
RUN chown -R fluent /var/log/fluent
RUN sniffer=$(gem contents fluent-plugin-elasticsearch|grep elasticsearch_simple_sniffer.rb ); \
echo $sniffer
# fluentd -c /fluentd/etc/fluent.conf -r $sniffer;
COPY entrypoint.sh /bin/
RUN chmod +x /bin/entrypoint.sh
# USER fluent
Does this mean this uneven behavior is expected for the multi-worker feature?
@chikinchoi Right. The uneven worker load is an open issue on Linux.
One proposed solution is SERVERENGINE_USE_SOCKET_REUSEPORT. It's promising, but still in the experimental stage, so we haven't enabled the feature by default.
For the SERVERENGINE_USE_SOCKET_REUSEPORT parameter, is it ok to add it into dockerfile?
In your use case, I think the best place to set the env is /bin/entrypoint.sh. Add the export line just before the main program invocation.
Here is an example:
#!/bin/bash
export SERVERENGINE_USE_SOCKET_REUSEPORT=1
fluentd -c /fluentd/etc/fluent.conf
Hi @fujimotos ,
Thank you for your reply. I am testing adding "SERVERENGINE_USE_SOCKET_REUSEPORT" to the entrypoint.sh and will let you know the result once done.
Right. The uneven worker load is an open issue on Linux.
For the uneven worker load issue, I read the Fluentd documentation and saw that there is a "worker N-M" directive. May I know what the purpose of "worker N-M" is, if the uneven worker load issue is expected behavior? Thank you very much!
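For context, the `<worker N-M>` directive does not balance incoming load; it pins specific plugin instances to a range of workers. A minimal sketch of such a config (the plugin type and port are illustrative, not taken from this thread):

```
<system>
  workers 2
</system>

# This source runs only on workers 0 and 1
<worker 0-1>
  <source>
    @type forward
    port 24224
  </source>
</worker>
```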
@fujimotos, I have added "SERVERENGINE_USE_SOCKET_REUSEPORT" in entrypoint.sh as the script below, but found that the load on each worker is still different. The fluentd_output_status_buffer_available_space_ratio of worker0 is 92.7% and worker1 is 99.2%. Is this difference expected? Also, how can I verify that the "SERVERENGINE_USE_SOCKET_REUSEPORT" variable is working?
#!/bin/sh

# Source vars if the defaults file exists
DEFAULT=/etc/default/fluentd

export SERVERENGINE_USE_SOCKET_REUSEPORT=1

if [ -r $DEFAULT ]; then
    set -o allexport
    . $DEFAULT
    set +o allexport
fi

# If the user has supplied only arguments, append them to the `fluentd` command
if [ "${1#-}" != "$1" ]; then
    set -- fluentd "$@"
fi

# If the user does not supply a config file or plugins, use the defaults
if [ "$1" = "fluentd" ]; then
    if ! echo $@ | grep ' \-c' ; then
        set -- "$@" -c /fluentd/etc/${FLUENTD_CONF}
    fi
    if ! echo $@ | grep ' \-p' ; then
        set -- "$@" -p /fluentd/plugins
    fi
    set -- "$@" -r /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-elasticsearch-4.2.2/lib/fluent/plugin/elasticsearch_simple_sniffer.rb
fi

df -h
echo $@
echo $SERVERENGINE_USE_SOCKET_REUSEPORT
exec "$@"
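One way to check whether the variable is actually visible to the running process (a hedged sketch: it assumes a Linux /proc filesystem, that pgrep and ss are available in the container, and the port 24224 is only illustrative):

```shell
# Print the environment of a running fluentd process (Linux only)
pid=$(pgrep -f fluentd | head -n 1)
tr '\0' '\n' < "/proc/${pid}/environ" | grep SERVERENGINE_USE_SOCKET_REUSEPORT

# With SO_REUSEPORT active, each worker binds its own listening socket,
# so the same listen port should appear once per worker:
ss -ltn | grep 24224
```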
The fluentd_output_status_buffer_available_space_ratio of worker0 is 92.7% and worker1 is 99.2%. Is this difference expected?
@chikinchoi I think a small difference is expected.
You originally reported that the space usage (fluentd_output_status_buffer_available_space_ratio) was:
worker0 is 98% and worker1 is 0%.
So worker1 was obviously overworking. On the other hand, the current status is:
worker0 is 92.7% and worker1 is 99.2%.
So I consider this progress, better than the 98% vs 0% usage.
@fujimotos I found worker0 at 71% and worker1 at 0% today. It seems this is still progress, but do you think there is any way to make it better?
do you think there is any way to make it better?
@chikinchoi As far as I know, there is no other option that can improve the task distribution.
Edit: There is a fix being proposed at the Linux kernel level, but the kernel maintainers are not convinced by that patch.
So I believe SERVERENGINE_USE_SOCKET_REUSEPORT is currently the best Fluentd can do to distribute the task load evenly.
Thanks for the resolution. I tried it, but after adding "export SERVERENGINE_USE_SOCKET_REUSEPORT=1", the other workers (I am using 6 workers in my configuration) started utilizing CPU only for a very short period of time, ~2 minutes, and after that everything reverted back to how it was before.
Also, I am sending the logs to New Relic using Fluentd, and for most of the servers/clusters it is working fine, but for a few of them it shows lags of 2 hours, sometimes going beyond 48 hours.
Surprisingly, the logs for one of the namespaces I have in my K8s cluster stream live into New Relic, while for another namespace I am facing this issue. I have tried using the directive as well as the solution provided above, which reduced the latency from hours to somewhere close to 10-15 minutes, but I am still not getting the logs without lag.
Any troubleshooting step would be appreciated.
I'm facing the same problem. Is there any other solution in addition to SERVERENGINE_USE_SOCKET_REUSEPORT?
So, the load is unbalanced even when setting SERVERENGINE_USE_SOCKET_REUSEPORT? How much difference does it make?
A lot of difference. This is a picture of the buffers from yesterday:
As you can see, worker 1's buffer is increasing while the others are empty.
Thanks.
@jvs87 Thanks!
Does this occur even when setting SERVERENGINE_USE_SOCKET_REUSEPORT?
Yes, it is declared in the env:
Thanks. I see...
I am surprised to see so much imbalance, even with reuseport. When I applied reuseport on nginx, the load was much more evenly distributed.
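For comparison, nginx enables the same kernel mechanism per listen socket; a minimal sketch (the port and server block are illustrative):

```
# nginx: with "reuseport", each worker process gets its own listening
# socket and the kernel distributes new connections between them.
server {
    listen 8080 reuseport;
}
```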
We may need to investigate the cause.
Note: https://github.com/uken/fluent-plugin-elasticsearch/issues/1047
Yes, I'm a little in the dark and don't know whether the problem is related to multi-process workers or, on the other hand, to a bad use of the buffer.
Hi. Do you need any other test?
Describe the bug: The "multi process workers" feature is not working. I have defined 2 workers in the system directive of the Fluentd config. However, when I use Grafana to check the performance of Fluentd, the fluentd_output_status_buffer_available_space_ratio metrics of the workers are significantly different. For example, worker0 is 98% and worker1 is 0%.
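The two-worker setup described above is enabled through the `<system>` directive; a minimal sketch (not the reporter's full config, which is omitted here):

```
<system>
  workers 2
</system>
```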
To Reproduce: please use the Fluentd config below:
Expected behavior: I expect that the fluentd_output_status_buffer_available_space_ratio should be even, as the distribution of load to each worker should be even too.
Your Environment