Need to work Fluentd with ignoring input plugins

sabottenda commented 10 years ago

I want to run Fluentd without input plugins even if the configuration has <source> sections. This means Fluentd does not receive new messages and only works on the remaining buffers. The use cases are:

- Flushing all buffers without receiving new messages when retiring servers.
- Debugging output plugins.

Of course this can be accomplished by modifying the configuration manually, but I want it to be supported officially. For example, it is difficult to modify part of the configuration for specific servers if it is provided via HTTP.

I think there are two candidates for the implementation.

1. Fluentd starts normally, then a signal such as SIGUSR2 is sent to the running process to disable input plugins.
2. Fluentd starts with an option such as --without-source to disable input plugins.

I think the second one is better because the first one requires a new signal.
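
The second candidate can be sketched roughly as follows. This is only an illustration: ConfigElement, parse_options and effective_elements are hypothetical names, not Fluentd's actual internals. Only the --without-source flag name comes from the proposal above.

```ruby
require 'optparse'

# Hypothetical stand-in for a parsed config section (NOT Fluentd's real class).
ConfigElement = Struct.new(:name, :arg)

# Parse command line arguments; only the proposed flag is handled here.
def parse_options(argv)
  opts = { without_source: false }
  parser = OptionParser.new do |op|
    op.on('--without-source', 'start without <source> sections') do
      opts[:without_source] = true
    end
  end
  parser.parse(argv)
  opts
end

# Drop every <source> element when the flag is set, so no input plugin starts.
def effective_elements(elements, opts)
  return elements unless opts[:without_source]
  elements.reject { |e| e.name == 'source' }
end

elements = [
  ConfigElement.new('source', nil),
  ConfigElement.new('match', 'app.**'),
]
opts = parse_options(['--without-source'])
puts effective_elements(elements, opts).map(&:name).inspect
# prints ["match"]
```

With this shape, the same configuration file can be used on every server, and only the command line differs on the servers being retired.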

Actually, for debugging purposes it would be good to be able to select which input plugins are disabled, but I don't care about that for now.

Are there any comments on or objections to this feature?

sriram-ns commented 10 years ago

+1 for this feature request

sonots commented 10 years ago

:+1: My colleague also said he wants the same feature. I think SIGUSR2 is fine.

tagomoris commented 10 years ago

This feature can be implemented without signals, so I think we should not use signals. +1 for a new command line option.

taichi commented 10 years ago

+1 for a new command line option. Signal handling is hard to implement on top of JRuby.

yyamano commented 10 years ago

@sabottenda In the first use case, is restarting fluentd with --without-source acceptable? Could you explain the context in a little more detail? I prefer the --without-source option, but I'm not sure if it meets all your needs.

yyamano commented 10 years ago

Another thing comes to mind. It might be overkill for these use cases, but providing web APIs to control fluentd would be nice.

sonots commented 10 years ago

Let me describe my colleague's use case. He wants this to make sure all buffers are flushed before shutdown. To do so, input must be stopped before shutdown; otherwise, we can never be sure that all buffers are flushed. He is actually having a problem where some chunks are lost on shutdown with the current Fluentd, and he is tweaking (only) in_forward so that it stops receiving new data when sent a signal.

Restarting Fluentd with --without-source does not satisfy this use case.
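
To illustrate the ordering this use case needs, here is a minimal sketch: stop the inputs first, then drain the buffers, and only then exit. FakeInput, FakeOutput and drain_and_shutdown are hypothetical stand-ins for illustration, not Fluentd's plugin API.

```ruby
# Hypothetical input: it only tracks whether it is still accepting events.
class FakeInput
  attr_reader :running

  def initialize
    @running = true
  end

  def stop
    @running = false
  end
end

# Hypothetical buffered output: queued chunks are "written" one at a time.
class FakeOutput
  attr_reader :written

  def initialize(chunks)
    @queue = chunks.dup
    @written = []
  end

  def flush_one
    chunk = @queue.shift
    @written << chunk if chunk
  end

  def empty?
    @queue.empty?
  end
end

# Stop inputs before flushing, so the queues cannot grow while draining.
def drain_and_shutdown(inputs, outputs)
  inputs.each(&:stop)                # 1. no new events arrive from here on
  outputs.each do |out|
    out.flush_one until out.empty?   # 2. drain every remaining chunk
  end
  outputs.all?(&:empty?)             # 3. true means it is safe to exit
end

input = FakeInput.new
output = FakeOutput.new(%w[chunk1 chunk2])
drained = drain_and_shutdown([input], [output])
puts "running=#{input.running} drained=#{drained} written=#{output.written.inspect}"
# prints: running=false drained=true written=["chunk1", "chunk2"]
```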

sonots commented 10 years ago

Another option is to use debug_agent instead of a signal. I recall that debug_agent was going to be enabled by default.

But I do not like another port being used by default, and I believe some users will have the same opinion. A solution would be to use a UNIX domain socket by default, but UNIX domain sockets do not work on Windows. Hmm...

sriram-ns commented 10 years ago

Thanks sonots. That's my use case too. But if the feature is implemented with the --without-source option, I was planning to do the following:

1. Shut down fluentd. We may still be left with some buffer data (not sure if I will lose some data here!).
2. Start fluentd again with the --without-source option.
3. Shut down again, ensuring that the entire buffer is flushed.

One thing with user signals: what if fluentd has to support "n" different features like this (not sure if it will ever have to)? You can't have n different USR signals, right? I am new to fluentd, so please correct me if I have misunderstood something here. Thanks for your support.

sonots commented 10 years ago

@sriramflydata In the case of mem_buffer, it does not work...

sabottenda commented 10 years ago

> He wants this to make sure all buffers are flushed before shutdown.

This is exactly what I want. In my case, it is acceptable to restart fluentd with a command line option such as --without-source because I don't use mem_buffer in such cases.

repeatedly commented 10 years ago

Since Fluentd v0.10.50, input plugins have been stopped before output plugins, so stability is better than before.

I want to avoid adding a signal because handling signals is hard in some environments and signal slots are limited. A better approach seems to be providing an option or another agent plugin. Hmm...

repeatedly commented 10 years ago

Is --without-source good enough for most cases? If so, I will implement it.

tagomoris commented 10 years ago

As a future option, we could use UNIX domain sockets at a path containing the pid (only for Unix-like OSs ...) for fluentd daemon control, instead of signals. TCP ports are also available for this purpose, but port number selection is another confusing point, especially in multi-process environments.
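
A rough sketch of what such a control channel might look like, for Unix-like OSs only. Everything here is an assumption for illustration: the ControlServer class, the socket file name, and the disable_input command are hypothetical, and nothing like this exists in Fluentd as described in this thread.

```ruby
require 'socket'
require 'tmpdir'

# Hypothetical control server listening on a UNIX domain socket whose
# file name contains the pid, so several processes do not collide.
class ControlServer
  attr_reader :path, :inputs_enabled

  def initialize(dir)
    @path = File.join(dir, "ctl.#{Process.pid}.sock")
    @inputs_enabled = true
    @server = UNIXServer.new(@path)
  end

  # Accept one client and handle a single line-based command.
  def handle_one
    client = @server.accept
    command = client.gets.to_s.strip
    if command == 'disable_input'
      @inputs_enabled = false
      client.puts 'ok'
    else
      client.puts 'unknown command'
    end
  ensure
    client&.close
  end

  def close
    @server.close
    File.delete(@path) if File.exist?(@path)
  end
end

Dir.mktmpdir do |dir|
  server = ControlServer.new(dir)
  worker = Thread.new { server.handle_one }

  sock = UNIXSocket.new(server.path)
  begin
    sock.puts 'disable_input'
    reply = sock.gets.strip
  ensure
    sock.close
  end

  worker.join
  puts "reply=#{reply} inputs_enabled=#{server.inputs_enabled}"
  # prints: reply=ok inputs_enabled=false
  server.close
end
```

Unlike signals, a line-based command channel like this is not limited to a fixed number of slots, since new commands are just new strings.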

sonots commented 10 years ago

Using a named pipe would be good, as a future option?

repeatedly commented 10 years ago

I'm not sure how much time we need to implement the UNIX domain socket approach.

@tagomoris @sonots Can you estimate the implementation cost?

tagomoris commented 10 years ago

Implementing an RPC layer on HTTP is not so difficult, but we would need more gem dependencies, especially rack. That brings some maintenance trouble. Without HTTP, we would need to design a text/binary protocol from scratch, which will take more time and is not easy work.

repeatedly commented 10 years ago

I sent a PR for the --without-source option. See: https://github.com/fluent/fluentd/pull/377

fluent / fluentd

Need to work Fluentd with ignoring input plugins #279