toshitanian / fluent-plugin-out-http-ext

A generic fluentd output plugin for sending logs to an HTTP endpoint. (deprecated)

Using fluentd as a tracking server: is it bulletproof? Will I lose lines of logs? #19

Open nitzanav opened 7 years ago

nitzanav commented 7 years ago

I just thought of using this plugin to make a scalable tracking server out of nginx+fluentd.

I thought you might be interested in helping me out. I posted a GitHub issue and a Stack Overflow question: https://github.com/ento/fluent-plugin-out-http/issues/17 http://stackoverflow.com/questions/41434437/fluentd-filebeat-rsyslog-output-to-script-curl

Sorry for opening an issue for a question, but I didn't know who else to turn to :) Hope to get your support.

toshitanian commented 7 years ago

Hi @nitzanav, I checked the issues. It's an interesting use case to use this plugin for message queuing. Basically, this plugin is bullet-proof when you configure it properly.

We use this plugin in our system and it's running on thousands of servers. Here are some tips.

HTTP error responses sometimes block processing

When this plugin gets a non-OK response such as 400, it raises an error and retries sending the same record indefinitely. For example, once a log line with data the server considers invalid is generated, that line blocks all the following log lines. You can use ignore_http_status_code if you don't care about error status codes.
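As a minimal sketch of that setting (the match tag, endpoint URL, and even the output type name `http_ext` are assumptions here; check the plugin's README for the exact registered type):

```
# Hypothetical match section; tag pattern and endpoint_url are placeholders.
<match app.access>
  type http_ext
  endpoint_url http://collector.example.com/logs
  # Do not raise (and retry forever) on non-OK HTTP status codes
  ignore_http_status_code true
</match>
```

With this flag set, a record the server rejects is dropped instead of blocking every record behind it.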

You can put a file buffer in front of this plugin.

This plugin itself doesn't have a buffering feature. You can use the bufferize plugin to put a buffer in front of it. With proper settings, data loss will be minimised. You have to avoid putting the buffer in a volatile path like /tmp.
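A sketch of that wrapping, assuming fluent-plugin-bufferize's nested `<config>` section (the tag, paths, and endpoint are placeholders):

```
<match app.access>
  type bufferize
  buffer_type file
  # Persistent path, not volatile storage like /tmp
  buffer_path /var/log/fluentd/buffer/http
  <config>
    type http_ext
    endpoint_url http://collector.example.com/logs
  </config>
</match>
```

The outer bufferize output owns the file buffer and retry loop; the inner `<config>` block is the HTTP output it wraps.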

nitzanav commented 7 years ago

Thank you so much for replying.

Actually, I didn't get the blocking issue. How would a retry block? With a buffered output you have exponential back-off with a doubling retry-wait, so the other lines continue to be processed and no blocking occurs. I see the bufferize plugin extends BufferedOutput, so it should be OK, shouldn't it?

Here is my plan, can you give me feedback if it is a good plan?

  1. plugins - I am going to use the out-http plugin with the bufferize plugin. I was also considering the combined out-http-buffered plugin, but its last commit is from 2013, so I would have to extend it.

  2. retries - 4xx and 5xx errors are common in my scenario, and I wish to retry, not lose a single line, and never make multiple HTTP requests for the same line. The exponential back-off and the default retry_limit of 17 are good for me. I will set a secondary output to file.

  3. chunk size = 1 - buffer_chunk_limit will be set to 1 :o) Why? Because HTTP failures are common and a duplicate HTTP request for the same row is bad: when a chunk fails in the middle, a retry will resend the first lines that already succeeded in the previous try, which is not good. So I would rather make each row a separate chunk that is retried on its own. I am aware this will introduce performance issues; I will increase thread_limit as much as needed.

  4. queue - buffer_queue_limit=1M with buffer_type=file and buffer_queue_full_action=exception, plus an SMS notification. No data loss on exception, right?

  5. flush ASAP - a minimal flush_interval=1s, to reduce the queue size and get results into analytics ASAP. The chunk size is 1 anyway. Is it safe?

  6. HA - I don't want to lose a single line. I plan no fancy HA network topology: just a single nginx access log with a single fluentd agent that calls HTTP. Since I have one fluentd for each nginx, I have nothing more to do to improve HA. Or is there anything more I can improve?
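Putting the plan above into one match section might look roughly like this (a sketch only: the tag, URL, and paths are placeholders, the parameter names follow the old v0.12 buffered-output conventions, and whether bufferize passes a `<secondary>` section through to its wrapped output is something to verify):

```
<match nginx.access>
  type bufferize
  buffer_type file
  buffer_path /var/log/fluentd/buffer/tracking
  buffer_chunk_limit 1            # one record per chunk (step 3)
  buffer_queue_limit 1000000      # large on-disk queue (step 4)
  buffer_queue_full_action exception
  flush_interval 1s               # flush ASAP (step 5)
  retry_limit 17                  # default exponential back-off (step 2)
  <config>
    type http_ext
    endpoint_url http://collector.example.com/track
  </config>
  <secondary>
    type file                     # fallback after retries are exhausted (step 2)
    path /var/log/fluentd/failed/tracking
  </secondary>
</match>
```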

toshitanian commented 7 years ago

I think it's good :) You can try it and fix things as you go 👍