emsearcy / fluent-plugin-gelf

Buffered fluentd output plugin to GELF (Graylog2)
Apache License 2.0
33 stars 57 forks source link

Perhaps separating out the plugins? #1

Open kiyoto opened 10 years ago

kiyoto commented 10 years ago

Hi Eric-

Thanks for the awesome work. I noticed that inside lib/fluent/plugin are three plugins. I would say in_gelftail and out_gelf2file can be yanked out of the repo for the following reasons:

I propose that this plugin focus on out_gelf only (and I am more than happy to add tests and package it for Rubygems and send pull requests).

Let me know what you think.

emsearcy commented 10 years ago

I understand how in_tail could be used with custom parsers. One advantage for me, locally (and this could perhaps be retained a site module of mine that extends in_tail) was reuse of large, unwieldy parsers across lots of systems without a lot of copying and pasting. At least within the same config file, is there variable support so that I can define a string once and then use it in several places?

That said, my main reason to create in_gelftail was adding host, and I'm looking into record_reformer now. Can you help me understand something: their examples just show them matching and changing something, but it seems like this would result in nothing being done with the record. Would one create a second <match **>, or nest both "record_reformer" and, say, "forward" inside a "copy" output?

In short, I think the in_gelftail plugin has a lot of value for someone using out_gelf. in_tail has a bunch of presets, but none of them work with GELF because they don't create named captures for full_message or short_message. I would ideally see in_gelftail as having GELF variants of each of the presets of in_tail (syslog, csv), ready-to-use for people, and yet still allowing people to make their own custom parsers. And by extended the current in_tail module, I'm not "reinventing the wheel" and it's actually a very OOP-appropriate thing to do (and I even was able to fix and report an upstream bug in in_tail while developing in_gelftail!).

As for out_gelf2file: you are correct that this has little to do with GELF output, but it is related to my in_gelftail's presets and the format of GELF messages. The idea is that if I'm parsing files in a GELF compatible way, then in addition to fields, I also have a copy of the entire, un-parsed string in full_message (and if not, I have a truncated version in short_message). As long as I'm passing around logs with fluentd, if I want to replicate original log entries verbatim as on the source server—a log that I read with in_gelftail—than all I have to do is output the full_message field into my output file. I'm currently using this with my web server logs: in addition to grabbing all the fields for analysis in elasticsearch, I also have the original log message because it's part of the GELF format, and I can use fluentd to centrally write request logs from multiple web frontends of a single domain into one centralized log on my logging server, which is compatible with traffic analytics programs like awstats without having to do time-ordered log merges.

I supposed with out_gelf2file, a good question is whether or not "write this one field without any other fields, without any field name, and without any delimeters" is something that is easily done and understandable by someone who is reading the fluentd config?

(Also: I really need to document these three in my README.md; I think that would help explain the value of them as well as how they can easily work together with a very easy-to-understand fluentd config.)

kiyoto commented 10 years ago

Eric- Thanks for the feedback.

  1. record_reformer: Indeed, record_reformer can be used to modify the value of the existing field since it can accept a Ruby expression. For example, if you want to downcase a field called "name", then you can do

    name ${name.downcase}

    in your record_reformer config. Note that it has to be an expression.

  2. I now understand your point about in_gelftail better. I think it is fair to keep it there. As you said, it would be great to document what it does.
  3. Quoting you:

    I supposed with out_gelf2file, a good question is whether or not "write this one field without any other fields, without any field name, and without any delimeters" is something that is easily done and understandable by someone who is reading the fluentd config?

    The answer is yes (to be fair, this was not the case until recently). Now, Fluentd has a (soon to be documented) feature called TextFormatter, which makes text formatting pluggable for any output plugins that implements it. out_file is one of them, and one of the default formats is called "single_value". So, if you specify

    type file
    format single_value
    message_key just_this_field_pal # default key is "message"
    add_newline true # true by default, can disable it to NOT append "\n"

    Then, I believe we might be able to do away with out_gelf2file.

In sum, here is my revised and better informed suggestion:

  1. We are keeping in_gelftail
  2. We remove out_gelf2file but supplement the repo with an example config with proper out_file config
  3. Add README.md, tests, packages, etc. All the goodies

Let me know your thoughts!