Basic Systemd Journal Entry Mutation

emacski commented 7 years ago

Hi,

I needed a bit more flexibility in tailoring the resulting journal entry object to conform to certain conventions and existing elasticsearch mappings that I use for logging. I created a journal entry mutator to encapsulate the transformation logic and integrated it with the fluentd systemd input plugin as well as wrapping it with a fluentd filter plugin. Configuring the functionality is the same for the input plugin and filter plugin.

Option	Type	Description
`field_map`	hash/object	map source fields to existing or new fields (can also combine fields)
`field_map_strict`	bool	if true, only destination fields from `field_map` are included in the result
`fields_strip_underscores`	bool	same functionality as before, but is only applied to non-mapped fields
`fields_lowercase`	bool	lowercase all non-mapped field

Input Plugin Config Example:

<source>
  @type systemd
  path /var/log/journal
  filters [{ "_SYSTEMD_UNIT": "docker.service" }]
  read_from_head true
  tag docker
  <entry>
    field_map {"MESSAGE": "msg", "_PID": ["process", "pid"], "_CMDLINE": "process", "_COMM": "cmd"}
    field_map_strict false
    fields_strip_underscores true
    fields_lowercase true
  </entry>
  <storage>
    @type local
    persistent true
    path /var/log/flutned/docker.log.pos
  </storage>
</source>

Filter Plugin Config Example:

<filter docker>
  @type systemd_entry
  field_map {"MESSAGE": "msg", "_PID": ["process", "pid"], "_CMDLINE": "process", "_COMM": "cmd"}
  field_map_strict false
  fields_strip_underscores true
  fields_lowercase true
</filter>

Given a source entry:

{
  "_MACHINE_ID": "bb9d0a52a41243829ecd729b40ac0bce"
  "_HOSTNAME": "arch"
  "MESSAGE": "this is a log message",
  "_PID": "123"
  "_CMDLINE": "login -- root"
  "_COMM": "login"
}

The result using the field_map from above:

{
  "machine_id": "bb9d0a52a41243829ecd729b40ac0bce"
  "hostname": "arch",
  "msg": "this is a log message",
  "pid": "123"
  "cmd": "login"
  "process": "123 login -- root"
}

If this is something you are interested in merging into the main project, I am happy to update the readme with instructions on use. Backwards compatibility is maintained with the original strip_underscores option in the input plugin with the addition of that option being marked as deprecated. Just FYI, this is my first ruby and I tried my best to be as idiomatic as possible.

errm commented 7 years ago

Wow great work @emacski looks very nice... I need to take a look at updating the syle linting to pass in master... but your code looks great...

emacski commented 7 years ago

Thanks for the kind words @errm. I did notice the style offenses in the build, but also noticed offenses for existing code. If the style settings are correct (however you want them to be), I can address the style offenses in master in a separate PR that can be merged first and then update the style for this PR. Should be easy enough, especially if it's just a matter of running reevoocop -a and doing a couple manual fixes.

errm commented 7 years ago

Yeah the style stuff was intentionally broken here https://github.com/reevoo/reevoocop/pull/20

Just didn't get round to fixing all of the collateral damage yet... I have however now fixed this on master... If you could rebase your code that would be great...

emacski commented 7 years ago

ok, I have rebased (and squashed) the branch having resolved almost all the style offenses. Regarding the remaining offense, I've never encountered (that I can remember) the enforcement of class length as part of a style guide. That being said, I'm not quite sure how strictly something like that is enforced by you, or in general. For example, it seems like it's ok to ignore class length in tests? FYI I used the ignore comment for the entry test data static vo.

As part of my effort to better understand ruby, I went ahead and did some refactoring in another branch here: https://github.com/emacski/fluent-plugin-systemd/tree/feature/journal-entry-filter-alt where I have centralized most of the integration between the standalone mutator and the fleuntd plugin framework. While I don't think it's really necessary in this case, I did create and implement my first ruby mixin as this seems like a very popular approach in ruby for code reuse. I don't typically start off with these types of abstractions for a single implementation, but was curious as to how mixins work in ruby. The mixin is reusable should there be any other systemd plugins in the future wishing to embed the mutator functionality as a feature of the plugin, while the filter still makes use of the more compositional approach. Let me know if you'd prefer the refactored version (or if you have comments on it), or if you could provide some guidance on the current version for resolving that last issue. Oh! The refactored version passes reevoocop. Almost forgot to mention that, haha.

Thanks for putting up with a ruby n00b!

reevoo-samuel commented 7 years ago

Er, hello?

This Pull Request is too damn old! Merge or close this, sucka.

errm commented 7 years ago

Hi @emacski sorry I didn't have time for a full review yet ... I will try and get to it today or tomorrow...

emacski commented 7 years ago

No worries and no rush. I am a fan of the Sam Jackson whiner though, nice.

errm commented 7 years ago

Oh and re the class length reevoocop thing, its fine to disable in tests ... for implementation, if there is some refactoring that can be done to keep things more manageable I prefer that, but if its not over by much and there isn't anything obvious I am ok with disabling that too...

emacski commented 7 years ago

I think I have covered all comments.

One thing to note, by moving the behavioral logic for a non-configured or default mutator (i.e. when an block is not defined) into the mutator itself and now that the mutator depends on fluent, it doesn't seem necessary to introduce another creational pattern such as @mutator = Systemd::EntryMutator.build(@entry_opts).

Additionally, a mutator instance now persists the config state so that implementations of the mutator don't need to worry about this while implementations now have access to config state via mutator instance properties for behavioral logic (i.e. flow control). With the tighter coupling to fluent, the mutator uses a more formalized data structure for it's internal configuration in an effort to mitigate ambiguity in maintaining parity between mutator instance configuration and the fluent plugin config (config_section) as well as making it obvious which config directives the mutator depends on.

errm commented 7 years ago

Thanks @emacski I am V.Happy with this now...

I think we should have some documentation... I am happy to just grab the contents of the PR description and chuck them in the README as a start, if that is ok?

emacski commented 7 years ago

hey @errm,

I took a quick stab at updating the README. It's basically what was in the PR description. I'm totally cool if we want to rip the README commit out if you had something else in mind. Just trying not to increase your workload for my changes.

errm commented 7 years ago

Thanks @emacski that is perfect... certainly better that I would have done...

fluent-plugins-nursery / fluent-plugin-systemd

Basic Systemd Journal Entry Mutation #36