sowawa / fluent-plugin-slack


Fluent-plugin-slack as a filter #45

Closed pbenefice closed 6 years ago

pbenefice commented 6 years ago

The point is to be able to use this plugin as a filter. Something like:

  <filter>
    @type slack
    <regexp>
      key message
      pattern cool
    </regexp>
    webhook_url https://hooks.slack.com/services/XXX/XXX/XXX
    channel @pbenefice
    username fluent-slack
    icon_emoji :ghost:
  </filter>

What I tried to do:

  1. Re-use as much code as possible by splitting it into a slack_common namespace
  2. Inherit from GrepFilter for the filter, so that a Slack notification is only sent when a record matches the regexp pattern
  3. Copy the tests from the output plugin and adapt them to the filter: I'm stuck on the mock tests and can't figure out how to replicate them for the filter..

If you have any feedback, or maybe time to help me a bit with those tests :p

Best regards

sonots commented 6 years ago

I think that filtering is not the responsibility of the slack plugin. The fluentd architecture makes it possible to combine any kind of filter plugin with any output plugin, but this implementation combines only the grep filter.

You can write grep and slack like this:

<label @raw>
  <match **>
    @type copy
    <store>
      @type relabel
      @label @slack
    </store>
    <store>
      @type relabel
      @label @something
    </store>
  </match>
</label>

<label @slack>
  <filter **>
    @type grep
  </filter>
  <match **>
    @type slack
  </match>
</label>

<label @something>
  <match **>
    @type stdout
  </match>
</label>
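
The grep filter above is shown without a matching condition; filled in, it might look like this (the message key and cool pattern are just placeholders taken from the original proposal):

<filter **>
  @type grep
  <regexp>
    key message
    pattern cool
  </regexp>
</filter>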

Is this not sufficient? I think using fluentd's built-in relabel and filter plugin features gives more flexibility.

pbenefice commented 6 years ago

Hi @sonots,

Thanks for the quick answer.

I get what you're saying, and to be honest that's what we thought we would do at first. But this solution has a few downsides for us:

  1. It forces us to duplicate all events with the copy plugin. We have a lot of traffic and would prefer to avoid that.
  2. It adds some complexity to the configuration file, which is already pretty complicated on our side.
  3. We actually feel the filter would be more flexible: even if you already have a complete stream, you can just add a filter wherever you want, with the grep feature letting you select only the desired subset.

To give you a bit of context, here is what we are trying to achieve on our side.

We are trying to build a fully dynamic pipeline. We gather info from a tag set by fluentbit following this pattern: kubernetes.environment.namespace.pod.container.stream. Then we play with matches and tag rewrites to index the data within Elasticsearch under the final tag value:

<source>
  @type forward
</source>

# Here we gather the tag from fluentbit with a defined pattern
<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key stream
    pattern /^(.+)$/
    # tag: kubernetes.<environment>.<namespace>.<pod>.<container>.<stream>
    tag ${tag}.$1
  </rule>
  @label @input
</match>

# Here we match specific tag patterns and rewrite the tag with more "end user readable" values
<label @input>
  [...]
  <match kubernetes.production.{production,preview}-front-offices.**>
    @type rewrite_tag_filter
    <rule>
      key $.kubernetes.namespace_name
      pattern /^(.+)-front-offices$/
      # tag: my-app.<environment>.front-offices.<container>.<stream>
      tag my-app.$1.front-offices.${tag_parts[4]}.${tag_parts[5]}
    </rule>
    @label @output
  </match>
  [...]
</label>

# Finally we use this tag to create an index within Elasticsearch
<label @output>
  <match **>
    @type elasticsearch
    logstash_prefix ${tag}
    #elasticsearch magic
  </match>
</label>

We would like to use the slack filter somewhere in this configuration to fire alerts when 500 errors occur in our "front-offices" (for example). We have two options:

Option 1: Playing with copy

As you mentioned, and using your example, our stream would look as follows.

<source>
  @type forward
</source>
<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key stream
    pattern /^(.+)$/
    # tag: kubernetes.<environment>.<namespace>.<pod>.<container>.<stream>
    tag ${tag}.$1
  </rule>
  @label @input
</match>
<label @input>
  [...]
  <match kubernetes.production.{production,preview}-front-offices.**>
    @type rewrite_tag_filter
    <rule>
      key $.kubernetes.namespace_name
      pattern /^(.+)-front-offices$/
      # tag: my-app.<environment>.front-offices.<container>.<stream>
      tag my-app.$1.front-offices.${tag_parts[4]}.${tag_parts[5]}
    </rule>
    @label @output
  </match>
  [...]
</label>

<label @output>
  <match **>
    @type copy
    <store>
      @type relabel
      @label @slack
    </store>
    <store>
      @type relabel
      @label @elasticsearch
    </store>
  </match>
</label>

<label @slack>
  <filter **>
    @type grep
  </filter>
  <match **>
    @type slack
  </match>
</label>

<label @elasticsearch>
  <match **>
    @type elasticsearch
    logstash_prefix ${tag}
    #elasticsearch magic
  </match>
</label>

This seems very heavy and would force us to add a lot of configuration for each new type of notification we might add in the future. (It would also duplicate all events; we get more than 1 million hits a day and feel we shouldn't do that.)

Option 2: Use a filter version of the slack plugin

In our case we would then just need to add a filter within the output label for each notification we want to add:

<source>
  @type forward
</source>
<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key stream
    pattern /^(.+)$/
    # tag: kubernetes.<environment>.<namespace>.<pod>.<container>.<stream>
    tag ${tag}.$1
  </rule>
  @label @input
</match>
<label @input>
  [...]
  <match kubernetes.production.{production,preview}-front-offices.**>
    @type rewrite_tag_filter
    <rule>
      key $.kubernetes.namespace_name
      pattern /^(.+)-front-offices$/
      # tag: my-app.<environment>.front-offices.<container>.<stream>
      tag my-app.$1.front-offices.${tag_parts[4]}.${tag_parts[5]}
    </rule>
    @label @output
  </match>
  [...]
</label>

<label @output>
  <filter kubernetes.production.{production,preview}-front-offices.**>
    @type slack
    <regexp>
      key status
      pattern 500
    </regexp>
  </filter>
  <match **>
    @type elasticsearch
    logstash_prefix ${tag}
    #elasticsearch magic
  </match>
</label>

It feels much cleaner.

What do you think? Are there any blocking points on your side?

pbenefice commented 6 years ago

I might add that you could still combine any kind of filter plugin with the Slack output, which I didn't modify, but it seems nice to also have the choice of a "light" grep-filter version of the plugin that is easier to set up. IMHO grep seems to be the most appropriate filter to use with this plugin, and I feel it might be the most common use case. At least I think having the choice is cool, but maybe I'm missing something.
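
To sketch what that combination looks like without the PR (the webhook URL, channel, and field names are placeholders), any chain of filter plugins can already feed the Slack output:

<label @slack>
  # drop everything except records whose "status" field matches 500
  <filter **>
    @type grep
    <regexp>
      key status
      pattern 500
    </regexp>
  </filter>
  # optionally enrich the record before notifying
  <filter **>
    @type record_transformer
    <record>
      source_tag ${tag}
    </record>
  </filter>
  <match **>
    @type slack
    webhook_url https://hooks.slack.com/services/XXX/XXX/XXX
    channel alerts
  </match>
</label>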

I'll wait for your feedback.

sonots commented 6 years ago

I still feel it is not the responsibility of the slack plugin. It is also an abuse of the filter plugin mechanism.

Notice that a filter plugin blocks message handling: when slack.com is slow, your fluentd will block all messages because of slack.com. Also, the code at

https://github.com/sowawa/fluent-plugin-slack/pull/45/files#diff-9296be3957ef3fe8a4325c9946cd4cf2R53

can possibly lose messages because of slack.com, so your Elasticsearch would never receive them.
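
By contrast, an output plugin buffers and retries in the background, so a slow slack.com only delays its own notifications rather than the whole stream; a rough sketch assuming the plugin's buffered output mode (tag pattern and values are illustrative):

<match alert.**>
  @type slack
  webhook_url https://hooks.slack.com/services/XXX/XXX/XXX
  channel alerts
  # events are queued in a buffer and flushed asynchronously,
  # so upstream processing is not blocked by slack.com latency
  flush_interval 10s
</match>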

pbenefice commented 6 years ago

Ok, thanks for your feedback. I didn't realize it could block messages because of Slack.

pbenefice commented 6 years ago

I'm closing the PR, thanks for your time @sonots.