fujiwara / fluent-plugin-suppress

fluentd plugin to suppress same messages.

Need to log suppression event count without logging event #16

Open jasonreevessimmons opened 1 year ago

jasonreevessimmons commented 1 year ago

Hello,

I had a requirement to log a count of suppression events within the last interval. Using a @log_level of debug in the plugin creates a message in the log for every suppressed event along with the full event. This is fine for development, but I didn't want to store these suppressed events - just get a count of them when they occur.

My solution was this patch, which you may freely use:

--- ./lib/fluent/plugin/filter_suppress_orig.rb 2023-03-31 09:41:52.543591973 -0500
+++ ./lib/fluent/plugin/filter_suppress.rb  2023-03-31 11:01:40.649203042 -0500
@@ -13,6 +13,7 @@
       super
       @keys  = @attr_keys ? @attr_keys.split(/ *, */) : nil
       @slots = {}
+      @suppress_counter = 0
     end

     def filter_stream(tag, es)
@@ -30,15 +31,23 @@

         # expire old records time
         expired = time.to_f - @interval
+        interval_expired = false
         while slot.first && (slot.first <= expired)
+          interval_expired = true
           slot.shift
         end

         if slot.length >= @num
           log.debug "suppressed record: #{record.to_json}"
+          @suppress_counter += 1
           next
         end

+        if @suppress_counter > 0 && interval_expired
+          log.warn "suppressed #{@suppress_counter} events within the last #{@interval} seconds for tag #{tag}"
+          @suppress_counter = 0
+        end
+
         if @slots.length > @max_slot_num
           (evict_key, evict_slot) = @slots.shift
           if evict_slot.last && (evict_slot.last > expired)

If @log_level is set to warn, then Fluentd will emit entries like this in its log file on suppression:

2023-03-31 16:17:59 +0000 [warn]: #0 suppressed 9 events within the last 1 seconds for tag unstructured
2023-03-31 16:17:59 +0000 [warn]: #3 suppressed 3 events within the last 1 seconds for tag unstructured
2023-03-31 16:17:59 +0000 [warn]: #7 suppressed 1 events within the last 1 seconds for tag unstructured
2023-03-31 16:17:59 +0000 [warn]: #6 suppressed 2 events within the last 1 seconds for tag unstructured
2023-03-31 16:17:59 +0000 [warn]: #5 suppressed 6 events within the last 1 seconds for tag unstructured

These log entries can then be forwarded to OpenSearch, used to trigger alerts within Fluentd, or handled in any other way you like.
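As one illustration of consuming these entries downstream, the sketch below pulls the worker id, count, interval and tag out of lines in the format shown above. The names `SUPPRESS_LINE` and `parse_suppression_line` are hypothetical and not part of the plugin or of Fluentd; the pattern only assumes the exact message text produced by the patch.

```ruby
# Hypothetical helper: parses the warn lines produced by the patch above.
# The pattern assumes the message format is unchanged from the patch.
SUPPRESS_LINE = /\[warn\]: \#(?<worker>\d+) suppressed (?<count>\d+) events within the last (?<interval>\d+(?:\.\d+)?) seconds for tag (?<tag>\S+)/

# Returns a hash of the parsed fields, or nil if the line does not match.
def parse_suppression_line(line)
  m = SUPPRESS_LINE.match(line)
  return nil unless m

  {
    worker: m[:worker].to_i,
    count: m[:count].to_i,
    interval: m[:interval].to_f,
    tag: m[:tag]
  }
end

# Sample lines copied from the log excerpt above.
sample = [
  "2023-03-31 16:17:59 +0000 [warn]: #0 suppressed 9 events within the last 1 seconds for tag unstructured",
  "2023-03-31 16:17:59 +0000 [warn]: #3 suppressed 3 events within the last 1 seconds for tag unstructured"
]

# Sum the per-worker counts into one total per tag.
totals = Hash.new(0)
sample.each do |line|
  parsed = parse_suppression_line(line)
  totals[parsed[:tag]] += parsed[:count] if parsed
end
puts totals.inspect
```

Because each worker reports its own count, summing per tag like this gives the total number of suppressed events across workers.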

Note also that suppression is applied per worker. If Fluentd receives 10 events and the threshold is 5, events are suppressed only if more than 5 of them reach the same worker.
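The per-worker behaviour can be seen in a small standalone sketch of the sliding-window logic. This is not the plugin code: `SuppressSketch`, `pass?` and `messages` are hypothetical names, the `messages` array stands in for `log.warn`, and each instance plays the role of one worker with its own state. Unlike the patch, which keeps a single counter, this sketch counts per key, which is a deliberate variation.

```ruby
# Standalone sketch of the sliding-window suppression with a count summary.
# Each instance holds its own state, like a single Fluentd worker would.
class SuppressSketch
  attr_reader :messages

  def initialize(interval:, num:)
    @interval = interval        # window length in seconds
    @num = num                  # events allowed per key within the window
    @slots = {}                 # key => timestamps of events that passed
    @suppressed = Hash.new(0)   # key => events dropped since the last summary
    @messages = []              # stand-in for log.warn output
  end

  # Returns true if the event passes, false if it is suppressed.
  def pass?(key, time)
    slot = (@slots[key] ||= [])

    # Expire timestamps that have fallen out of the window.
    expired = time.to_f - @interval
    interval_expired = false
    while slot.first && slot.first <= expired
      interval_expired = true
      slot.shift
    end

    if slot.length >= @num
      @suppressed[key] += 1
      return false
    end

    # Report the count once the window has moved on.
    if @suppressed[key] > 0 && interval_expired
      @messages << "suppressed #{@suppressed[key]} events within the last #{@interval} seconds for #{key}"
      @suppressed[key] = 0
    end

    slot.push(time.to_f)
    true
  end
end

# Ten events in quick succession with a threshold of 5.
one_worker = SuppressSketch.new(interval: 1, num: 5)
passed_single = (0...10).count { |i| one_worker.pass?("unstructured", 100.0 + i * 0.01) }

# The same ten events spread evenly over two workers.
two_workers = Array.new(2) { SuppressSketch.new(interval: 1, num: 5) }
passed_spread = (0...10).count { |i| two_workers[i % 2].pass?("unstructured", 100.0 + i * 0.01) }

puts "single worker passed #{passed_single}, two workers passed #{passed_spread}"
```

With one worker, only the first 5 events pass. Spread over two workers, each sees 5 events, never exceeds the threshold, and all 10 pass.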

Enjoy!

fujiwara commented 1 year ago

@jasonreevessimmons Thank you!

It looks good. Please create a pull request.

But these warning logs should be optional. I think we need to introduce a new configuration flag to keep compatibility.
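A minimal sketch of what such an opt-in flag could look like, written as plain Ruby so it runs without Fluentd. The flag name `log_suppressed_count` and the class `SuppressCountReporter` are hypothetical; in the plugin itself the flag would presumably be declared with Fluentd's `config_param` as a boolean defaulting to false, so existing configurations keep their current behaviour.

```ruby
# Sketch only: the counting from the patch, gated behind an opt-in flag.
# `warnings` stands in for log.warn so the example has no dependencies.
class SuppressCountReporter
  attr_reader :warnings

  # log_suppressed_count is a hypothetical option name; off by default
  # so that nothing changes unless the user enables it.
  def initialize(log_suppressed_count: false)
    @log_suppressed_count = log_suppressed_count
    @suppress_counter = 0
    @warnings = []
  end

  # Called once for every suppressed event.
  def record_suppressed
    @suppress_counter += 1
  end

  # Called when old timestamps have expired from the window.
  def interval_expired(tag, interval)
    return if @suppress_counter.zero?

    if @log_suppressed_count
      @warnings << "suppressed #{@suppress_counter} events within the last #{interval} seconds for tag #{tag}"
    end
    @suppress_counter = 0
  end
end

reporter = SuppressCountReporter.new(log_suppressed_count: true)
3.times { reporter.record_suppressed }
reporter.interval_expired("unstructured", 1)
puts reporter.warnings
```

With the default of false, the counter is still reset but no warning is emitted, which matches the plugin's behaviour before the patch.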

jasonreevessimmons commented 1 year ago

For that reason I didn't submit a pull request. I don't have a "big picture" understanding of the project, so I'm happy to let you take my code and integrate it how you'd like.