logstash-plugins / logstash-codec-netflow

Apache License 2.0

Key used to store Netflow v9 templates is not guaranteed to be unique per template #9

Open wrigby opened 9 years ago

wrigby commented 9 years ago

This is a bit of an obscure one, and I don't know that it's actually causing anyone problems in production. I think it's very minor, but wanted to get it recorded somewhere and start a discussion in case it needs changing.

When NetFlow v9 template records come in, they're stored in a hash, with the key being a concatenation of the NetFlow Source ID field and the Template ID. (see https://github.com/logstash-plugins/logstash-codec-netflow/blob/master/lib/logstash/codecs/netflow.rb#L144-L146)
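The collision this implies can be sketched in a few lines of Ruby (the function names and key separator here are hypothetical, not the codec's actual implementation):

```ruby
# Hypothetical sketch of the two keying schemes discussed in this issue.

# Current behaviour: key built from Source ID and Template ID only.
def template_key_current(source_id, template_id)
  "#{source_id}|#{template_id}"
end

# RFC 3954-recommended behaviour: also include the exporter's IP address.
def template_key_rfc(exporter_ip, source_id, template_id)
  "#{exporter_ip}|#{source_id}|#{template_id}"
end

# Two identical routers exporting template 260 with Source ID 256
# produce the same key, so the second overwrites the first's template:
router_a = template_key_current(256, 260)
router_b = template_key_current(256, 260)

# With the exporter IP in the key, the templates stay separate:
router_a = template_key_rfc("10.0.0.1", 256, 260)
router_b = template_key_rfc("10.0.0.2", 256, 260)
```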

However, the NetFlow spec states that NetFlow collectors should use the combination of the Source ID field and the IP address of the flow exporter:

Source ID A 32-bit value that identifies the Exporter Observation Domain. NetFlow Collectors SHOULD use the combination of the source IP address and the Source ID field to separate different export streams originating from the same Exporter.

(from RFC 3954)

The Source ID field is a 32-bit value that is used to guarantee uniqueness for all flows exported from a particular device. (The Source ID field is the equivalent of the engine type and engine ID fields found in the NetFlow Version 5 and Version 8 headers). The format of this field is vendor specific. In the Cisco implementation, the first two bytes are reserved for future expansion, and will always be zero. Byte 3 provides uniqueness with respect to the routing engine on the exporting device. Byte 4 provides uniqueness with respect to the particular line card or Versatile Interface Processor on the exporting device. Collector devices should use the combination of the source IP address plus the Source ID field to associate an incoming NetFlow export packet with a unique instance of NetFlow on a particular device.

(from Cisco's NetFlow v9 documentation) (emphasis added for clarity)
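For illustration, the Cisco byte layout described above can be pulled apart with a standard pack/unpack round-trip (the Source ID value here is made up):

```ruby
# Decode an example Cisco-style 32-bit Source ID into its four bytes:
# bytes 1-2 are reserved (always zero), byte 3 identifies the routing
# engine, byte 4 the line card / Versatile Interface Processor.
source_id = 0x00000102                          # made-up example value
bytes     = [source_id].pack("N").unpack("C4")  # big-endian word -> 4 bytes
reserved  = bytes[0, 2]                         # [0, 0]
engine    = bytes[2]                            # routing engine byte
slot      = bytes[3]                            # line card / VIP byte
```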

bodgit commented 9 years ago

I'm fairly sure I did that when I initially wrote this filter^Wcodec.

I think it's fairly easy to get a collision by just having two or more bits of identical hardware as they will probably use the same source ID. Also from my experience devices tend to start with a template ID of 256 and count up.

However, if they're identical hardware, chances are the templates will be identical too, so you'll just play last-man-wins as and when they refresh their templates. But I agree, the source IP should really be used as part of the hash key.

wrigby commented 9 years ago

@bodgit that's what I've seen in practice - all of our devices are exporting the same templates anyway, so it's a non-issue right now for my deployment. It's something I noticed when hacking on support for MPLS labels (#5), so I figured I would record it here for future reference in case it does end up biting someone.

I think it may prompt some discussion about the right way for an input to pass metadata to a codec.

bodgit commented 9 years ago

I think it may prompt some discussion about the right way for an input to pass metadata to a codec.

Of course, when this was originally a filter it had access to the original event. That explains things.

wrigby commented 9 years ago

Ah, it makes much more sense now. I had a suspicion that was the case, given the comments that are in there. Thanks for the clarification!

jorritfolmer commented 9 years ago

What were the reasons for changing this plugin from a filter to a codec? I couldn't find any through Google.

Not having the source IP information available makes this codec ignore a "SHOULD" recommendation in the RFC. Currently, if you're unlucky enough to find yourself in a heterogeneous NetFlow environment, you cannot simply stand up one Logstash instance; you're forced to create multiple individual Logstash instances to avoid NetFlow template collisions. I don't think we should settle for that, unless there are good reasons, of course.

wrigby commented 9 years ago

I think it makes sense for it to be a codec, by the idealized definition of a codec and a filter (a codec being the component that deals with serializing/deserializing data to/from a wire protocol), but it unfortunately gets us into this ambiguous template key situation.

From my naive point of view, I think the best solution may be to provide a mechanism in the logstash core for inputs to provide metadata to the codecs that are decoding their payloads.

jorritfolmer commented 9 years ago

Ah yes, OK, that makes sense, thanks for your explanation. Perhaps it is time to open an issue for this in the main logstash repository?

jorritfolmer commented 9 years ago

For easy reference: current issues (both open/closed) already in logstash repo that touch on codecs and/or metadata:

elastic/logstash#3725: Split Codecs into Encoders and Decoders
elastic/logstash#3694: Use of @metadata in 2.0
elastic/logstash#3607: Codecs should support 'target' option
elastic/logstash#3218: @metadata not shown in rubydebug

jorritfolmer commented 9 years ago

I've added commit jorritfolmer@fca329f7a2b4845020d26c925540851093ba61f1 to address this, as experienced in issue #21.

Besides the above commit, it also requires a small patch to logstash-input-udp, to send metadata to the codec, as per https://github.com/logstash-plugins/logstash-codec-netflow/issues/21#issuecomment-149832184:

--- a/lib/logstash/inputs/udp.rb
+++ b/lib/logstash/inputs/udp.rb
@@ -30,6 +30,9 @@ class LogStash::Inputs::Udp < LogStash::Inputs::Base
   # before packets will start dropping.
   config :queue_size, :validate => :number, :default => 2000

+  # Should the event's metadata be included?
+  config :metadata, :validate => :boolean, :default => false
+
   public
   def initialize(params)
     super
@@ -92,10 +95,21 @@ class LogStash::Inputs::Udp < LogStash::Inputs::Base
       while true
         payload, client = @input_to_worker.pop

-        @codec.decode(payload) do |event|
-          decorate(event)
-          event["host"] ||= client[3]
-          @output_queue.push(event)
+        if @metadata
+          metadata_to_codec = {}
+          metadata_to_codec["port"] = client[1]
+          metadata_to_codec["host"] = client[3]
+          @codec.decode(payload, metadata_to_codec) do |event|
+            decorate(event)
+            event["host"] ||= client[3]
+            @output_queue.push(event)
+          end
+        else
+          @codec.decode(payload) do |event|
+            decorate(event)
+            event["host"] ||= client[3]
+            @output_queue.push(event)
+          end
         end
       end
     rescue => e

and an extra metadata => true parameter to the logstash config like this:

input {
    udp {
      port => 2055
      metadata => true
      codec => netflow {
        versions => [5,9,10]
      }
    }
}

Feedback on this is more than welcome!

wrigby commented 8 years ago

Not sure how I missed all the work you put into this @jorritfolmer - looks great; I'll try to test it out in the next week. I came across elastic/logstash#4289, which proposes a similar fix.

jorritfolmer commented 8 years ago

There has been an update to elastic/logstash#4289. With regard to the netflow codec needing packet metadata, I quote Guy Boertje from that thread:

If this is true, you are better off creating a netflow input that includes the current codec logic directly in the input - you then have access to any context directly. It can be based on the udp input.

jorritfolmer commented 8 years ago

Following up on the suggestion of @guyboertje, I've started the transition from codec to logstash-input-netflow in the logstash-input-netflow branch, based on the udp input plugin.

It works!

There is still work to be done however in the rspec department, and we'll probably need to support more than udp only.

guyboertje commented 8 years ago

@jorritfolmer - good news indeed. Ping me if you need any help.

wrigby commented 8 years ago

@jorritfolmer @guyboertje - Let me know if you guys need help, too. I'm a Ruby noob, but I definitely wouldn't mind helping out. Unfortunately, I'm at a different company now (outside of the telco space), so I don't have any test data anymore :(

jorritfolmer commented 8 years ago

Small status update:

I've finished work on logstash-input-netflow, which is waiting as a PR here

However, @jordansissel pointed out that this move to an input plugin isn't necessary given the availability of something called Identity Map Codec, which is meant to support a use case like ours while keeping the netflow plugin a codec. I haven't had time to dig further into Identity Map Codec though. It will also require an update to logstash-input-udp.

As I understand it, we should be passing src_ip+port as "identity" to the Identity Map Codec, which will clone a codec for each src_ip+port combination. This will take care of keeping Netflow templates separate and prevent template corruption and overwriting as is the case with e.g. issue #21.
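The per-exporter cloning that Identity Map Codec provides can be approximated with a plain hash keyed by identity (a conceptual sketch with made-up class and variable names, not the real IdentityMapCodec API):

```ruby
# Conceptual sketch: one codec instance (and thus one template cache)
# per identity string such as "src_ip:port".
class SketchCodec
  attr_reader :templates
  def initialize
    @templates = {}   # template_id => parsed template definition
  end
end

# Lazily create a codec clone the first time an identity is seen.
codecs = Hash.new { |hash, identity| hash[identity] = SketchCodec.new }

# Two exporters both announce template ID 260, but into separate caches,
# so neither corrupts or overwrites the other's template:
codecs["10.0.0.1:2055"].templates[260] = [:fields_from_router_a]
codecs["10.0.0.2:2055"].templates[260] = [:fields_from_router_b]
```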

It seems we could copy-paste identity_map_codec.rb from logstash-codec-multiline, as it isn't available in the main Logstash repo.

jorritfolmer commented 8 years ago

Great! This issue and issue #21 are gone with identity map codec. It doesn't require modifications to our codec, just a couple of lines in logstash-input-udp. I'll create a PR there next week or so.

jorritfolmer commented 7 years ago

Update: This will be addressed in the migration to logstash-plugins/logstash-input-netflow#1. It will allow us to own the network socket directly and use the necessary IP addresses as a key. Probably 2017 though.

ssmlt commented 6 years ago

Hello, is there any update regarding this issue and any plans to release logstash-input-netflow? Thanks.

jorritfolmer commented 6 years ago

The issue is still there. logstash-input-netflow is still waiting for my initial PR to be merged.

joriws commented 6 years ago

I see udp.rb (v6.2) has changed, so the metadata patch above is no longer directly applicable. It worked with 5.5. So the question is: what is the status now with 6.2?

Do we need metadata? I use the latest Elasticsearch and it is decoding a single stream fine; I have not tested with multiple sources yet. The cache save file does not include the IP addresses of the template sender, so I doubt it works?

Please update the front page or README.md with information on how to make templates from multiple exporter IPs work.

humblemumble commented 6 years ago

Hi, could you please provide the metadata udp input patch for the 6.2 version as well?

evilelf666 commented 5 years ago

> (quoting @jorritfolmer's earlier comment with the logstash-input-udp metadata patch and config example, reproduced in full above)

I found your patch for the more recent logstash udp.rb code and applied it to my platform (processing 2500 flows/sec), and it fixed all the bandwidth/pps and invalid-protocol/incoherent-data display issues (mostly Cisco environment, same models) for my environment. I'm running the latest GitHub data dump in Docker... this clearly resolved the reporting from the 7609s and ASRs in my environment.

Thanks a lot for this metadata patch man. This saved me from going bonkers.

robcowart commented 5 years ago

This issue will be addressed once the following PRs are merged and released for the...

Logstash UDP Input: https://github.com/logstash-plugins/logstash-input-udp/pull/46
Logstash Netflow Codec: https://github.com/logstash-plugins/logstash-codec-netflow/pull/187

evilelf666 commented 5 years ago

Thank you @robcowart

As always, your work is very appreciated.

ChrisDeh commented 4 years ago

Is there any news on this? Both referenced PRs are closed but not merged.