irino / softflowd

softflowd: A flow-based network traffic analyser capable of Cisco NetFlow data export software.
https://github.com/irino/softflowd

softflowd sends the same flow twice, and sometimes three times #51

Closed: mcury1985 closed this issue 1 year ago

mcury1985 commented 1 year ago

I'm sending NetFlow data from pfSense (softflowd) to a Graylog server. I'm using Graylog to sum the amount of data passed through an interface, to get a list of top talkers in a given time frame.

I noticed that the sum in Graylog does not match the actual amount of data, because the same flow is sent sometimes twice and sometimes three times. The only fields that differ between these copies sent from softflowd to Graylog are nf_input_snmp, nf_snmp_output and nf_flow_packet_id.

I also reported it as a bug in the pfSense Redmine, https://redmine.pfsense.org/issues/14747, because I'm not sure whether this is a bug in softflowd or in the pfSense softflowd package.

Here is an example of a flow that softflowd sent three times to the Graylog server (screenshot: clipboard-202309041446-etky4).

Is there a way to force each flow to be sent only once? I can't make the Graylog search check a single SNMP input index, since which one appears is completely random.

Thanks

mcury1985 commented 1 year ago

It seems that the problem is related to VLAN interfaces. I've been doing some tests, and if you set softflowd to collect only from non-VLAN interfaces, the problem doesn't happen.

These two interfaces are making softflowd behave strangely: igc1.10 and igc1.20.

In case you need more details, just let me know. Thanks.

irino commented 1 year ago

I don't know the details of pfSense. If the VLAN tracking option (-T vlan) is enabled, flows on different VLANs are distinguished as different flows even if the 5-tuples in the IP packets are the same. I don't know which softflowd options pfSense specifies. Please ask the pfSense community.
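To illustrate the point about VLAN tracking, here is a minimal Python sketch of how a flow key might change when the VLAN ID is part of it. This is not softflowd's code: the `Packet` type, the `flow_key` function, the `track_vlan` parameter and the sample addresses are all hypothetical names and values chosen for the example.

```python
from typing import NamedTuple, Optional, Tuple


class Packet(NamedTuple):
    """Hypothetical, simplified view of a captured packet."""
    src: str
    dst: str
    src_port: int
    dst_port: int
    proto: int
    vlan_id: Optional[int]


def flow_key(pkt: Packet, track_vlan: bool) -> Tuple:
    """Build the key under which a packet is aggregated into a flow.

    Without VLAN tracking the key is the plain 5-tuple. With VLAN
    tracking (an assumption modelled on the -T vlan description above),
    the VLAN ID is appended, so the same 5-tuple seen on two VLANs
    yields two distinct flows.
    """
    five_tuple = (pkt.src, pkt.dst, pkt.src_port, pkt.dst_port, pkt.proto)
    if track_vlan:
        return five_tuple + (pkt.vlan_id,)
    return five_tuple


# Same 5-tuple, observed on two different VLANs (sample values only).
a = Packet("10.0.10.5", "192.168.1.9", 51000, 443, 6, vlan_id=10)
b = Packet("10.0.10.5", "192.168.1.9", 51000, 443, 6, vlan_id=20)

same_without_vlan = flow_key(a, track_vlan=False) == flow_key(b, track_vlan=False)
same_with_vlan = flow_key(a, track_vlan=True) == flow_key(b, track_vlan=True)
```

In this sketch the two packets collapse into one flow only when the VLAN ID is left out of the key.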

mcury1985 commented 1 year ago

Hello irino, thanks for answering.

The command used by pfSense is shown below (the -T option is in use):

[23.05.1-RELEASE][root@pfsense.home.arpa]/root: ps aux | grep soft
nobody 93788 0.0 0.1 13440 4344 - Is 16:44 0:03.51 /usr/local/bin/softflowd -i 1:igc0 -n 192.168.255.253:2055 -v 9 -T full -A sec -p /var/run/softflowd.igc0.pid -c /var/run/softflowd.igc0.ctl
nobody 94010 0.0 0.1 13440 4332 - Is 16:44 0:07.05 /usr/local/bin/softflowd -i 2:igc1.10 -n 192.168.255.253:2055 -v 9 -T full -A sec -p /var/run/softflowd.igc1.10.pid -c /var/run/softflowd.igc1.10.ctl
root 59414 0.0 0.1 12768 2428 0 S+ 11:12 0:00.00 grep soft

I disabled softflowd on interfaces igc1 and igc1.20, and now I'm only collecting data from igc0 and igc1.10. With this setup, I'm able to collect correct data from NetFlow.

Traffic between igc0 and igc1.10 is doubled because pfSense exports it from both interfaces. To work around that, I filter inter-VLAN traffic to nf_input_snmp:1 only, ignoring the repeated data from nf_input_snmp:2.

Traffic from these interfaces to WAN (no longer inter-VLAN) doesn't need any tweaking; it works perfectly since I'm not collecting data from WAN interfaces.
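The same idea can be sketched outside Graylog. The Python example below is only an illustration under stated assumptions: it takes already-parsed flow records as dictionaries, groups them by 5-tuple, and for each conversation counts the bytes reported by a single interface index (the lowest one that saw it). Only nf_input_snmp comes from this thread; the other field names (`src`, `dst`, `src_port`, `dst_port`, `proto`, `bytes`), the function name `top_talkers` and the sample records are hypothetical.

```python
from collections import defaultdict


def top_talkers(records):
    """Sum bytes per source address, counting each conversation once.

    For every 5-tuple, bytes are totalled separately per nf_input_snmp
    value, and only the total from the lowest interface index that saw
    the conversation is kept. Traffic seen on a single interface is
    therefore counted as-is, while traffic seen on two interfaces is
    not doubled.
    """
    per_flow = defaultdict(lambda: defaultdict(int))
    for r in records:
        key = (r["src"], r["dst"], r["src_port"], r["dst_port"], r["proto"])
        per_flow[key][r["nf_input_snmp"]] += r["bytes"]

    totals = defaultdict(int)
    for key, bytes_by_interface in per_flow.items():
        chosen = min(bytes_by_interface)  # e.g. prefer nf_input_snmp 1 over 2
        totals[key[0]] += bytes_by_interface[chosen]

    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample records: one inter-VLAN conversation exported from both
# interfaces, one conversation to the WAN seen only on interface 2, and
# one conversation seen only on interface 1.
sample = [
    {"src": "10.0.10.5", "dst": "192.168.1.9", "src_port": 51000,
     "dst_port": 443, "proto": 6, "nf_input_snmp": 1, "bytes": 1000},
    {"src": "10.0.10.5", "dst": "192.168.1.9", "src_port": 51000,
     "dst_port": 443, "proto": 6, "nf_input_snmp": 2, "bytes": 1000},
    {"src": "10.0.10.5", "dst": "203.0.113.7", "src_port": 51001,
     "dst_port": 443, "proto": 6, "nf_input_snmp": 2, "bytes": 500},
    {"src": "192.168.1.9", "dst": "192.168.1.1", "src_port": 40000,
     "dst_port": 53, "proto": 17, "nf_input_snmp": 1, "bytes": 200},
]

ranking = top_talkers(sample)
```

One caveat of this approach: it assumes both interfaces report the same byte count for a shared conversation. If export timing differs between the processes, the totals for a given window may not match exactly.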

mcury1985 commented 1 year ago

I think the problem happens when you listen on the parent interface along with VLANs inside of it.

MGMT - igc1
WIFI - igc1.10
LAN - igc0

Connections from WIFI (igc1.10) to LAN (igc0) were being collected sometimes three times and sometimes twice, seemingly at random.

As soon as I disabled the parent interface (igc1) and listened only on igc1.10 and igc0, I stopped getting the same NetFlow data three times. I now get it only twice, and only for inter-VLAN traffic, which is expected since pfSense collects from both interfaces.

irino commented 1 year ago

nobody 93788 0.0 0.1 13440 4344 - Is 16:44 0:03.51 /usr/local/bin/softflowd -i 1:igc0 -n 192.168.255.253:2055 -v 9 -T full -A sec -p /var/run/softflowd.igc0.pid -c /var/run/softflowd.igc0.ctl
nobody 94010 0.0 0.1 13440 4332 - Is 16:44 0:07.05 /usr/local/bin/softflowd -i 2:igc1.10 -n 192.168.255.253:2055 -v 9 -T full -A sec -p /var/run/softflowd.igc1.10.pid -c /var/run/softflowd.igc1.10.ctl

Two softflowd processes are running. Each process aggregates packets into flows independently, so you see separate flows whose 5-tuples are the same. The -i option (interface option) depends on the pcap library. If you don't specify an interface name, softflowd collects packets from all interfaces. This behavior depends on the pcap library, so I cannot change it.
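The effect of two independent processes can be sketched with a toy model in Python. This is not softflowd's implementation: the `FlowCache` class, its methods, the `simulate` function and the sample packet sizes are hypothetical. The interface indexes 1 and 2 mirror the -i 1:igc0 and -i 2:igc1.10 arguments in the process list above.

```python
from collections import defaultdict


class FlowCache:
    """Toy stand-in for one flow-exporting process bound to one interface."""

    def __init__(self, if_index):
        self.if_index = if_index
        self.flows = defaultdict(lambda: {"packets": 0, "bytes": 0})

    def observe(self, five_tuple, size):
        # Aggregate the packet into the flow identified by its 5-tuple.
        entry = self.flows[five_tuple]
        entry["packets"] += 1
        entry["bytes"] += size

    def export(self):
        # Each cache exports its own records, tagged with its own index.
        return [{"if_index": self.if_index, "five_tuple": key, **counters}
                for key, counters in self.flows.items()]


def simulate():
    lan = FlowCache(if_index=1)    # models the process on igc0
    wifi = FlowCache(if_index=2)   # models the process on igc1.10

    # One routed conversation crosses both interfaces, so both caches
    # see every packet and neither knows about the other.
    conversation = ("10.0.10.5", "192.168.1.9", 51000, 443, 6)
    for size in (1500, 1500, 400):
        wifi.observe(conversation, size)
        lan.observe(conversation, size)

    # The collector receives the union of both exports.
    return lan.export() + wifi.export()


exported = simulate()
```

In this model the collector ends up with two records for the same 5-tuple, differing only in the interface index, and a naive sum over them reports twice the bytes actually transferred.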

mcury1985 commented 1 year ago

Thank you irino for the insight on this matter. I'll keep collecting from igc0 and igc1.10 only and use the workaround (filter by nf_input_snmp:1 for intervlan traffic) for now.

Much appreciated.