logstash-plugins / logstash-codec-netflow

Apache License 2.0

Fortios 5.6.x WAN/LAN directions not separated anymore #138

Closed Sjaak01 closed 6 years ago

Sjaak01 commented 6 years ago

Hi,

I'm not sure if this is actually an issue with the codec but I thought I'd ask anyway.

We have a bunch of Fortigate devices and on OS 5.2.4 all netflow traffic would always record a LAN IP as the source, making it very easy to check how much up/down traffic there is from a certain device.

However, on 5.6.x this suddenly isn't the case anymore, and WAN IPs are now recorded in the source field as well. This makes the data totally useless, as you can't filter data on the source IP anymore.

Is there any way to figure out whether this is the codec (doubt it) or the Fortigate?

jorritfolmer commented 6 years ago

What does Wireshark say?

Sjaak01 commented 6 years ago

Hi Jorrit,

I emailed you about this issue and the application ID on the 11th. I attached a .pcap.

I can see DNS requests that have both the LAN and WAN address logged as the srcaddr. I believe only the LAN address should be logged as source as all DNS requests originate from the Fortigate. Looking at the results I get in Elastic this is indeed the case on the 5.2.4 firmware.

jorritfolmer commented 6 years ago

About the DNS requests:

1) A LAN client queries the Fortigate: 1 flow is generated (src LAN client, dest Fortigate LAN IP)
2) The Fortigate queries a DNS server on the internet: 1 flow is generated (src Fortigate WAN IP, dest DNS server public IP)

To me it looks like the FortiOS 5.2.4 situation was incorrect because you were missing out on flows in step 2.

Sjaak01 commented 6 years ago

Maybe port 53 is a bad example as the Fortigate is handling all the DNS requests. But e.g. port 443 shows the same.

On 5.2.4 src would be a LAN IP, dst would be WAN, and xlate would be the router gateway. That data allows proper graphs in Kibana that exactly show up/down per src (or the other way around) and which LAN IP accessed which WAN IP.

But if the data from 5.2.4 is in fact incorrect, then how would you ever use netflow data to create something similar? The data wouldn't make sense? For example, if I make a sum of in.bytes and out.bytes with the data from the 5.4 firmware, in and out volume is exactly the same. I know this is incorrect because I downloaded a large .iso file.

Another weird thing is that as per the screenshot I emailed, src and dst records are exactly the same when I view the netflow data in Elastic/Kibana's discovery.

jorritfolmer commented 6 years ago

1) The fact that in_bytes equals out_bytes is suspicious, so I wouldn't trust that to sum with.
2) There are 2 flow records for every connection.
3) From the input_snmp and output_snmp you can determine the in/out interfaces, and thus direction.
4) Bytes differ for each flow within the flow-pair, so you can determine up and download volume.

So for example if you have one flow with:

and its opposite flow:

This leads to the conclusion that you've downloaded 340k in this connection.
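The pairing logic described above could be sketched roughly like this in Python. This is only an illustration: the dictionary keys (`src_addr`, `src_port`, `dst_addr`, `dst_port`, `protocol`) are hypothetical names rather than the codec's exact output fields, the interface indexes and byte counts are made up and not taken from the pcap, and it assumes both records of a pair carry the same two endpoints (NAT translation on the Fortigate may break that).

```python
# Hypothetical sample: one connection seen as two flow records.
# Interface index 1 is assumed to be the LAN side, 2 the WAN side.
sample_flows = [
    {"src_addr": "192.168.1.10", "src_port": 50000,
     "dst_addr": "203.0.113.5", "dst_port": 443, "protocol": 6,
     "input_snmp": 1, "output_snmp": 2, "in_bytes": 1200},
    {"src_addr": "203.0.113.5", "src_port": 443,
     "dst_addr": "192.168.1.10", "dst_port": 50000, "protocol": 6,
     "input_snmp": 2, "output_snmp": 1, "in_bytes": 98000},
]


def connection_key(flow):
    """Build a key that is identical for both directions of a connection."""
    a = (flow["src_addr"], flow["src_port"])
    b = (flow["dst_addr"], flow["dst_port"])
    return (flow["protocol"],) + tuple(sorted([a, b]))


def up_down_bytes(flows, lan_ifindex):
    """Sum bytes per connection, split by the interface the flow entered on."""
    totals = {}
    for flow in flows:
        entry = totals.setdefault(connection_key(flow), {"up": 0, "down": 0})
        # A flow entering on the LAN interface is upload, anything else download.
        direction = "up" if flow["input_snmp"] == lan_ifindex else "down"
        entry[direction] += flow["in_bytes"]
    return totals


totals = up_down_bytes(sample_flows, lan_ifindex=1)
```

Only in_bytes is used here, since the in_bytes and out_bytes values in a single record appear to be identical in this data.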

Sjaak01 commented 6 years ago

Thanks for the explanation. The weird thing is that in Elastic the data for src/dst is exactly the same for all values. I don't think they should be the same, so somewhere something is going wrong.

image

jorritfolmer commented 6 years ago

There should be a difference if you only filter for data with input_snmp=8. The example I gave above is from one of the flows in your pcap, so at least for that flow there is a workable difference in the data.

Sjaak01 commented 6 years ago

Jorrit,

Looking at the records in Elastic, in.bytes and out.bytes always have the same value.

image

jorritfolmer commented 6 years ago

You have to look at the flow in the other direction for the reverse bytes. These flows have other inbound and outbound interfaces, so you can distinguish between traffic from lan to wan and vice versa. So select flows with a certain input or output snmp and look at the src/dest then.
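As a rough sketch of that selection, something like the following Python could total upload and download per LAN address by looking at which interface each flow enters or leaves on. The key names `src_addr` and `dst_addr`, the interface indexes, and the byte counts are all hypothetical, and it assumes the destination of a WAN-to-LAN flow is the LAN host's own address rather than a translated one.

```python
from collections import defaultdict

# Hypothetical records: index 1 is assumed to be LAN, index 2 WAN.
host_flows = [
    {"src_addr": "192.168.1.10", "dst_addr": "203.0.113.5",
     "input_snmp": 1, "output_snmp": 2, "in_bytes": 1200},
    {"src_addr": "203.0.113.5", "dst_addr": "192.168.1.10",
     "input_snmp": 2, "output_snmp": 1, "in_bytes": 98000},
    {"src_addr": "192.168.1.11", "dst_addr": "203.0.113.9",
     "input_snmp": 1, "output_snmp": 2, "in_bytes": 300},
    {"src_addr": "203.0.113.9", "dst_addr": "192.168.1.11",
     "input_snmp": 2, "output_snmp": 1, "in_bytes": 4500},
]


def traffic_per_lan_host(flows, lan_ifindex):
    """Total upload/download bytes per LAN address, keyed on interface direction."""
    usage = defaultdict(lambda: {"up": 0, "down": 0})
    for flow in flows:
        if flow["input_snmp"] == lan_ifindex:
            # Entered on the LAN interface: the source is the LAN host.
            usage[flow["src_addr"]]["up"] += flow["in_bytes"]
        elif flow["output_snmp"] == lan_ifindex:
            # Leaves via the LAN interface: the destination is the LAN host.
            usage[flow["dst_addr"]]["down"] += flow["in_bytes"]
    return dict(usage)


per_host = traffic_per_lan_host(host_flows, lan_ifindex=1)
```

The same split can presumably be done in Kibana with two filters, one on the input interface and one on the output interface.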

I'd consider it a bug that in_bytes == out_bytes, but this is an issue in the data we get from the Fortigate.