netsampler / goflow2

High performance sFlow/IPFIX/NetFlow Collector
BSD 3-Clause "New" or "Revised" License

Detect missing flows using NetFlow5/NetFlow9/IPFIX Sequence Number #116

Closed AlexandreYang closed 1 year ago

AlexandreYang commented 1 year ago

Hi,

We are using goflow2 as a library and are looking for a way to detect missing flows using the NetFlow5/NetFlow9/IPFIX Sequence Number.

It seems there is currently no easy, reliable way to detect whether sFlow/NetFlow/IPFIX packets have been dropped between the device/exporter sending the flows and the goflow2 collector.

Should goflow2 provide a way to monitor/report missing packets using the Sequence Number?


Sequence Number behaviour is a bit different for NetFlow 5 and NetFlow 9:

Sequence Number meaning in NetFlow9: Incremental sequence counter of all export packets sent by this export device; this value is cumulative, and it can be used to identify whether any export packets have been missed. Note: This is a change from the NetFlow Version 5 and Version 8 headers, where this number represented "total flows." nfdump example: https://github.com/phaag/nfdump/blob/28ad878ac807e82fb95a77df6fc9b98000bcc81c/bin/netflow_v9.c#L2094-L2106

Sequence Number in NetFlow 5, 6, 7, and Version 8: The sequence number is equal to the sequence number of the previous datagram plus the number of flows in the previous datagram. After receiving a new datagram, the receiving application can subtract the expected sequence number from the sequence number in the header to derive the number of missed flows. nfdump example: https://github.com/phaag/nfdump/blob/b0c7a5ec2e11a683460b312ba192bc00590c4acd/bin/netflow_v5_v7.c#L382-L395
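
The NetFlow 5 arithmetic described above can be sketched roughly as follows. This is only an illustration: `v5Header` and `missedFlows` are made-up names, not goflow2 or nfdump APIs, and the sketch assumes the two datagrams come from the same exporter and are compared in arrival order.

```go
package main

import "fmt"

// v5Header holds the two NetFlow v5 header fields needed for gap detection.
// The type and its field names are hypothetical, not goflow2 types.
type v5Header struct {
	FlowSequence uint32 // sequence number carried in the datagram header
	Count        uint16 // number of flow records in the datagram
}

// missedFlows returns how many flows were lost between prev and cur,
// assuming cur is the datagram received right after prev.
// uint32 arithmetic wraps, so a counter rollover needs no special case.
func missedFlows(prev, cur v5Header) uint32 {
	expected := prev.FlowSequence + uint32(prev.Count)
	return cur.FlowSequence - expected
}

func main() {
	prev := v5Header{FlowSequence: 1000, Count: 30}
	cur := v5Header{FlowSequence: 1060, Count: 30}
	fmt.Println(missedFlows(prev, cur)) // 30
}
```

Note that a late or duplicated datagram would show up here as a very large "gap", so this only holds if datagrams are processed in order.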


Since the field SequenceNum is part of FlowMessage, I tried to use it to detect whether flows have been dropped, but it does not work well for NetFlow 9.

For NetFlow 5, it should work since SequenceNum represents the total number of flows. We can compare the change in SequenceNum to the actual number of FlowMessage received.

For NetFlow 9, it seems that analysing FlowMessage.SequenceNum might not work, since packets that carry only templates still advance the sequence number but are not reported as FlowMessage. Example: https://github.com/Graylog2/graylog-plugin-netflow/blob/master/src/test/resources/netflow-data/nprobe-netflow9-3.pcap
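
To illustrate the NetFlow 9 problem, here is a small sketch that compares what is visible at the packet level with what is visible when only packets carrying data records are seen. The `exportPacket` type and both helpers are hypothetical and only model the situation; they do not reflect how goflow2 decodes packets.

```go
package main

import "fmt"

// exportPacket is a hypothetical, simplified view of a NetFlow 9 export packet.
type exportPacket struct {
	Sequence    uint32 // header sequence number, one increment per export packet
	DataRecords int    // 0 when the packet carries only templates or options
}

// sequences lists the sequence numbers visible to an observer. With
// onlyWithRecords set, packets without data records are invisible, which
// mimics looking only at per-flow output.
func sequences(pkts []exportPacket, onlyWithRecords bool) []uint32 {
	var seqs []uint32
	for _, p := range pkts {
		if onlyWithRecords && p.DataRecords == 0 {
			continue
		}
		seqs = append(seqs, p.Sequence)
	}
	return seqs
}

// missing counts skipped sequence numbers, assuming in-order arrival.
func missing(seqs []uint32) uint32 {
	var n uint32
	for i := 1; i < len(seqs); i++ {
		n += seqs[i] - seqs[i-1] - 1
	}
	return n
}

func main() {
	// Nothing is lost here; packet 11 simply carries templates only.
	pkts := []exportPacket{{10, 5}, {11, 0}, {12, 3}}
	fmt.Println(missing(sequences(pkts, false))) // 0
	fmt.Println(missing(sequences(pkts, true)))  // 1, a false positive
}
```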

lspgn commented 1 year ago

Hello. No, it shouldn't; GoFlow2 isn't designed to do this. This would require processing the samples in order with a key (template ID, source address), which could have a performance impact. It is preferable to do so in the database, provided the key is stored as well.

AlexandreYang commented 1 year ago

Thanks Louis @lspgn for the quick feedback.

> It is preferable to do so in the database, provided the key is stored as well.

I'm wondering whether it's possible to analyse missing packets at the database level, given that in NetFlow 9 some packets do not contain flow records but still increase the Sequence Number. For example, this packet only contains Data Template, Options Template and Options Data:

[image: screenshot of the packet described above]

We would see gaps in the Sequence Number in the database because of those packets that do not contain flow records.

Maybe I'm missing something 🤔

What do you think?

> This would require processing the samples in order with a key (template ID, source address), which could have a performance impact.

About the key, should it be (Observation Domain ID, source address)?
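
As a rough sketch of what tracking with such a key could look like, assuming IPFIX semantics where the header sequence number counts Data Records: all names below (`streamKey`, `tracker`, `observe`, `mustKey`) are hypothetical and are not part of goflow2. The sketch also assumes messages are observed in order.

```go
package main

import (
	"fmt"
	"net/netip"
)

// streamKey identifies one sequence-number space (hypothetical type).
type streamKey struct {
	Exporter netip.Addr // source address of the exporter
	DomainID uint32     // Observation Domain ID from the message header
}

// mustKey builds a streamKey from a textual address; it panics on bad input.
func mustKey(addr string, domain uint32) streamKey {
	return streamKey{Exporter: netip.MustParseAddr(addr), DomainID: domain}
}

// tracker remembers the expected next sequence number per stream.
type tracker struct {
	next map[streamKey]uint32
}

func newTracker() *tracker {
	return &tracker{next: make(map[streamKey]uint32)}
}

// observe records one message and returns the number of Data Records missed
// since the previous message on the same stream. The first message seen on
// a stream has nothing to compare against and reports 0.
func (t *tracker) observe(k streamKey, seq, dataRecords uint32) uint32 {
	expected, seen := t.next[k]
	t.next[k] = seq + dataRecords // wraps modulo 2^32
	if !seen {
		return 0
	}
	return seq - expected
}

func main() {
	tr := newTracker()
	k := mustKey("192.0.2.1", 1)
	fmt.Println(tr.observe(k, 100, 10)) // 0
	fmt.Println(tr.observe(k, 125, 5))  // 15
}
```

Keeping one counter per (source address, Observation Domain ID) means two domains on the same exporter do not produce false gaps against each other.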

"Incremental sequence counter modulo 2^32 of all IPFIX Data Records sent in the current stream from the current Observation Domain by the Exporting Process." source: https://datatracker.ietf.org/doc/pdf/rfc7119

Packet order could indeed be an issue. nfdump does some sequence error detection here, but it seems to assume that packets arrive in order: https://github.com/phaag/nfdump/blob/28ad878ac807e82fb95a77df6fc9b98000bcc81c/bin/netflow_v9.c#L2094-L2106
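
One possible way to avoid treating reordered packets as huge gaps is to interpret the difference between the received and expected sequence numbers as a signed value. The sketch below is an assumption on my side, not what nfdump or goflow2 do, and `classify` is a made-up name.

```go
package main

import "fmt"

// classify compares a received sequence number with the expected one.
// The difference is read as a signed 32-bit value, so a packet that is
// slightly behind the expected number is reported as "late" instead of
// as a gap of almost 2^32.
func classify(expected, got uint32) (string, uint32) {
	delta := int32(got - expected) // wraps modulo 2^32, then reinterpreted
	switch {
	case delta == 0:
		return "in-order", 0
	case delta > 0:
		return "gap", uint32(delta)
	default:
		return "late", uint32(-delta)
	}
}

func main() {
	fmt.Println(classify(120, 120)) // in-order 0
	fmt.Println(classify(120, 125)) // gap 5
	fmt.Println(classify(120, 110)) // late 10
}
```

A "late" result does not tell whether the packet is a duplicate or fills an earlier gap; deciding that would need additional state per stream.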