phaag / nfdump

Netflow processing tools
Other
765 stars 201 forks source link

pcap import does not result in correct/full precision timestamps in nfdump output. #302

Closed kgardas closed 3 years ago

kgardas commented 3 years ago

Hello,

while testing nfdump on pcap files produced by tshark 3.2.3 (provided by ubuntu repo, see below) I've discovered that although pcap files contains full packets timestamps (verified opening pcap in wireshark 3.2.3 provided by ubuntu repo) the output of nfdump is not able to show them. The result is that then some packets are printed with the same timestamp and additionally also packets output is not well ordered following timestamps (following is from nfdump -R dir | grep "10.0.10" | less copy & paste so do not be confused by short output):

      1 2021-08-17 10:02:56.380     3.139 TCP         10.0.10.41:1099  ->        10.0.30.5:38317        5      206     1
      2 2021-08-17 10:02:57.461     9.962 TCP          10.0.30.5:8080  ->       10.0.10.41:56540        7     7126     1
      3 2021-08-17 10:02:56.376    11.042 TCP          10.0.30.5:38317 ->       10.0.10.41:1099         7      435     1
      4 2021-08-17 10:02:57.572    10.837 TCP          10.0.30.5:2000  ->       10.0.10.41:56319       46   108205     1
      5 2021-08-17 10:02:57.461     9.962 TCP         10.0.10.41:56540 ->        10.0.30.5:8080        10      509     1
      6 2021-08-17 10:02:57.572    10.838 TCP         10.0.10.41:56319 ->        10.0.30.5:2000        60     4989     1
      7 2021-08-17 10:02:56.380     3.139 TCP         10.0.10.41:1099  ->        10.0.30.5:38317        5      206     1
      8 2021-08-17 10:02:57.461     9.962 TCP          10.0.30.5:8080  ->       10.0.10.41:56540        7     7126     1
      9 2021-08-17 10:02:56.376    11.042 TCP          10.0.30.5:38317 ->       10.0.10.41:1099         7      435     1
     10 2021-08-17 10:02:57.572    10.837 TCP          10.0.30.5:2000  ->       10.0.10.41:56319       46   108205     1
     11 2021-08-17 10:02:57.461     9.962 TCP         10.0.10.41:56540 ->        10.0.30.5:8080        10      509     1
     12 2021-08-17 10:02:57.572    10.838 TCP         10.0.10.41:56319 ->        10.0.30.5:2000        60     4989     1

shows that the packets ordering is wrong. It should be (describing by less lines): 3, 1, 2 -- considering only first 3 lines to not complicate things more. This is how actual invocations go. Also there is a problem with not enough precision in nfdump provided timestamps. E.g. packets on lines: 3 and 9 do have the same time and this also applies to packets on lines 8 and 11, but they are different and in fact happened in tight sequence hence more precise timestamps are needed here to actually detect packets ordering.

Or am I missing some magic command line/configuration option here which would correct nfdump behavior? Please let me know or let me know what else should I provide to better debug this issue. My pcapng file even xz compressed is 75MB long and it may be provided if there is anybody interested in using it for further debugging. Thanks!

Version: nfdump from git obtained 2021-08-13 Configured with: ./configure --enable-readpcap --enable-nfpcapd --prefix= Platform: Ubuntu 20.04 with GCC 7,8,9 and 10 and Clang 9 and 10 installed (all provided by Ubuntu repositories) Compiled with: it looks like configure preferred clang and the default is 10 version here. Run with:

$ nfpcapd -r <file name>.pcap -l <dir name>
$ nfdump -R <dir name>
phaag commented 3 years ago

Hi, There are a few things to note here:

Timestamps

Netflow v5/v9 in general used to be defined in msec resolution. Exporters (routers etc.) send them in msec resolution. This means, that internally the data format of nfdump since ever used msec resolution. Yes - there are new IPFIX elements such as 156 and 157 today and pcaps also have better solution. Theoretically nfpcapd could deal with them but nfpcapd is a collector introduced later in time and uses the same data format as nfcapd, therefore anything better than msec is discarded.

Ordering

Given, that msec is the best resolution for now, flows may be ordered according timestamps tstart, tend etc. see nfdump(1) for more details. In you example, use nfdump -Otstart -R <dir name> to get an ordered list of flows.

Explanation of ordering

Nfpcapd needs to build an internal flow cache where each packet is accounted to the matching flow. Each packet is accounted in the order it shows up on the wire e.g. read from pcap. There are now several criteria, when a flow gets exported, which means exported into the data file:

Active/inactive timeout scans run every 10s. Whatever matches is exported. Therefore flows are not stored in the order they start TCP/SYN but when a criteria hits for exporting. In the second case a flow may be broken up into several smaller flows, which can be later aggregated with nfdump -a -R .. in order to get the complete picture.

The strange thing of your output is more that lines 3 and 9 are identical, which I can not explain without having the pcap file.

Feel free to share the pcap with me, if you can put it somewhere on a fileshare for downloading. You can also ping me off list, you find my email in the AUTHORS file.

phaag commented 3 years ago

The only reason I see for the exact doubling of the flows: Did you run nfpcapd twice without cleaning up the flow directory? nfpcapd appends flows to already existing flow files. The reason is offline pcap processing, where processing many pcap files in a row may result in flows within the same time range.

kgardas commented 3 years ago

Hi, thanks a lot for your support over the email. Indeed, I guess not cleaning up the directory was very much likely a cause of the issue which caught your eye. As you have basically explained everything to me, I guess this may be closed now and kept just for the record. Thanks a lot!