phaag / nfdump

Netflow processing tools

IPFIX (V10) support for dot1q VLAN IDs #515

Closed jbemmel closed 5 months ago

jbemmel commented 5 months ago

Hi,

As stated in another issue, nfdump/nfcapd works mostly fine for IPFIX (V10) flow reports. However, it is still lacking some fields:

- IP version: 60
- Ingress Physical Interface: 252
- Egress Physical Interface: 253
- Dot1q VLAN ID: 243
- Dot1q Customer VLAN ID: 245
- Post Dot1q VLAN ID: 254
- Post Dot1q Customer VLAN ID: 255

The dot1q fields were added to v9 in 2019, but not for v10. See https://www.rfc-editor.org/rfc/rfc7133.html#page-26 for specs
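To make the request concrete, here is a minimal Python sketch that checks whether an IPFIX template record advertises any of the elements listed above. The template record layout (template ID, field count, then field specifiers with an optional enterprise number) follows RFC 7011. The element names are the IANA names as I understand them, and the helper names `template_fields` and `l2_elements` are hypothetical, not part of nfdump.

```python
import struct

# Information element IDs named in this issue (names assumed from the IANA registry).
WANTED = {
    60: "ipVersion",
    243: "dot1qVlanId",
    245: "dot1qCustomerVlanId",
    252: "ingressPhysicalInterface",
    253: "egressPhysicalInterface",
    254: "postDot1qVlanId",
    255: "postDot1qCustomerVlanId",
}


def template_fields(record: bytes):
    """Parse one IPFIX template record into
    (template_id, [(element_id, length, enterprise_number_or_None), ...])."""
    template_id, count = struct.unpack_from("!HH", record, 0)
    offset, fields = 4, []
    for _ in range(count):
        raw_id, length = struct.unpack_from("!HH", record, offset)
        offset += 4
        enterprise = None
        if raw_id & 0x8000:  # enterprise bit set: a 4-byte enterprise number follows
            (enterprise,) = struct.unpack_from("!I", record, offset)
            offset += 4
        fields.append((raw_id & 0x7FFF, length, enterprise))
    return template_id, fields


def l2_elements(fields):
    """Return {name: length} for the IANA elements from this issue found in a template."""
    return {WANTED[i]: n for i, n, pen in fields if pen is None and i in WANTED}


# Hypothetical template 256: sourceIPv4Address (8), dot1qVlanId (243), ipVersion (60)
example = struct.pack("!HH", 256, 3) + struct.pack("!HHHHHH", 8, 4, 243, 2, 60, 1)
print(l2_elements(template_fields(example)[1]))
```

A collector that does not know an element ID can still skip it using the length from the template, which is why the missing fields were silently dropped rather than causing parse errors.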

The attached pcap trace (ipfix-2055.zip) contains a sample with L2 MAC addresses and VLAN tag 100 for a simple ICMP flow, as produced by Nokia SR OS.


jbemmel commented 5 months ago

See https://github.com/phaag/nfdump/commit/ecf9b12ce55908930c659bcd44cdbc1b584ee025 for v9 support

phaag commented 5 months ago

The implementation of the new fields takes a bit more time.

phaag commented 5 months ago

@jbemmel - I see there could be a need for all of these except IP version 60. You already have that implicitly, since the IP addresses are either IPv4 or IPv6. Can you elaborate on the use case for it, so I can understand your needs?

phaag commented 5 months ago

@jbemmel - Could you please specify what processing you need with the new elements?

jbemmel commented 5 months ago

See https://github.com/jbemmel/IXP-Manager/blob/add-goflow2-support/tools/runtime/sflow/goflow2-to-rrd-handler#L131

My use case is populating IXP Manager similar to how sflowtool is being used. I was trying to use goflow2 but it does not support V10 IPFIX

The IP version could possibly be inferred from the ipv4/v6 address, but the data is there and it's simpler to just take that byte. It's less critical than the missing VLAN fields

phaag commented 5 months ago

I added an ip_version tag to the json output format. Just tell me in which nfdump output you need this information. A single byte is pretty much a waste of space in the extension backend. If I can provide that information some other way, that would be a better solution.

The VLAN fields as well as the physical interface IDs are integrated in the latest master repo and accepted for v9 and IPFIX.

jav4 commented 5 months ago

Some devices (e.g. Cisco) don't populate the IP version field while capturing Ethernet traffic (layer 2). I've submitted a PR to add the ethertype field (256) for that same use case.

phaag commented 5 months ago

I have seen the PR. The point is the same as for the IP version: just 2 bytes for a backend extension is pretty much a waste of space. I could rearrange the code so that all the physical elements (phys ID, ether type and dot1q) go into the same extension. So please do not use the master branch yet, as the code will change for that.

phaag commented 5 months ago

PR resolved and integrated in the layer2 extension. Please check whether this works for you.

jbemmel commented 5 months ago

There is an issue with this PR (not sure if it was present before)

jeroen@jvb-vm:~/srlinux/nfdump$ nfdump -r /tmp/nfcapd.202404011450 -o raw

Flow Record: 
  RecordCount  =                 1
  Flags        =              0x02 NETFLOW v10, Sampled
  Elements     =                10: 1 2 3 8 9 12 15 36 38 39 
  size         =               220
  engine type  =                 0
  engine ID    =                 0
  export sysid =                 1
  first        =     1711983095784 [2024-04-01 14:51:35.784]
  last         =     1711983095784 [2024-04-01 14:51:35.784]
  received at  =     1711983171593 [2024-04-01 14:52:51.593]
  proto        =                58 ICMP6
  tcp flags    =              0x00 ........
  ICMP         =              133.0  type.code
  in packets   =                 1
  in bytes     =                78
  src addr     =           0.0.0.0
  dst addr     =           0.0.0.0
  src addr     =  fe80::a8c1:abff:fe2b:2f14
  dst addr     =           ff02::2
  bgp next hop =           0.0.0.0
  bgp next hop =                ::
  ip exporter  =      172.30.201.3
  in src mac   = aa:c1:ab:2b:2f:14
  out dst mac  = 00:00:00:00:00:00
  in dst mac   = 33:33:00:00:00:02
  out src mac  = 00:00:00:00:00:00
  ingress VRF  =                 1
  egress VRF   =                 0
  vlanID       =               100
  post vlanID  =                 0
  custID       =                 0
  post custID  =                 0
  ingress IFid =        1610899521
  egress IFid  =        4294967295
jeroen@jvb-vm:~/srlinux/nfdump$ nfdump -r /tmp/nfcapd.202404011450 -o json
[
{
    "type" : "FLOW",
    "export_sysid" : 1,
    "ip_version" : "4",
    "src4_addr" : "0.0.0.0",
    "dst4_addr" : "0.0.0.0",
    "src_geo" : "",
    "dst_geo" : "",
    "bgp4_next_hop" : "0.0.0.0",
    "ip4_router" : "172.30.201.3",
    "in_src_mac" : "aa:c1:ab:2b:2f:14",
    "out_dst_mac" : "00:00:00:00:00:00",
    "in_dst_mac" : "33:33:00:00:00:02",
    "out_src_mac" : "00:00:00:00:00:00",
    "ingress_vrf" : "1",
    "egress_vrf" : "0",
    "vlanID" : 100,
    "post_vlanID" : 0,
    "cust_vlanID" : 0,
    "post_cust_vlanID" : 0,
    "sampled" : 1
}]

The 'json' output format wrongly picks IPv4 for an IPv6 flow

phaag commented 5 months ago

Well - it's not wrong either. Your exporter sends both IP address record types, IPv4 and IPv6. The json output assumes that if v4 is present, it prints v4. There is an ambiguity as to which IP version is correct. I could print both if both are present, but this may violate the unique key of the json record. In the sample pcap you have records with both IPv4 and IPv6 where neither is filled and the IP version is set to 0. It's a bit difficult with ambiguous data.

jav4 commented 5 months ago

Aggregation (-A) doesn't work with any option now:

nfdump -r nfcapd.202404011805 -o fmt:'%eth %ismc %idmc %byt' -A insrcmac
Number of workers should not be greater than number of cores online. 4 is > 1
Date first seen Duration In src MAC Addr Packets Bytes bps Bpp Flows
Summary: total flows: 1930, total bytes: 340.2 M, total packets: 539020, avg bps: 4.9 M, avg pps: 976, avg bpp: 631
Time window: 2024-04-10 01:44:11 - 2024-04-10 01:53:23
Total flows processed: 1930, passed: 1930, Blocks skipped: 0, Bytes read: 332136
Sys: 0.0109s User: 0.0000s Wall: 0.0035s flows/second: 545375.4 Runtime: 0.0051s

Without aggregation, it works fine. This specific file has >20 source MAC addresses:

nfdump -r nfcapd.202404011805 -o fmt:'%ismc' | sort -u | wc -l

phaag commented 5 months ago

I cannot confirm this. Please note: combining -A insrcmac with any -o fmt... does not make sense. If you aggregate on the in src MAC address, only that element is available after the aggregation. You therefore get output like:

% src/nfdump/nfdump -r tmp -A insrcmac
Date first seen         Duration            In src MAC Addr   Packets    Bytes      bps    Bpp Flows
2011-03-12 20:32:52.175     02:58:03.561  00:0f:1f:a7:92:88       123    19969       14    162    29
2011-03-12 20:31:31.754     02:48:04.635  00:0c:29:fc:ba:8d     18527    4.5 M     3578    243  1225
2011-03-12 20:27:24.365     00:38:45.815  00:18:8b:ae:44:6f      3697   313514     1078     84  1683
2011-03-12 20:27:31.015     03:05:39.417  00:16:d3:4b:07:0d     35800    1.7 M     1195     46   200
2011-03-12 20:27:25.360     03:05:45.906  00:16:47:9d:f2:d4     96241    9.4 M     6767     97  9507
2011-03-12 20:27:27.641     03:05:40.800  00:16:47:9d:f2:d3     40590    3.6 M     2602     89 28809
2011-03-12 20:27:24.120     03:05:46.947  00:90:f5:3f:40:cd     54363   21.2 M    15235    390  2913
2011-03-12 20:27:26.512     03:04:53.552  00:0b:97:de:20:b2     23623    1.2 M      835     49 18876
2011-03-12 20:29:20.725     02:27:46.392  00:1c:23:49:30:74       175    13360       12     76   157
2011-03-12 20:30:45.370     02:58:54.373  00:0c:29:43:36:e1        17     3893        2    229    17
2011-03-12 20:27:24.142     03:05:46.847  00:26:9e:83:a2:30    131387   11.0 M     7889     83 10847
2011-03-12 20:52:00.032     02:13:40.541  00:0c:29:ba:7f:34       287    19065       19     66   213
2011-03-12 20:27:24.348     03:05:46.259  00:16:47:9d:f2:d0    208833   18.8 M    13478     89 161010
2011-03-12 20:45:29.049     02:33:21.910  00:24:7e:6b:94:9a    727096   33.6 M    29194     46 695087
2011-03-12 20:27:24.121     03:05:16.392  00:23:df:97:4e:12     36643    5.5 M     3971    150   384

nfdump will ignore the -o fmt.. argument.

If you only aggregate one element you can also use

% src/nfdump/nfdump -r tmp -s insrcmac
Top 10 In Src Mac ordered by flows:
Date first seen             Duration     Proto        In Src Mac    Flows(%)     Packets(%)       Bytes(%)         pps      bps   bpp
2011-03-12 20:27:24.800     03:05:45.798 any   00:24:e8:cb:85:80    1.7 M(38.1)    1.8 M(18.1)   85.7 M( 2.1)      161    61482    47
2011-03-12 20:45:29.049     02:33:21.910 any   00:24:7e:6b:94:9a   695087(15.7)   727096( 7.3)   33.6 M( 0.8)       79    29194    46
2011-03-12 20:27:24.118     03:05:46.711 any   00:16:47:9d:f2:ce   508649(11.5)    3.1 M(31.2)    3.5 G(85.1)      277    2.5 M  1117
2011-03-12 20:29:13.697     03:03:57.592 any   00:1c:23:9e:fb:86   207625( 4.7)   256755( 2.6)   15.6 M( 0.4)       23    11298    60
2011-03-12 20:28:34.154     03:04:23.745 any   58:b0:35:f8:34:2c   176531( 4.0)   370041( 3.7)   21.9 M( 0.5)       33    15820    59
2011-03-12 20:27:24.348     03:05:46.259 any   00:16:47:9d:f2:d0   161010( 3.6)   208833( 2.1)   18.8 M( 0.5)       18    13478    89
2011-03-12 20:27:24.206     03:05:47.077 any   00:0c:29:be:c9:5d   148113( 3.3)   321594( 3.2)   48.2 M( 1.2)       28    34586   149
2011-03-12 20:27:43.657     02:23:06.893 any   00:0c:29:2a:a0:08   126891( 2.9)   143870( 1.4)    7.0 M( 0.2)       16     6482    48
2011-03-12 20:27:27.532     03:05:43.754 any   00:16:47:9d:f2:d1   110142( 2.5)   327762( 3.3)   51.4 M( 1.3)       29    36885   156
2011-03-12 20:27:24.150     03:05:46.917 any   00:16:47:9d:f2:cf    98304( 2.2)   314931( 3.2)   66.3 M( 1.6)       28    47570   210
Summary: total flows: 4433024, total bytes: 4.1 G, total packets: 9.9 M, avg bps: 2.9 M, avg pps: 891, avg bpp: 409
Time window: 2011-03-12 20:27:24 - 2011-03-12 23:33:11
Total flows processed: 4433024, passed: 4433024, Blocks skipped: 0, Bytes read: 646173196
Sys: 0.0508s User: 0.2055s Wall: 0.1599s flows/second: 27719768.2 Runtime: 0.1606s

which gives you the statistics for the MAC addresses, which you can order by flows, packets or bytes.

Therefore:

% src/nfdump/nfdump -r tmp -q -A insrcmac | wc -l
      77

which is identical to

% src/nfdump/nfdump -r tmp -s insrcmac -n 0 -q | wc -l
      77

and

% src/nfdump/nfdump -r tmp -o fmt:'%ismc' -q | sort -u | wc -l
      77

The data in this test is the public pcap https://download.netresec.com/pcap/maccdc-2011/maccdc2011_00013_20110312202724.pcap.gz converted to netflow.
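The equivalence of the three counts above can be illustrated with a small Python sketch of what aggregating on a single key means conceptually: one output row per distinct key, with counters summed. This is not nfdump's implementation. The record layout and the `packets` and `bytes` field names are assumptions for illustration; only `in_src_mac` is taken from the json output shown earlier in this thread.

```python
from collections import defaultdict


def aggregate(flows, key="in_src_mac"):
    """Collapse flow records into one row per distinct value of `key`."""
    totals = defaultdict(lambda: {"flows": 0, "packets": 0, "bytes": 0})
    for flow in flows:
        row = totals[flow[key]]
        row["flows"] += 1
        row["packets"] += flow["packets"]
        row["bytes"] += flow["bytes"]
    return dict(totals)


# Made-up sample records (MAC addresses borrowed from the listing above).
flows = [
    {"in_src_mac": "00:0f:1f:a7:92:88", "packets": 3, "bytes": 300},
    {"in_src_mac": "00:0c:29:fc:ba:8d", "packets": 1, "bytes": 60},
    {"in_src_mac": "00:0f:1f:a7:92:88", "packets": 2, "bytes": 120},
]

rows = aggregate(flows)
# The number of aggregated rows equals the number of distinct keys,
# which is what `sort -u | wc -l` counts on the unaggregated output.
assert len(rows) == len({f["in_src_mac"] for f in flows})
```

Because only the key and the summed counters survive aggregation, a per-flow format string has nothing to print for the other fields.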

phaag commented 5 months ago

If you think your flow file does not produce the correct results, feel free to send it to me by email.

jbemmel commented 5 months ago

> Well - it's not wrong either. Your exporter sends both IP address record types, IPv4 and IPv6. The json output assumes that if v4 is present, it prints v4. There is an ambiguity as to which IP version is correct. I could print both if both are present, but this may violate the unique key of the json record. In the sample pcap you have records with both IPv4 and IPv6 where neither is filled and the IP version is set to 0. It's a bit difficult with ambiguous data.

This is not intended as criticism, I'm only sharing how devices behave in my world and the consequences of that. In my case, the exporter includes the IP version field in its report - so I think the JSON formatter should either dump all fields, or if the IP version field is present it should pick the corresponding src/dst address fields (instead of assuming v4)

Noting that it may be possible to have non-IP flows - I have seen reports with IP version "0"
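The selection rule proposed here could be sketched in Python as follows. This is only an illustration of the suggestion, not nfdump's formatter. The keys `src4_addr`, `dst4_addr` and `ip_version` appear in the json output earlier in this thread; `src6_addr` and `dst6_addr` are assumed by analogy, and `pick_addresses` is a hypothetical helper name.

```python
import ipaddress

# Address field names per IP version (the v6 names are an assumption).
FIELDS = {4: ("src4_addr", "dst4_addr"), 6: ("src6_addr", "dst6_addr")}


def pick_addresses(rec):
    """Return (version, src, dst) for a flow record dict, or None if no
    usable IP addresses are found (for example a non-IP flow)."""
    version = int(rec.get("ip_version", 0) or 0)

    # 1. Trust the exporter's IP version field when it names a present family.
    if version in FIELDS:
        src_key, dst_key = FIELDS[version]
        if src_key in rec and dst_key in rec:
            return version, rec[src_key], rec[dst_key]

    # 2. Otherwise take the first family whose addresses are not both unspecified.
    for family, (src_key, dst_key) in FIELDS.items():
        if src_key in rec and dst_key in rec:
            src = ipaddress.ip_address(rec[src_key])
            dst = ipaddress.ip_address(rec[dst_key])
            if not (src.is_unspecified and dst.is_unspecified):
                return family, rec[src_key], rec[dst_key]

    return None


# Record shaped like the flow above: both families present, only v6 filled.
record = {
    "ip_version": 6,
    "src4_addr": "0.0.0.0",
    "dst4_addr": "0.0.0.0",
    "src6_addr": "fe80::a8c1:abff:fe2b:2f14",
    "dst6_addr": "ff02::2",
}
print(pick_addresses(record))
```

With this rule a record that carries IP version 0 and only unspecified addresses yields `None`, which leaves room for the non-IP flows mentioned above.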

jav4 commented 5 months ago

> If you think, your flow file does not produce the correct results, feel free to send it to me by email ..

Done.

Looks weird:

nfdump -r tmp -A insrcmac

Doesn't work.

nfdump -r tmp -s insrcmac

Works.

phaag commented 5 months ago

> > Well - it's not wrong either. Your exporter sends both IP address record types, IPv4 and IPv6. The json output assumes that if v4 is present, it prints v4. There is an ambiguity as to which IP version is correct. I could print both if both are present, but this may violate the unique key of the json record. In the sample pcap you have records with both IPv4 and IPv6 where neither is filled and the IP version is set to 0. It's a bit difficult with ambiguous data.
>
> This is not intended as criticism, I'm only sharing how devices behave in my world and the consequences of that. In my case, the exporter includes the IP version field in its report - so I think the JSON formatter should either dump all fields, or if the IP version field is present it should pick the corresponding src/dst address fields (instead of assuming v4)
>
> Noting that it may be possible to have non-IP flows - I have seen reports with IP version "0"

No worries! It's important that I understand the needs. However, it looks like that field, as implemented, is not useful. I will remove it and check whether there is another option.

phaag commented 5 months ago

Thanks! - The flow aggregation is fixed. The bug was triggered when no IPv4 or IPv6 block was present. It is fixed now in the master repo.

phaag commented 5 months ago

@jbemmel - as you opened the issue, could you confirm whether the current master works for you?

jbemmel commented 5 months ago

> @jbemmel - as you opened the issue, could you confirm, if the current master would work for you?

[two screenshots attached]

Not directly related to the original FR, but the 'json' output still selects the wrong addresses to include in its output. Not sure what fields are present in the original packet - I did not do a packet capture - but type 'ICMP6' should trigger selection of ipv6 addresses

phaag commented 5 months ago

Please make sure to skip old files. Since the merge, extension 39 no longer exists. Please test with newly collected data and delete the old files.

phaag commented 5 months ago

The json output is fixed. You now see all available extensions.

phaag commented 5 months ago

It seems that the implementation works, so I am closing this ticket. Feel free to re-open it in case of problems.