Proposal - support of extended sflow format

dbardbar commented 1 year ago

There are several people that asked for extending vflow support of sflow that it'll parse extended sample and counter packets (type 3 and 4). Specifically, @KrunalT, @yangyu66 in issue #154, and @ttrading in issue #125

At my place of work we also need to support this. Some work was done internally at my place of work, but it is not fully polished, and while working on the code, saw several issues with the parsing done today, which need to be solved before adding support for type 3 and 4.

Issue 1 is that in compact format of sflow sample, the source id is today read as the first 8 bits, which is wrong. This is explained in more detail in issue #178 In extended format, after issue #178 is fixed, then the code change is pretty straight-forward. In compact format read the first 1 byte and cast to 4 bytes as the source id format, and read the remaining 3 bytes and cast to 32-bit as the source id value. For extended format just read 4 bytes and 4 bytes, as the source id format and source id value. This applies to both sflow expanded sample and sflow expanded counter.

Issue 2 is that the output interface in sflow sample is read as the whole 32 bits, while the standard says that the first 2 bits are to be parsed as type, and the rest 30 bits are the value. The value should be interpreted as index only if the first 2 bits are b00. For b01 it represents some drop reason, and for b10 it represents the number of interfaces the packet was sent to. Sending the whole 32 bits as output interface index is wrong, but at least gives the opportunity for the reader of vflow's output the ability to parse the value, as it knows vflow only supports non-extended format. For this to work properly when we add support for extended format, IMHO the most reasonable thing is to change vflow JSON output, so that we no longer give just "Output" as the output interface index, but split it OutputInterfaceType and OutputInterfaceValue. The OutputInterfaceType will be either 2 or 32 bits from the packet (depending on compact/expanded format) and the OutputInterfaceValue will be the next 30/32 bits from the packet (depending on compact/expanded format). This is a breaking change, but I think it is reasonable.

The input interface is also split as 4+4 bytes (compared to 2/30 bits in compact format), but the standard says that the first 4 bytes (or first 2 bits) are always 0, and the remaining 4 bytes always represent the input interface index. This requires minor adjustment of the code to just skip over the first 4 bytes.

Input from the community is appreciated, both regarding the proposed code / JSON format change, and how you today handle the problematic behaviour of vflow for compact formats for the source id and the output interface.

akshah commented 1 year ago

Hey @dbardbar , can you please link the documentation/rfc that describe the parsing of these 32bits

dbardbar commented 1 year ago

@akshah - following our conversation - references for encoding:

For the expanded format - AFAIK this is not covered in any RFC, but is explained in the the sflow version 5 standard - https://sflow.org/sflow_version_5.txt

1) SourceID encoding in expanded format as 32/32 bit - see page 31 (sflow_data_source_expanded) 2) Interface encoding in expanded format as 32/32 bit - see page 31 (interface_expanded)

For the compact format, this is covered in RFC 3176 - https://www.rfc-editor.org/rfc/rfc3176

1) SourceID encoding as 8/24 bit composite field - see page 20 (struct flow_sample) and page 24 (struct counters_sample).

2) Output interface encoding as 2/30 bit composite field - see page 20 (search for "unsigned int output")

3) Input interface encoding as 2/30 bit composite field, but actually the top 2 bits are always 0 - that's not in the RFC, but is explained in the sflow5 standard (https://sflow.org/sflow_version_5.txt) , page 27, at the bottom, where it says "Note: Formats 1 & 2"

dbardbar commented 1 year ago

Also, attaching sflow pcap in expanded and compact format. Use wireshark to see how it parses the fields, especially look at the hex output, and see that it shows which bytes were used to obtain each element, once you click it. pcaps.zip

Edgio / vflow

Proposal - support of extended sflow format #179