netsampler / goflow2

High performance sFlow/IPFIX/NetFlow Collector
BSD 3-Clause "New" or "Revised" License

How to identify traffic direction in sFlow information? #317

Open meguoe opened 1 month ago

meguoe commented 1 month ago

How to identify traffic direction in sFlow information?

meguoe commented 1 month ago

elastiflow: This is a configuration fragment used to identify traffic direction in the ElastiFlow project:

```
if [sflow][source_id_type] == 0 {
  if [sflow][source_id_index] == [flow][input_snmp] {
    mutate {
      add_field => { "[flow][direction]" => "ingress" }
    }
  } else if [sflow][source_id_index] == [flow][output_snmp] {
    mutate {
      add_field => { "[flow][direction]" => "egress" }
    }
  } else {
    mutate {
      add_field => { "[flow][direction]" => "undetermined" }
    }
  }
}
```

lspgn commented 1 month ago

Hello, I am not familiar with how this other tool works.

My advice would be to either:

meguoe commented 1 month ago

I noticed that the -mapping parameter can specify fields, and the example file includes flow_direction. However, I'm not quite sure how to use this specifically. How can I add this field in sFlow?

lspgn commented 1 month ago

Hello, the example file includes an example for NetFlow/IPFIX, which are different protocols. To add it to sFlow, you need an out-of-band mechanism that enriches the samples with this information. This is beyond the scope of GoFlow2, so you will need to develop custom tooling. You can also look at akvorado.

meguoe commented 1 month ago
image

"Hi, @lspgn, there are three fields in the sFlow sample, namely Source ID index, Input interface value, and Output interface value. To distinguish the direction of traffic, matching is done based on Source ID index, Input interface value, and Output interface value. I wonder if it's possible to add a 'direction' field. I believe this would be meaningful, especially in traffic analysis scenarios."

lspgn commented 1 month ago

Hi @meguoe, Thank you for the screenshot.

I had a look and it's part of the expanded flow sample (not the regular flow sample), which reduces the number of users this would cover. The doc is a bit unclear to me, so I've posted some snippets from the sFlow spec below that mention the source:

```
/* sFlowDataSource encoded as follows:
     The most significant byte of the source_id is used to indicate the type
     of sFlowDataSource:
        0 = ifIndex
        1 = smonVlanDataSource
        2 = entPhysicalEntry
     The lower three bytes contain the relevant index value. */
```

```
/* Header information for sFlow version 5 datagrams

   The sub-agent field is used when an sFlow agent is implemented on a
   distributed architecture and where it is impractical to bring the
   samples to a single point for transmission.

...

   Each sFlowDataSource must be associated with only one sub-agent. The
   association between sFlowDataSource and sub-agent must remain
   constant for the entire duration of an sFlow session. */
```

```
struct sflow_data_source_expanded {
   unsigned int source_id_type;   /* sFlowDataSource type */
   unsigned int source_id_index;  /* sFlowDataSource index */
}

struct flow_sample_expanded {
...
   sflow_data_source_expanded source_id; /* sFlowDataSource */
```
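
As an illustration of the compact encoding quoted in the first snippet (type in the most significant byte, index in the lower three bytes), the following Go sketch unpacks a 32-bit `source_id` into the same two values that the expanded struct carries as separate fields. The `DataSource` type and `DecodeSourceID` function are hypothetical names for this example and are not part of GoFlow2.

```go
package main

import "fmt"

// DataSource is the decoded form of a compact 32-bit sFlow source_id.
type DataSource struct {
	Type  uint32 // 0 = ifIndex, 1 = smonVlanDataSource, 2 = entPhysicalEntry
	Index uint32 // value held in the lower three bytes
}

// DecodeSourceID splits a compact source_id into its type
// (most significant byte) and index (lower 24 bits).
func DecodeSourceID(sourceID uint32) DataSource {
	return DataSource{
		Type:  sourceID >> 24,
		Index: sourceID & 0x00FFFFFF,
	}
}

func main() {
	ds := DecodeSourceID(0x0000002A)
	fmt.Printf("type=%d index=%d\n", ds.Type, ds.Index) // prints "type=0 index=42"

	ds = DecodeSourceID(0x01000064)
	fmt.Printf("type=%d index=%d\n", ds.Type, ds.Index) // prints "type=1 index=100"
}
```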

The MIB contains more details:

      SFlowDataSource ::= TEXTUAL-CONVENTION
              STATUS      current
              DESCRIPTION
                "Identifies a source of sFlow data.

                The following data source types are currently defined:

                - ifIndex.<I>
                SFlowDataSources of this traditional form are called
                'port-based'. Ideally the sampling entity will perform 
                sampling on all flows originating from or destined to 
                the specified interface. However, if the switch architecture 
                only allows input or output sampling then the sampling agent 
                is permitted to only sample input flows input or output flows. 
                Each packet must only be considered once for sampling, 
                irrespective of the number of ports it will be forwarded to.
                Note: Port 0 is used to indicate that all ports on the device
                      are represented by a single data source.
                      - sFlowFsPacketSamplingRate applies to all ports on the
                        device capable of packet sampling.

It seems to be interpreted as the direction in the case of Ethernet, but I'm curious how common that is in the wild.

What's the hardware vendor?

meguoe commented 1 month ago

Hi @lspgn, my hardware vendor is Huawei, and the models are CE8850 and CE6857.

lspgn commented 1 month ago

Thank you. I'd like to wait for now to see if other users have this requirement as well, especially as I'm not sure it's consistently interpreted like this.