fmadio / pcap2json

High Speed PCAP to JSON conversion utility

SHA1 remove MPLS tag from the calculation #15

Closed fmadio closed 5 years ago

fmadio commented 5 years ago

The idea is to remove the MPLS tag from the flow calculation. However, it means the JSON flow record will no longer be strictly correct, e.g. traffic carrying multiple different MPLS tags will be aggregated into a single flow record.

Is this ok?

navinsaven commented 5 years ago

Assuming the SHA1 hash of each flow is derived from the MAC, IP, Protocol and Port, would it not be a unique flow record?

Perhaps my concern about including the MPLS.0.Label might be unfounded. Let me try to clarify with the following scenario.

Let's say I want to look at the bandwidth utilization of an IP conversation between two hosts, captured at multiple points in the network over the past 5 minutes. I include the hash as part of the workflow to de-duplicate the reported results. Normally, this should produce an accurate report. However, if a network convergence in the middle of the 5-minute window changed the MPLS.0.Label, would the report still be accurate even though the SHA1 hash has changed?

fmadio commented 5 years ago

If the MPLS.0.Label changes, the SHA1 will be different, and no single flow record will contain the full picture.

To confirm, is the inner or the outer MPLS tag expected to change?

For reference, the SHA1 is calculated on this set of data: https://github.com/fmadio/pcap2json/blob/master/flow.c#L115-L131
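
To illustrate the point about the label changing the hash, here is a minimal Python sketch. It is not the code in flow.c: the field names, the `flow_hash` helper, and the way the key is serialized are all assumptions made for illustration. It only shows that a SHA1 over a key that includes the outer label changes when the label changes, while a key limited to MAC, IP, Protocol and Port stays stable.

```python
import hashlib

def flow_hash(fields, include_mpls):
    """Hypothetical flow hash: SHA1 over MAC, IP, protocol and port,
    optionally extended with the outer MPLS label (MPLS.0.Label)."""
    key = [
        fields["mac_src"], fields["mac_dst"],
        fields["ip_src"], fields["ip_dst"],
        fields["proto"],
        fields["port_src"], fields["port_dst"],
    ]
    if include_mpls:
        key.append(fields["mpls0_label"])
    # repr() of the list is an arbitrary but deterministic serialization
    return hashlib.sha1(repr(key).encode()).hexdigest()

# Same conversation before and after a (made-up) network convergence
before = {
    "mac_src": "00:11:22:33:44:55", "mac_dst": "66:77:88:99:aa:bb",
    "ip_src": "10.0.0.1", "ip_dst": "10.0.0.2",
    "proto": 6, "port_src": 40000, "port_dst": 443,
    "mpls0_label": 100,
}
after = dict(before, mpls0_label=200)

with_label_changed = flow_hash(before, True) != flow_hash(after, True)
without_label_same = flow_hash(before, False) == flow_hash(after, False)
print(with_label_changed, without_label_same)
```

With the label in the key, the two halves of the conversation land in different flow records; without it, they share one hash.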

navinsaven commented 5 years ago

It’s usually the outer label that changes (MPLS.0.Label).

navinsaven commented 5 years ago

So just to clarify, using MAC, IP, Protocol and Port should be sufficient to ensure the flow is unique, correct?

fmadio commented 5 years ago

Yes, correct. OK, will remove MPLS.0.Label from the hash calc for the next release.

fmadio commented 5 years ago

MPLS.0 has been removed from the hash calculation. In reviewing the code, both the MPLS tag and the traffic class are used in the hash. Is that OK, or should it be tag only, ignoring the traffic class setting?

navinsaven commented 5 years ago

The MAC address ensures the flow is unique so it's also fine to remove traffic class from the hash calculation. Thanks.
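
A rough Python sketch of the de-duplication scenario discussed in this thread, under stated assumptions: the packet dictionaries, the `flow_key` and `aggregate` helpers, and the byte counts are hypothetical and do not reflect pcap2json's actual JSON output or hashing code. It shows that when neither the MPLS label nor the traffic class is part of the key, packets seen before and after a label change add up in a single flow record.

```python
import hashlib
from collections import defaultdict

def flow_key(pkt):
    """Hypothetical key: MAC, IP, protocol and port only.
    MPLS label and traffic class are deliberately left out."""
    key = (
        pkt["mac_src"], pkt["mac_dst"],
        pkt["ip_src"], pkt["ip_dst"],
        pkt["proto"], pkt["port_src"], pkt["port_dst"],
    )
    return hashlib.sha1(repr(key).encode()).hexdigest()

def aggregate(packets):
    """Sum bytes per flow key."""
    totals = defaultdict(int)
    for pkt in packets:
        totals[flow_key(pkt)] += pkt["length"]
    return dict(totals)

base = {
    "mac_src": "00:11:22:33:44:55", "mac_dst": "66:77:88:99:aa:bb",
    "ip_src": "10.0.0.1", "ip_dst": "10.0.0.2",
    "proto": 6, "port_src": 40000, "port_dst": 443,
}
packets = [
    dict(base, mpls_label=100, mpls_tc=0, length=1500),
    dict(base, mpls_label=100, mpls_tc=0, length=500),
    # after convergence: different outer label and traffic class
    dict(base, mpls_label=200, mpls_tc=5, length=1000),
]

totals = aggregate(packets)
print(len(totals), sum(totals.values()))
```

The trade-off is the one raised at the start of the thread: the per-label breakdown is no longer visible in the flow record.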

fmadio commented 5 years ago

closing