ocsf / ocsf-schema

OCSF Schema
Apache License 2.0

TCP Flags should be directional #1033

Open arunj18 opened 7 months ago

arunj18 commented 7 months ago

I was looking at the data dictionary on https://schema.ocsf.io/1.1.0/dictionary?extensions= and noticed that the tcp_flags are set on the Network Connection Information object with no directionality specified.

My understanding is TCP flags are directional and should exist on the Network Traffic object since it is directional in nature.

See https://www.site24x7.com/learn/linux/tcp-flags.html for an illustration of how TCP flags are set on each packet. In systems that log network traffic, the TCP flags received by the system and the TCP flags sent by the system will differ.
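To make the directionality concrete, here is a minimal sketch that decodes a TCP flags byte into flag names. The bit positions are the standard TCP header values; the `decode_tcp_flags` helper and the two sample values are illustrative only and are not part of OCSF.

```python
# Standard TCP header flag bits (low 8 bits of the flags field).
TCP_FLAG_BITS = {
    0x01: "FIN",
    0x02: "SYN",
    0x04: "RST",
    0x08: "PSH",
    0x10: "ACK",
    0x20: "URG",
    0x40: "ECE",
    0x80: "CWR",
}


def decode_tcp_flags(value: int) -> list[str]:
    """Return the names of the flags set in a TCP flags bitmask."""
    return [name for bit, name in TCP_FLAG_BITS.items() if value & bit]


# During a handshake, the two directions carry different flags:
client_to_server = 0x02  # SYN sent by the client
server_to_client = 0x12  # SYN+ACK sent back by the server

print(decode_tcp_flags(client_to_server))
print(decode_tcp_flags(server_to_client))
```

A single tcp_flags value with no direction cannot distinguish these two cases once they are merged into one record.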

pagbabian-splunk commented 6 months ago

The network_connection_info object does have a direction_id to go along with the tcp_flags attribute. Do you want to add tcp_flags with direction to network_traffic as well? These objects are often paired together; for example, all the Network Activity classes have both.

arunj18 commented 6 months ago

Hmm, looking at it more, there appear to be inconsistencies between the Network Traffic and Network Connection Information objects. Network Traffic accounts for bidirectional flow of traffic via fields like bytes_in and bytes_out, but each Network Connection Information object accounts for only a unidirectional flow of traffic.

How do we model bidirectional flow of traffic if we are to use the Network Connection Info?

Adding a direction_id to Network Traffic would make its existing fields invalid in the model, since fields such as bytes_in and bytes_out already carry a direction.

pagbabian-splunk commented 6 months ago

We discussed this in the Network meeting today. Perhaps you can supply us with an example of a log you have in mind? The current objects used with the Network Activity classes seem to work as designed for the use cases we have so far.

mlartz commented 6 months ago

Unfortunately our data is internal, but it is similar to an IPFIX biflow using Direction by Perimeter: https://datatracker.ietf.org/doc/html/rfc5103. IPFIX biflows provide directional containers (fwd/rev) within the flow to hold the various packet metadata. In this case, we have the tcpControlBits (https://www.rfc-editor.org/rfc/rfc9565.html) for the fwd and rev directions in our IPFIX biflow. How would we represent these TCP flags in OCSF? In our scenario, we were thinking they likely belong with the other directional fields (bytes, packets, etc.) in the Network Traffic object.
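As a rough sketch of what "directional containers" means here, the following accumulates per-packet metadata into separate fwd and rev buckets. It assumes the IPFIX tcpControlBits semantics, where a bit is set if any packet in that direction had it set (a cumulative OR). The `DirectionStats` and `build_biflow` names, the packet tuples, and the use of the initiator address to pick the direction are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class DirectionStats:
    """Counters for one direction of a biflow (hypothetical structure)."""
    packets: int = 0
    octets: int = 0
    tcp_control_bits: int = 0

    def add(self, length: int, flags: int) -> None:
        self.packets += 1
        self.octets += length
        # Cumulative OR: a bit is set if any packet in this direction set it.
        self.tcp_control_bits |= flags


def build_biflow(packets, initiator: str) -> dict:
    """Split (src_ip, length, tcp_flags) tuples into fwd/rev buckets."""
    fwd, rev = DirectionStats(), DirectionStats()
    for src, length, flags in packets:
        (fwd if src == initiator else rev).add(length, flags)
    return {"fwd": fwd, "rev": rev}


# Made-up packets for illustration: (source IP, length in bytes, TCP flags).
packets = [
    ("10.0.0.1", 60, 0x02),   # SYN from the initiator
    ("10.0.0.2", 60, 0x12),   # SYN+ACK from the responder
    ("10.0.0.1", 52, 0x10),   # ACK from the initiator
    ("10.0.0.1", 200, 0x18),  # PSH+ACK from the initiator
]

biflow = build_biflow(packets, initiator="10.0.0.1")
print(biflow["fwd"])
print(biflow["rev"])
```

Each direction ends up with its own flags value alongside its own byte and packet counts, which is the pairing the question is about.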

pagbabian-splunk commented 6 months ago

You might be able to dummy out anything considered sensitive. We have been discussing this on the weekly Network call, and without an example that counters it (I looked online), it should be mapped like VPC Flow Logs, which so far have used the Network Activity class. The direction is indicated within Network Connection Information, as are the flags (hence the flags would be for that direction). We should have labeled the object Network Information so as not to imply it must be a connection (since IPFIX is connectionless). Otherwise, a non-connection-based class could be considered, but I wouldn't want implementations that have already mapped flow logs to Network Activity to feel they should remap.

mlartz commented 6 months ago

The issue is that VPC Flow Logs are (roughly) unidirectional flows, not biflows. I'll see if we can sanitize our data and provide an example.

mlartz commented 5 months ago

Here is a heavily redacted example of our data represented in JSON. We are collecting "biflows" from AWS EC2 instances, that is, roughly 1-minute aggregations of data keyed by the 5-tuple. You can see in the flow section that we have the local/remote IPv4 address and port (relative to the EC2 instance) and the protocol for the 5-tuple, as well as a set of in/out (also relative to the EC2 instance) metadata. For this case, we are specifically trying to model the tcp_flags_in and tcp_flags_out values, which are a bitwise ANDing of the various flags seen in each direction for this "minute". This is very analogous to IPFIX biflows, as mentioned above, with the minor difference that we use in/out and IPFIX uses fwd/rev: https://datatracker.ietf.org/doc/html/rfc5103

{
  "metadata": {
    "account_id": "012345678901",
    "instance_id": "i-0aabbccddeeff001122",
    "region": "us-east-1",
    "start_time": 1717182549,
    "end_time": 1717182580
  },
  "flow": {
    "local_ip": 167810349,
    "local_port": 32846,
    "remote_ip": 174465028,
    "remote_port": 443,
    "protocol": 6,

    "bytes_in": 520,
    "packets_in": 10,
    "tcp_flags_in": 0,

    "bytes_out": 2715,
    "packets_out": 3,
    "tcp_flags_out": 0
  }
}

arunj18 commented 4 months ago

@pagbabian-splunk Were there any further discussions on the TCP flags? @mlartz provided a sample record above.
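For reference, one way to read the suggestion made earlier in the thread (direction and flags both living in Network Connection Information) is to split the sample biflow into two unidirectional records. The sketch below is only an assumption about how that could look: the `split_biflow` helper is hypothetical, the output dictionaries are loosely shaped after OCSF attribute names and are not a validated mapping, and the direction_id values 1 (Inbound) and 2 (Outbound) should be checked against the schema version in use.

```python
import ipaddress

# The "flow" section of the sample record above.
flow = {
    "local_ip": 167810349,
    "local_port": 32846,
    "remote_ip": 174465028,
    "remote_port": 443,
    "protocol": 6,
    "bytes_in": 520,
    "packets_in": 10,
    "tcp_flags_in": 0,
    "bytes_out": 2715,
    "packets_out": 3,
    "tcp_flags_out": 0,
}


def split_biflow(flow: dict) -> tuple[dict, dict]:
    """Split one in/out biflow into two unidirectional records (a sketch)."""
    # The sample stores IPv4 addresses as integers.
    local = {"ip": str(ipaddress.IPv4Address(flow["local_ip"])),
             "port": flow["local_port"]}
    remote = {"ip": str(ipaddress.IPv4Address(flow["remote_ip"])),
              "port": flow["remote_port"]}

    def one_way(src, dst, direction_id, suffix):
        return {
            "src_endpoint": src,
            "dst_endpoint": dst,
            "connection_info": {
                "direction_id": direction_id,  # assumed: 1 Inbound, 2 Outbound
                "protocol_num": flow["protocol"],
                "tcp_flags": flow[f"tcp_flags_{suffix}"],
            },
            "traffic": {
                "bytes": flow[f"bytes_{suffix}"],
                "packets": flow[f"packets_{suffix}"],
            },
        }

    # "in" is traffic arriving at the EC2 instance, "out" is traffic leaving it.
    inbound = one_way(remote, local, 1, "in")
    outbound = one_way(local, remote, 2, "out")
    return inbound, outbound


inbound, outbound = split_biflow(flow)
print(inbound)
print(outbound)
```

The trade-off is that one source record becomes two events that have to be correlated again downstream, which is the cost the request for directional flags on Network Traffic is trying to avoid.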