utoni / nDPId

Tiny nDPI based deep packet inspection daemons / toolkit.
GNU General Public License v3.0

On the meaning of several Flow-Event JSON attributes #13

Closed: verzulli closed this issue 1 year ago

verzulli commented 2 years ago

While analyzing flow-event data received from nDPId, I'm having some trouble understanding the gory details of some JSON attributes.

I'm going to raise some questions below, each with a tentative answer I guessed from my analysis. Please check them as well.

Note that I'm also available to put them on a dedicated FAQ page once the answers are settled.


Q1: Where can I get the detailed structure of the Flow-Event JSONs that nDPId will send me?

A1: A related schema file can be found in the schema subfolder.

--

Q2: What about the flow_first_seen, flow_src_last_pkt_time and flow_dst_last_pkt_time timestamp attributes? Which timestamps do they refer to?

A2: flow_first_seen is the timestamp recorded by nDPId when it saw the very first packet of a new flow. The flow_src_last_pkt_time and flow_dst_last_pkt_time timestamps, by contrast, are updated continuously as nDPId sees packets belonging to that flow. Depending on the direction of each packet (a request from SRC to DST, or a reply from DST to SRC), nDPId updates flow_src_last_pkt_time or flow_dst_last_pkt_time, respectively.
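
If the reading in A2 is right, the three timestamps can be combined as in the following minimal Python sketch. The helper names and the sample values are made up for illustration; only the attribute names come from the flow-event JSON discussed here.

```python
def last_activity(event: dict) -> int:
    """Return the most recent packet timestamp seen in either direction."""
    return max(event["flow_src_last_pkt_time"], event["flow_dst_last_pkt_time"])


def observed_duration(event: dict) -> int:
    """Time between the first packet of the flow and its latest packet."""
    return last_activity(event) - event["flow_first_seen"]


# Hypothetical values, not taken from a real capture.
sample = {
    "flow_first_seen": 1000,
    "flow_src_last_pkt_time": 1500,
    "flow_dst_last_pkt_time": 1400,
}

print(last_activity(sample), observed_duration(sample))
```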

--

Q3: What about the flow_idle_time attribute? Which time does it refer to?

A3: ...to be filled...

--

Q4: What about the thread_ts_usec timestamp attribute? Which timestamp does it refer to?

A4: ...to be filled...

--

Q5: What about the midstream attribute? It seems its value is always 0...

A5: ...to be filled...

--

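Q6: As for update events, while examining a set of flow-events related to the same flow, I noticed:

* an initial `new` event (as expected)

* a following `not-detected` event (as could be possible)

* lots of following `update` events (in the order of tens...), that seem to be sent at mostly regular intervals (in the order of tens of seconds).

Can you explain when update events are issued, and confirm that thread_ts_usec can be considered the timestamp that nDPId associates with those events?

This is going to be an important question, especially in terms of inter-arrival-time analysis of those update events.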

--

utoni commented 2 years ago

Q3: What about the flow_idle_time attribute? Which time does it refer to?

The idle time is calculated internally and may vary depending on the l3/l4 protocol. It is the maximum time a flow can go without any of its packets being processed before it times out, at which point nDPId sends a flow-idle event. Please keep in mind that a flow can still end (flow-end) or time out (flow-idle) earlier, but not later.

The following statement must be true at any time: max(flow_src_last_pkt_time, flow_dst_last_pkt_time) + flow_idle_time <= thread_ts_usec
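
A small Python sketch of the relation above, which may be handy for checking received events against it. It evaluates the inequality exactly as written and assumes that all four attributes use the same time unit, which is not confirmed in this thread. The function names and sample values are hypothetical.

```python
def idle_deadline(event: dict) -> int:
    """Latest packet timestamp of the flow plus its idle time."""
    last_pkt = max(event["flow_src_last_pkt_time"], event["flow_dst_last_pkt_time"])
    return last_pkt + event["flow_idle_time"]


def stated_relation_holds(event: dict) -> bool:
    """Evaluate max(src, dst) + flow_idle_time <= thread_ts_usec for one event."""
    return idle_deadline(event) <= event["thread_ts_usec"]


# Hypothetical event; values chosen only to exercise the check
# (same unit assumed for every attribute).
event = {
    "flow_src_last_pkt_time": 5_000,
    "flow_dst_last_pkt_time": 7_000,
    "flow_idle_time": 3_000,
    "thread_ts_usec": 10_500,
}

print(idle_deadline(event), stated_relation_holds(event))
```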

Q4: What about the thread_ts_usec timestamp attribute? Which timestamp does it refer to?

It refers to the timestamp coming from libpcap. Unlike global_ts_usec, which should always be the same for every thread, thread_ts_usec is the timestamp of the last packet processed by this thread (after packet distribution).

Q5: What about the midstream attribute? It seems its value is always 0...

It is a good sign if it is always 0, because any other value would mean that you're missing the SYN, SYN-ACK or ACK packets of the TCP handshake.
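
To spot flows where that happened, the events can be filtered on a non-zero midstream value, as in this minimal Python sketch. It assumes each event also carries a `flow_id` attribute identifying its flow; that name is an assumption, as it is not mentioned in this thread, and the sample events are invented.

```python
def midstream_flow_ids(events: list[dict]) -> list:
    """Return the sorted IDs of flows that have any event with midstream != 0."""
    return sorted({e["flow_id"] for e in events if e.get("midstream", 0) != 0})


# Hypothetical events; "flow_id" is an assumed attribute name.
events = [
    {"flow_id": 1, "midstream": 0},
    {"flow_id": 2, "midstream": 1},
    {"flow_id": 2, "midstream": 1},
    {"flow_id": 3},
]

print(midstream_flow_ids(events))
```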

Q6: As for update events, while examining a set of flow-events related to the same flow, I noticed:

* an initial `new` event (as expected)

* a following `not-detected` event (as could be possible)

* lots of following `update` events (in the order of tens...), that seem to be sent at mostly regular intervals (in the order of tens of seconds).

Can you explain when update events are issued, and confirm that thread_ts_usec can be considered the timestamp that nDPId associates with those events?

This is going to be an important question, especially in terms of inter-arrival-time analysis of those update events.

It is safe to use thread_ts_usec for update events. The update algorithm needs some fine tuning: currently, a flow-update is issued for every flow that has not timed out (or ended, in the case of TCP) after flow_idle_time / 4 for non-TCP flows and after flow_idle_time for TCP flows. A flow-update therefore tells your application that a given flow is still sending packets from SRC->DST or DST->SRC.
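
For the inter-arrival-time analysis, the rule described above can be sketched in Python as follows. This is only an illustration: the `flow_event_name` key used to recognise update events is an assumed attribute name, the events are expected to belong to a single flow, and the sample values are invented.

```python
def expected_update_interval(flow_idle_time: float, is_tcp: bool) -> float:
    """Interval between flow-update events per the rule described above:
    flow_idle_time for TCP flows, flow_idle_time / 4 for all other flows."""
    return flow_idle_time if is_tcp else flow_idle_time / 4


def update_inter_arrival(events: list[dict]) -> list[int]:
    """Differences between consecutive thread_ts_usec values of update events.

    'flow_event_name' is an assumed key; pass the events of one flow only.
    """
    stamps = sorted(
        e["thread_ts_usec"] for e in events if e.get("flow_event_name") == "update"
    )
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# Hypothetical events of a single non-TCP flow.
flow_events = [
    {"flow_event_name": "new", "thread_ts_usec": 100},
    {"flow_event_name": "not-detected", "thread_ts_usec": 110},
    {"flow_event_name": "update", "thread_ts_usec": 130},
    {"flow_event_name": "update", "thread_ts_usec": 160},
    {"flow_event_name": "update", "thread_ts_usec": 190},
]

print(update_inter_arrival(flow_events))
print(expected_update_interval(120, is_tcp=False))
```

Comparing the measured gaps with the expected interval gives a quick check of whether the update events of a flow arrive at the cadence described here.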

utoni commented 1 year ago

Please reopen if there is need for clarification.