networked-systems-iith / AdaFlow

AdaFlow: An Efficient In-Network Cache for Intrusion Detection using Programmable Data Planes
MIT License

Short Flow/Long Flow analysis of AdaFlow #14

Closed Sankalp-CS21MTECH12010 closed 1 year ago

Sankalp-CS21MTECH12010 commented 1 year ago

Verify whether differentiating between short flows and long flows significantly changes the metrics in AdaFlow on the CIC-IDS dataset.

Sankalp-CS21MTECH12010 commented 1 year ago

@harshith-kotha5084 Write all your updates here. Thanks!

harshith-kotha5084 commented 1 year ago

Trained two models (decision trees, using packet features only) on the 'Tuesday PCAP':

1) To predict benign or malicious. (Screenshot 2023-06-14 092517)

2) To predict short or long flows. (Screenshot 2023-06-14 092540)

These were the results obtained.

Sankalp-CS21MTECH12010 commented 1 year ago

@harshith-kotha5084 These are the accuracies of the packet-feature models. We now need to check the accuracy using:

(1) Only flow features for all flows
(2) Flow features for long flows and packet features for short flows

Then compare the two and see whether there is a significant gap.
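A minimal sketch of how the two setups could be compared, assuming per-flow predictions from both models are already available. The `Flow` fields, the `LONG_FLOW_THRESHOLD` cutoff, and the toy flows are all hypothetical and are not taken from the AdaFlow code or the CIC-IDS data:

```python
from dataclasses import dataclass


@dataclass
class Flow:
    n_packets: int    # packets observed in the flow
    label: int        # ground truth: 1 = malicious, 0 = benign
    flow_pred: int    # prediction of the flow-feature model
    packet_pred: int  # prediction of the packet-feature model


# Hypothetical cutoff separating short flows from long flows.
LONG_FLOW_THRESHOLD = 8


def hybrid_prediction(flow, threshold=LONG_FLOW_THRESHOLD):
    """Setup (2): flow features for long flows, packet features for short flows."""
    if flow.n_packets >= threshold:
        return flow.flow_pred
    return flow.packet_pred


def accuracy(flows, predict):
    """Fraction of flows whose prediction matches the ground-truth label."""
    correct = sum(1 for f in flows if predict(f) == f.label)
    return correct / len(flows)


# Toy flows, made up purely to exercise the code.
flows = [
    Flow(n_packets=20, label=1, flow_pred=1, packet_pred=0),
    Flow(n_packets=3, label=0, flow_pred=0, packet_pred=1),
    Flow(n_packets=50, label=0, flow_pred=0, packet_pred=0),
    Flow(n_packets=2, label=1, flow_pred=1, packet_pred=1),
]

all_flow_acc = accuracy(flows, lambda f: f.flow_pred)  # setup (1)
hybrid_acc = accuracy(flows, hybrid_prediction)        # setup (2)
print(all_flow_acc, hybrid_acc)
```

The same routing function can be reused for recall and precision, so that both setups are scored on exactly the same set of flows.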

Sankalp-CS21MTECH12010 commented 1 year ago

@harshith-kotha5084 @praveenabt

AdaFlow ML model performance:

Tuesday Trace, maintaining all flow feature state for all flows:

Accuracy = 0.9998688949385734
Recall = 0.9908519153802172
Precision = 0.9994232987312572
FPR = 7.817508091120875e-06
FNR = 0.009148084619782733

NetBeacon ML model performance:

Maintaining all flow feature state for only long flows and only packet features for short flows:

Accuracy = 0.9899390399050083
Recall = 0.8897318881916715
Precision = 0.8671264367816092
FPR = 0.0113140545822981
FNR = 0.1826811180832858

Accuracy is good in both cases, but I see a drop of roughly 10-15% in recall and precision. That is larger than for Covert Channels and P2P fingerprinting, where the drop was about 2-3%.
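For reference, the size of the drop can be computed directly from the recall and precision values posted above. A minimal sketch; the dictionary names are only labels for the two configurations:

```python
# Recall and precision as posted in this thread.
adaflow = {"recall": 0.9908519153802172, "precision": 0.9994232987312572}
netbeacon = {"recall": 0.8897318881916715, "precision": 0.8671264367816092}

# Absolute drop in percentage points for each metric.
drops = {m: (adaflow[m] - netbeacon[m]) * 100 for m in adaflow}

for metric, drop in drops.items():
    print(f"{metric}: {drop:.2f} percentage points lower")
```

With these inputs the drop is about 10.1 points for recall and about 13.2 points for precision.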

Accuracy still appears high because the majority of flows here are benign.
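A small sketch of why accuracy can stay high on an imbalanced trace while recall falls. The confusion-matrix counts below are hypothetical, chosen only to mimic a mostly benign trace; they are not the counts behind the numbers in this thread:

```python
def metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (tp + fn),
    }


# Hypothetical trace: 100 malicious flows among 10,000 in total.
m = metrics(tp=80, fp=10, tn=9890, fn=20)

for name, value in m.items():
    print(f"{name} = {value:.4f}")
```

In this toy case accuracy is 0.997 even though one in five malicious flows is missed, because the 9,890 true negatives dominate the numerator. Recall and FNR share the same denominator, so they always sum to 1. That holds for the all-flow-features numbers above, but the recall and FNR posted for the long/short split sum to about 1.072, so one of those two values may have been mistyped.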

praveenabt commented 1 year ago

Observations: