Closed Sankalp-CS21MTECH12010 closed 1 year ago
@harshith-kotha5084 Write all your updates here. Thanks!
Trained two models (decision trees, using only packet features) on the 'Tuesday PCAP':
1) To predict benign or malicious.
2) To predict short or long flows.
These were the results obtained.
@harshith-kotha5084 These are the packet-feature model accuracies. We now need to check the accuracy using:
(1) Only flow features for all flows.
(2) Flow features for long flows and packet features for short flows.
Then compare them and see whether there is a significant gap.
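A minimal sketch of how setup (2) could route each flow to a model, assuming a packet-count threshold separates short from long flows. The threshold value, the `Flow` fields, and the two stand-in classifiers are hypothetical and are not taken from the experiments in this thread.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical cutoff: flows with at least this many packets count as "long".
LONG_FLOW_MIN_PACKETS = 8

@dataclass
class Flow:
    packet_count: int
    packet_features: List[float]  # per-packet features (assumed layout)
    flow_features: List[float]    # aggregate flow statistics (assumed layout)

# A classifier maps a feature vector to 1 (malicious) or 0 (benign).
Classifier = Callable[[List[float]], int]

def hybrid_predict(flow: Flow, packet_model: Classifier, flow_model: Classifier) -> int:
    """Use the flow-feature model for long flows, the packet-feature model otherwise."""
    if flow.packet_count >= LONG_FLOW_MIN_PACKETS:
        return flow_model(flow.flow_features)
    return packet_model(flow.packet_features)

# Stand-in models for illustration only; real ones would be trained decision trees.
def toy_packet_model(features: List[float]) -> int:
    return int(features[0] > 1000)

def toy_flow_model(features: List[float]) -> int:
    return int(features[0] > 0.5)

flows = [
    Flow(3, [1400.0], [0.1]),   # short flow: decided by the packet model
    Flow(20, [1400.0], [0.1]),  # long flow: decided by the flow model
    Flow(20, [60.0], [0.9]),    # long flow: decided by the flow model
]
predictions = [hybrid_predict(f, toy_packet_model, toy_flow_model) for f in flows]
```

Setup (1) corresponds to calling the flow model on every flow regardless of `packet_count`, so both setups can be scored on the same list of flows.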
@harshith-kotha5084 @praveenabt
AdaFlow ML model performance:
Tuesday Trace: maintaining all flow-feature state for all flows:
Accuracy = 0.9998688949385734
Recall = 0.9908519153802172
Precision = 0.9994232987312572
FPR = 7.817508091120875e-06
FNR = 0.009148084619782733

NetBeacon ML model performance:
Maintaining all flow-feature state only for long flows, and only packet features for short flows:
Accuracy = 0.9899390399050083
Recall = 0.8897318881916715
Precision = 0.8671264367816092
FPR = 0.0113140545822981
FNR = 0.1826811180832858
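For reference, the five reported metrics can all be derived from one confusion matrix. The sketch below uses the standard definitions with malicious as the positive class; the counts are made up for illustration and are not the counts behind the numbers above. Because recall and FNR share the denominator TP + FN, they sum to 1, which is a quick sanity check on any reported pair.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),      # true positive rate
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),         # false positive rate
        "fnr": fn / (tp + fn),         # false negative rate = 1 - recall
    }

# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=90, fp=10, tn=890, fn=10)
recall_plus_fnr = metrics["recall"] + metrics["fnr"]
```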
Accuracy is good in both cases, but I see a drop of about 10-15% in recall and precision. That is higher than for Covert Channels and P2P fingerprinting, where the drop was about 2-3%.
Accuracy still seems to be high because the majority of flows here are benign.
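A small worked example of why accuracy can stay high on a mostly benign trace while recall falls. The flow counts and the 1% malicious share are assumptions chosen for illustration, not measurements from the Tuesday trace.

```python
def accuracy_and_recall(n_benign: int, n_malicious: int,
                        detected_malicious: int, false_alarms: int):
    """Accuracy and recall for a detector, treating malicious as the positive class."""
    tp = detected_malicious
    tn = n_benign - false_alarms
    accuracy = (tp + tn) / (n_benign + n_malicious)
    recall = tp / n_malicious
    return accuracy, recall

# Hypothetical trace: 99,000 benign flows and 1,000 malicious flows.
# The detector misses half of the malicious flows and raises no false alarms.
acc, rec = accuracy_and_recall(n_benign=99_000, n_malicious=1_000,
                               detected_malicious=500, false_alarms=0)
```

Under these assumed counts the detector misses half of the attacks yet still scores 99.5% accuracy, so recall and precision are the more informative numbers to compare across the two setups.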
Observations:
Verify whether differentiating between short and long flows changes the metrics significantly in AdaFlow for the CIC-IDS dataset.