Deep Learning - IDS
Towards Developing a Network Intrusion Detection System using Deep Learning Techniques
Introduction
In this project, we aim to explore the capabilities of various deep-learning frameworks in detecting
and classifying network intursion traffic with an eye towards designing a ML-based intrusion detection system.
Dataset
- Downloaded from: https://www.unb.ca/cic/datasets/ids-2018.html
- contains: 7 csv preprocessed and labelled files, top feature selected files, original traffic data in pcap format and logs
- used csv preprocessed and labelled files for this research project
Data Cleanup
- dropped rows with Infinitiy values
- some files had repeated headers; dropped those
- converted timestamp value that was date time format: 15-2-2018 to UNIX epoch since 1/1/1970
- separated data based on attack types for each data file
- ~20K rows were removed as a part of data cleanup
- see data_cleanup.py script for this phase
- # Samples in table below are total samples left in each dataset after dropping # Dropped rows/samples
Dataset Summary
File Name |
Traffic Type |
# Samples |
# Dropped |
02-14-2018.csv |
Benign |
663,808 |
3818 |
|
FTP-BruteForce |
193,354 |
6 |
|
SSH-Bruteforce |
187,589 |
0 |
02-15-2018.csv |
Benign |
988,050 |
8027 |
|
DOS-GoldenEye |
41,508 |
0 |
|
DOS-Slowloris |
10,990 |
0 |
02-16-2018.csv |
Benign |
446,772 |
0 |
|
Dos-SlowHTTPTest |
139,890 |
0 |
|
DoS-Hulk |
461,912 |
0 |
02-22-2018.csv |
Benign |
1,042,603 |
5610 |
|
BruteForce-Web |
249 |
0 |
|
BruteForce-XSS |
79 |
0 |
|
SQL-Injection |
34 |
0 |
02-23-2018.csv |
Benign |
1,042,301 |
5708 |
|
BruteForce-Web |
362 |
0 |
|
BruteForce-XSS |
151 |
0 |
|
SQL-Injection |
53 |
0 |
03-01-2018.csv |
Benign |
235,778 |
2259 |
|
Infiltration |
92,403 |
660 |
03-02-2018.csv |
Benign |
758,334 |
4050 |
|
BotAttack |
286,191 |
0 |
Traffic Type |
# Samples |
Benign |
5,177,646 |
FTP-BruteForce |
193,354 |
SSH-BruteForce |
187,589 |
DOS-GoldenEye |
41,508 |
Dos-Slowloris |
10,990 |
Dos-SlowHTTPTest |
139,890 |
Dos-Hulk |
461,912 |
BruteForce-Web |
611 |
BruteForce-XSS |
230 |
SQL-Injection |
87 |
Infiltration |
92,403 |
BotAttack |
286,191 |
Total Attack |
1,414,765 |
Deep Learning Frameworks
- perfomance results using various deep learning frameworks are compared
- 10-fold cross-validation techniques was used to validate the model
FastAI
Keras
Experiment Results
Using Salamander.ai
Dataset |
Framework |
Accuracy (%) |
Std-Dev |
GPU Time (~mins) |
02-14-2018 |
FastAI |
99.85 |
0.07 |
* |
|
Keras-TensorFlow |
98.80 |
* |
* |
|
Keras-Theano |
* |
* |
* |
02-15-2018 |
FastAI |
99.98 |
0.01 |
25 |
|
Keras-Tensorfflow |
99.32 |
* |
* |
|
Keras-Theano |
* |
* |
* |
02-16-2018 |
FastAI |
100.00 |
0.00 |
16 |
|
Keras-TensorFlow |
99.84 |
* |
* |
|
Keras-Theano |
* |
* |
* |
02-22-2018 |
FastAI |
99.87 |
0.15 |
110 |
|
Keras-TensorFlow |
99.97 |
* |
* |
|
Keras-Theano |
* |
* |
* |
02-23-2018 |
FastAI |
99.92 |
0.00 |
120 |
|
Keras-TensorFlow |
99.94 |
* |
* |
|
Keras-Theano |
* |
* |
* |
03-01-2018 |
FastAI |
87.00 |
0.00 |
5 |
|
Keras-TensorFlow |
72.20 |
* |
* |
|
Keras-Theano |
* |
* |
* |
03-02-2018 |
FastAI |
99.97 |
.01 |
75 |
|
Keras-TensorFlow |
98.12 |
* |
* |
|
Keras-Theano |
* |
* |
* |
=== |
=== |
=== |
=== |
=== |
Multiclass |
Keras-TensorFlow |
94.73 |
* |
* |
|
Keras-Theano |
* |
* |
* |
|
FastAI |
* |
* |
* |
Binaryclass |
Keras-TensorFlow |
94.40 |
* |
* |
|
Keras-Theano |
* |
* |
* |
|
FastAI |
* |
* |
* |
FastAI Results
Summary Results
Data File |
Accuracy |
Loss |
02-14-2018.csv |
99.99% |
0.00212 |
02-15-2018.csv |
99.86% |
0.02500 |
02-16-2018.csv |
99.97% |
324160 |
02-22-2018.csv |
99.97% |
0.00221 |
02-23-2018.csv |
99.82% |
0.06295 |
03-01-2018.csv |
87.14% |
0.37611 |
03-02-2018.csv |
99.72% |
0.85127 |
IDS-2018-binaryclass.csv* |
98.68% |
0.37692 |
IDS-2018-multiclass.csv* |
98.31% |
7.06169 |
* Trained on VMgpu
Confusion Matrices
02-14-2018 |
02-15-2018 |
02-16-2018 |
|
|
|
02-22-2018 |
02-23-2018 |
03-01-2018 |
|
|
|
03-02-2018 |
IDS-2018-binaryclass |
IDS-2018-multiclass |
|
|
|
Attack Sample Distribution and Detection Accuracy
Data File |
% of Attack Samples |
% Attacks Flagged Correctly |
% Benign Flagged Incorrectly |
02-14-2018 |
36.46 |
100.00 |
0.00* |
02-15-2018 |
5.04 |
99.85 |
0.00* |
02-16-2018 |
57.39 |
100.00 |
0.00* |
02-22-2018 |
0.00* |
0.02 |
0.00 |
02-23-2018 |
0.00* |
61.61 |
0.00* |
03-01-2018 |
28.16 |
73.19 |
10.16 |
03-02-2018 |
27.40 |
99.85 |
0.00* |
Binary-Class |
21.50 |
94.60 |
0.21 |
Multi-Class |
21.50 |
93.9 |
0.48 |
* Small, non-zero values
Using VMgpu
Dataset |
Framework |
Accuracy (%) |
Std-Dev |
GPU Time (~mins) |
02-14-2018 |
FastAI |
99.54 |
0.32 |
100.36 |
|
Keras-TensorFlow |
99.14 |
* |
100.29 |
|
Keras-Theano |
98.58 |
* |
* |
02-15-2018 |
FastAI |
99.98 |
0.01 |
103.16 |
|
Keras-TensorFlow |
99.33 |
* |
96.34 |
|
Keras-Theano |
99.17 |
* |
* |
02-16-2018 |
FastAI |
99.66 |
0.25 |
104.51 |
|
Keras-TensorFlow |
99.66 |
* |
99.59 |
|
Keras-Theano |
99.41 |
* |
* |
02-22-2018 |
FastAI |
99.90 |
0.09 |
102.83 |
|
Keras-TensorFlow |
99.97 |
* |
96.71 |
|
Keras-Theano |
99.97 |
* |
* |
02-23-2018 |
FastAI |
99.88 |
0.08 |
104.43 |
|
Keras-TensorFlow |
95.95 |
* |
100.79 |
|
Keras-Theano |
99.95 |
* |
* |
03-01-2018 |
FastAI |
86.47 |
0.78 |
33.23 |
|
Keras-TensorFlow |
72.16 |
* |
33.15 |
|
Keras-Theano |
72.04 |
* |
* |
03-02-2018 |
FastAI |
99.94 |
0.04 |
104.34 |
|
Keras-TensorFlow |
98.47 |
* |
105.95 |
|
Keras-Theano |
93.95 |
* |
* |
=== |
=== |
=== |
=== |
=== |
Multiclass |
FastAI |
98.60 |
0.16 |
683.12 |
|
Keras-TensorFlow |
92.09 |
* |
652.89 |
|
Keras-Theano |
* |
* |
* |
Binaryclass |
FastAI |
98.75 |
0.05 |
675.98 |
|
Keras-TensorFlow |
95.14 |
* |
632.36 |
|
Keras-Theano |
* |
* |
* |
fastai CPU vs GPU training time on vmGPU
Dataset |
Hardware |
Accuracy (%) |
Time (~mins) |
02-14-2018 |
|
|
|
|
CPU |
99.86 |
1193.84 |
|
GPU |
99.54 |
100.36 |
02-15-2018 |
|
|
|
|
CPU |
99.93 |
1299.55 |
|
GPU |
99.89 |
103.16 |
02-16-2018 |
|
|
|
|
CPU |
99.88 |
433.63 |
|
GPU |
99.66 |
104.51 |
02-22-2018 |
|
|
|
|
CPU |
99.83 |
3091.34 |
|
GPU |
99.90 |
102.83 |
02-23-2018 |
|
|
|
|
CPU |
99.83 |
1938.74 |
|
GPU |
99.88 |
104.43 |
03-01-2018 |
|
|
|
|
CPU |
85.39 |
80.07 |
|
GPU |
86.47 |
33.23 |
03-02-2018 |
|
|
|
|
CPU |
99.76 |
1503.18 |
|
GPU |
99.94 |
104.34 |
=== |
=== |
=== |
=== |
Multiclass |
|
|
|
|
CPU |
96.63 |
19361.95 |
|
GPU |
98.60 |
683.12 |
Binaryclass |
|
|
|
|
CPU |
96.66 |
19441.55 |
|
GPU |
98.75 |
632.36 |
References
- Iman Sharafaldin, Arash Habibi Lashkari, and Ali A. Ghorbani, “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization”, 4th International Conference on Information Systems Security and Privacy (ICISSP), Portugal, January 2018