stratosphereips / StratosphereLinuxIPS

Slips, a free software behavioral Python intrusion prevention system (IDS/IPS) that uses machine learning to detect malicious behaviors in the network traffic. Stratosphere Laboratory, AIC, FEL, CVUT in Prague.

ML train format issue #812

Closed whale-withme closed 3 months ago

whale-withme commented 3 months ago

Hi! As far as I know, most machine learning models are not trained on .pcap files, and most datasets I have found use formats like .csv. Could you tell me how to train with datasets like KDD99?

eldraco commented 3 months ago

Hi @whale-withme. To train any model on pcap data, you first need to convert the pcap to some other format: CSV, TSV, or whatever you prefer. In Slips, to train the ML models on a dataset, you need to be sure the dataset is in a format Slips can read. Slips currently accepts these inputs: https://stratospherelinuxips.readthedocs.io/en/develop/usage.html#reading-the-input

If your dataset is in any of those formats, you can use Slips.
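The "convert to some other format" step can be sketched in Python as below. This is only a minimal illustration that re-encodes CSV records as TSV using the standard library. The function name `csv_to_tsv` and the column names in the sample are hypothetical, and the output is not claimed to match any input format Slips actually reads; check the input documentation linked above for the real requirements.

```python
import csv
import io


def csv_to_tsv(csv_text: str) -> str:
    """Re-encode comma-separated records as tab-separated records."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # Each row is copied unchanged; only the field delimiter differs.
        writer.writerow(row)
    return out.getvalue()


# Hypothetical flow records. The column names are illustrative only
# and are NOT the schema Slips expects.
sample = (
    "ts,saddr,sport,daddr,dport,proto,bytes\n"
    "1700000000.0,192.0.2.10,51514,198.51.100.7,443,tcp,1520\n"
)
print(csv_to_tsv(sample))
```

Changing the delimiter is the easy part. Whether a given dataset carries the fields a particular tool needs still has to be checked against that tool's documentation.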

To train the mlflow module, follow https://stratospherelinuxips.readthedocs.io/en/develop/training.html

To train the ML CC detection module, follow this guide (currently only in the fix-rnn-detection branch): https://github.com/stratosphereips/StratosphereLinuxIPS/tree/fix-rnn-detection/modules/rnn_cc_detection#training-of-models-to-do-command-and-control-detection

whale-withme commented 3 months ago

Thanks! Could you please tell me the difference between the ML CC detection module and the mlflow module? I thought there was only the mlflow module before.

eldraco commented 3 months ago

The mlflow module detects malicious flows by looking at the whole flow information using Zeek data. It is a regular neural network (NN). The RNN CC detection module detects command and control in more complex ways. Please read the documentation here: https://stratospherelinuxips.readthedocs.io/en/develop/detection_modules.html
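To make the distinction concrete, here is a rough sketch of the two input shapes involved. A per-flow classifier sees one independent feature vector per flow, while a recurrent model consumes an ordered sequence of observations. The records, feature choices, and function names are all hypothetical. This is not how either Slips module builds its features; the linked documentation describes the real approach.

```python
from collections import defaultdict

# Hypothetical flow records:
# (src_ip, dst_ip, dst_port, proto, duration_seconds, total_bytes)
flows = [
    ("192.0.2.10", "198.51.100.7", 443, "tcp", 0.8, 1520),
    ("192.0.2.10", "198.51.100.7", 443, "tcp", 0.7, 1498),
    ("192.0.2.10", "203.0.113.9", 53, "udp", 0.1, 90),
    ("192.0.2.10", "198.51.100.7", 443, "tcp", 0.9, 1510),
]


def per_flow_samples(records):
    """One independent feature vector per flow (feed-forward style input)."""
    return [[duration, size] for (_, _, _, _, duration, size) in records]


def per_connection_sequences(records):
    """Ordered feature sequences keyed by (src, dst, port, proto),
    the kind of input shape a sequence model such as an RNN consumes."""
    sequences = defaultdict(list)
    for src, dst, port, proto, duration, size in records:
        sequences[(src, dst, port, proto)].append([duration, size])
    return dict(sequences)


print(len(per_flow_samples(flows)))          # number of independent samples
print(len(per_connection_sequences(flows)))  # number of sequences
```

In the first shape each flow is judged on its own. In the second, the order and repetition of flows within one connection become part of the input.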

whale-withme commented 3 months ago

Thanks! I will read the docs carefully!