ahlashkari / CICFlowMeter

CICFlowmeter-V4.0 (formerly known as ISCXFlowMeter) is an Ethernet traffic Bi-flow generator and analyzer for anomaly detection that has been used in many Cybersecurity datasets such as Android Adware-General Malware dataset (CICAAGM2017), IPS/IDS dataset (CICIDS2017), Android Malware dataset (CICAndMal2017) and Distributed Denial of Service (CICDDoS2019).

CICFlowMeter doesn't count subflows correctly #111

Closed fisher85 closed 2 years ago

fisher85 commented 3 years ago

I'm looking at the detectUpdateSubflows(BasicPacketInfo packet) method and I can't understand the logic behind this condition:

if( (packet.getTimeStamp() - (sfLastPacketTS)/(double)1000000) > 1.0)

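As written, only sfLastPacketTS is divided by 1000000, not the time difference, so the closing parenthesis looks misplaced. It may be correct to move it: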

if( (packet.getTimeStamp() - (sfLastPacketTS))/(double)1000000 > 1.0)
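
To make the difference concrete, here is a minimal Java sketch comparing the two groupings. It assumes that timestamps are in microseconds (which the division by 1000000 suggests) and that the intent is to detect a gap of more than one second between packets. The class and method names are hypothetical and are not part of CICFlowMeter.

```java
// Hypothetical helper, not CICFlowMeter code: it isolates the two conditions
// so they can be compared on the same pair of timestamps.
final class SubflowGapCheck {
    // Assumption: timestamps are expressed in microseconds.
    static final double MICROS_PER_SECOND = 1_000_000.0;

    // Grouping from the original condition: only the previous timestamp is
    // divided, so a raw microsecond value is compared against 1.0.
    static boolean originalCondition(long nowMicros, long lastMicros) {
        return (nowMicros - (lastMicros) / MICROS_PER_SECOND) > 1.0;
    }

    // Proposed grouping: the difference is converted to seconds first,
    // then compared against a one-second threshold.
    static boolean proposedCondition(long nowMicros, long lastMicros) {
        return (nowMicros - lastMicros) / MICROS_PER_SECOND > 1.0;
    }

    public static void main(String[] args) {
        long last = 1_600_000_000_000_000L; // an epoch time in microseconds
        long now = last + 200_000L;         // 0.2 seconds later

        System.out.println(originalCondition(now, last)); // true
        System.out.println(proposedCondition(now, last)); // false
    }
}
```

Under these assumptions, the original grouping is true for any realistic epoch timestamp regardless of the gap between packets, while the proposed grouping is true only when the gap exceeds one second.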

ltkhang commented 3 years ago

Yes, I noticed this problem too. I did not even understand what a subflow is. But after analyzing the CSV files from the IDS dataset (which were produced by this tool), I found that the "subflow" attribute was very useful, so I had to keep this logic in my code (here: https://github.com/ltkhang/sdn-ids-ddos-defense/blob/master/basic_flow.py) to produce exactly the same output as the original training data.

mikonnikova commented 2 years ago

This problem was fixed in commit 5df1a62, but without any notification.

fisher85 commented 2 years ago

Yes, the bug has been fixed. This error caused the CICIDS2017 dataset to contain incorrectly labeled data, at least in the subflows column.

CIC datasets are among the most cited in the world, with hundreds of studies using the CICIDS2017 dataset (and others) to train machine learning models. Studies that included the subflows feature in the feature space have therefore obtained incorrect results.

@ahlashkari I propose noting the presence of the error and the date of its correction in the descriptions of the datasets.