fisher85 closed 2 years ago
Yes, I noticed this problem too. At first I did not even understand what a subflow is. But after analyzing the CSV files from the IDS dataset (which were produced by this tool), I found that the "subflow" attribute was very useful, so I had to keep this logic in my code (https://github.com/ltkhang/sdn-ids-ddos-defense/blob/master/basic_flow.py) to reproduce exactly the same values as the original training data.
This problem was fixed in commit 5df1a62 but without any notification.
Yes, the bug has been fixed. This error caused the CICIDS2017 dataset to contain incorrectly labeled data, at least in the subflows column.
CIC datasets are among the most cited in the world, with hundreds of studies using the CICIDS2017 dataset (and others) to train machine learning models. Studies that include the subflows feature in their feature space have obtained incorrect results.
@ahlashkari I propose to note the presence of an error and the date of correction in the description of the datasets.
I'm looking at the detectUpdateSubflows(BasicPacketInfo packet) method and I can't understand the logic behind this condition:
if( (packet.getTimeStamp() - (sfLastPacketTS)/(double)1000000) > 1.0)
It may be correct to move the closing parenthesis:
if( (packet.getTimeStamp() - (sfLastPacketTS))/(double)1000000 > 1.0)
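To make the difference between the two conditions concrete, here is a minimal standalone sketch. It assumes timestamps are in microseconds (suggested by the division by 1000000, but not confirmed here); the class name, method names, and sample timestamp are hypothetical and are not part of the tool's code.

```java
public class SubflowGapCheck {

    // Mirrors the original condition: division binds tighter than subtraction,
    // so only sfLastPacketTs is scaled to seconds before it is subtracted
    // from the unscaled packet timestamp.
    public static boolean originalCondition(long packetTs, long sfLastPacketTs) {
        return (packetTs - sfLastPacketTs / (double) 1000000) > 1.0;
    }

    // Mirrors the proposed condition: the gap between the two timestamps is
    // computed first and then converted to seconds.
    public static boolean proposedCondition(long packetTs, long sfLastPacketTs) {
        return (packetTs - sfLastPacketTs) / (double) 1000000 > 1.0;
    }

    public static void main(String[] args) {
        // Hypothetical timestamp, assumed to be in microseconds.
        long last = 1_500_000_000_000_000L;
        long tenMicrosLater = last + 10;
        long twoSecondsLater = last + 2_000_000;

        // Packets 10 microseconds apart: the original form still reports a gap > 1 s.
        System.out.println(originalCondition(tenMicrosLater, last));  // true
        System.out.println(proposedCondition(tenMicrosLater, last));  // false

        // Packets 2 seconds apart: both forms report a gap > 1 s.
        System.out.println(originalCondition(twoSecondsLater, last)); // true
        System.out.println(proposedCondition(twoSecondsLater, last)); // true
    }
}
```

Under that assumption, the original form is true for essentially any realistic pair of timestamps, while the form with the moved parenthesis is true only when the packets are more than one second apart.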