threaTrace-detector / threaTrace

Concern Regarding Mislabeling in one DARPA TC3 Dataset: "Fivedirections" #12

Open ahmed3amerai opened 4 months ago

ahmed3amerai commented 4 months ago

Hi,

I have a concern related to the labelling of one specific dataset of DARPA TC3, "Fivedirections".
There's an inconsistency in how nodes are labelled as malicious in this dataset: about 80% of the nodes marked as malicious have no events within the attack time frames outlined in the DARPA attack report.

For example, consider the "Thread" node with the UUID "B611669E-E8DD-449C-8157-FCBF9C9BE92E", which is labelled as malicious. It performed 95,471 actions between 2018-04-04 09:58:45 and 2018-04-04 14:27:22. However, the first attack on "Fivedirections" did not begin until 2018-04-09, five days later.

Among the 762 nodes labelled as malicious, only 153 have associated events within the specified attack periods; the remaining 609 (about 80%) have none.
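
For anyone who wants to reproduce this kind of count, here is a minimal sketch of the check. It assumes the events have already been parsed into `(uuid, timestamp)` pairs and that the attack periods from the DARPA report have been entered by hand as `(start, end)` pairs. The function name, the toy data, the second UUID and the window below are hypothetical placeholders, not the real ground truth or the threaTrace parsing code.

```python
from datetime import datetime


def nodes_active_in_windows(events, malicious_uuids, windows):
    """Return the labelled nodes that have at least one event inside any window.

    events          -- iterable of (uuid, datetime) pairs
    malicious_uuids -- set of UUIDs labelled as malicious
    windows         -- list of (start, end) datetime pairs, inclusive
    """
    active = set()
    for uuid, ts in events:
        # Skip unlabelled nodes and nodes already confirmed as active.
        if uuid not in malicious_uuids or uuid in active:
            continue
        if any(start <= ts <= end for start, end in windows):
            active.add(uuid)
    return active


# Placeholder attack window (NOT taken from the DARPA report).
windows = [(datetime(2018, 4, 9, 13, 0), datetime(2018, 4, 9, 14, 0))]

# Toy labels: the UUID from the example above plus one made-up node.
malicious = {"B611669E-E8DD-449C-8157-FCBF9C9BE92E", "HYPOTHETICAL-NODE"}

# Toy events: the first node is only active on 2018-04-04.
events = [
    ("B611669E-E8DD-449C-8157-FCBF9C9BE92E", datetime(2018, 4, 4, 9, 58, 45)),
    ("B611669E-E8DD-449C-8157-FCBF9C9BE92E", datetime(2018, 4, 4, 14, 27, 22)),
    ("HYPOTHETICAL-NODE", datetime(2018, 4, 9, 13, 30)),
    ("UNLABELLED-NODE", datetime(2018, 4, 9, 13, 31)),
]

active = nodes_active_in_windows(events, malicious, windows)
outside = malicious - active  # labelled malicious, but no event in any window
print(len(active), len(outside))
```

Running the same function over the full event set and the 762 labelled UUIDs, with the real attack periods filled in, should show whether the 153 / 609 split holds.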

Do you have any clarifications or comments on this issue?

Regards,