frenky-strasak / My_bachelor_thesis

My bachelor thesis is about detecting malware by machine learning.

Which datasets contain malicious HTTPS traffic? #1

Open hawkinsw opened 5 years ago

hawkinsw commented 5 years ago

I just finished reading your thesis -- wonderful! Great job!

I just have a quick question, if you don't mind:

How did you determine which of the CTU-13 and MCFP captures are from malware that communicates via HTTPS? Did you simply download the entire set of captures and assume that any HTTPS connections were spawned by malware communicating over HTTPS? That seems like a reasonable method. I am just curious!

Again, great research and thank you for making your code available so that we can recreate your findings!

Will

frenky-strasak commented 5 years ago

Thanks,

Good question. Each capture in the CTU-13 and MCFP datasets comes with a known infected IP address. So I read each capture flow by flow, and if the source IP of a flow equals the infected IP, that flow is labeled as malicious.
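
As a rough illustration of that rule, here is a minimal Python sketch. It is not the code from the thesis: the function name `label_flows`, the `src_ip` key, and the `malicious` flag are all hypothetical, and it assumes the flows have already been parsed into dictionaries. How flows that do not match the infected IP are treated is not covered above, so the sketch only marks them as not matching.

```python
def label_flows(flows, infected_ips):
    """Mark each flow whose source IP is one of the capture's infected IPs.

    `flows` is an iterable of dicts with a (hypothetical) "src_ip" key;
    `infected_ips` lists the infected addresses documented for the capture.
    """
    infected = set(infected_ips)
    labeled = []
    for flow in flows:
        # Rule from the reply above: source IP == infected IP -> malicious flow.
        is_malicious = flow["src_ip"] in infected
        # Copy the flow so the caller's data is left untouched.
        labeled.append({**flow, "malicious": is_malicious})
    return labeled


# Example with made-up addresses (not taken from any real capture).
example_flows = [
    {"src_ip": "10.0.0.5", "dst_ip": "93.184.216.34", "dst_port": 443},
    {"src_ip": "10.0.0.9", "dst_ip": "93.184.216.34", "dst_port": 443},
]
example_labeled = label_flows(example_flows, ["10.0.0.5"])
```

Using a set for the infected addresses keeps the lookup cheap when a capture documents more than one infected host.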