HolmesProcessing / gsoc_relationship

WIP: this repository is for organizing the gsoc relationship project
Apache License 2.0

Automatically learn the relationships of malware samples at scale using deep learning techniques #22

Open feuerchop opened 6 years ago

feuerchop commented 6 years ago

The Holmes Project has recently acquired a large dataset of labeled malware artifacts, which can be used for deep-learning-based malware relationship mining. This labeled dataset of over 20k samples should be a big help for students working on malware relationship detection. In addition, as a result of the previous GSoC’18, we also have an efficient data model for malware relationships, so new GSoC students can start on the machine learning part immediately without having to worry about optimal data modeling and distributed storage. As a follow-up project, students are expected to come up with a decent learning model to detect malware relationships and to create a better visualisation frontend.

pabitralenka commented 6 years ago

Hi @feuerchop. I am a senior undergraduate at IIIT Bhubaneswar, India. I am interested in contributing to The Holmes Project. It would be great if you could tell me more about the dataset and give a more concrete explanation of this issue. That would really help me get off to a good start.

Thank you

ayushb666 commented 6 years ago

Hi @feuerchop, I am a master's student at the University of Minnesota, USA. I am interested in contributing to The Holmes Project as part of GSoC 2018. Could you please tell me more about the dataset and what needs to be done?

Thanks

cli0 commented 6 years ago

Hey @ayushb666 and @pabitralenka, I would suggest you guys join the Honeynet GSoC Slack chatroom https://gsoc-slack.honeynet.org/ and the #holmesprocessing channel (the project's channel). That way we can answer all your questions in a place that all interested students can easily access. Virtually all students so far have had this same question. FYI, the mentor is on Slack too :)

ritwikagarwal commented 6 years ago

I can't see any dataset or any mockup of the data model provided by the Holmes project. Anyway, I have written my proposal with the standard data format used in machine learning projects and assignments in mind. Could you have a look at it?

shelldragoon1104 commented 5 years ago

Hi @feuerchop. I am a second-year undergraduate at LNMIIT, India. I am interested in contributing to The Holmes Project as part of GSoC 2019. Could you tell me more about the dataset and give a more concrete explanation of this issue? That would really help me get off to a good start.

Thank you