CERT-Polska / malduck

:duck: Malduck is your ducky companion in malware analysis journeys
GNU General Public License v3.0

Deduplicate procmems by sha256 hash #119

Closed: msm-cert closed this 6 months ago

msm-cert commented 7 months ago

This solves our "problems" with binaries that are submitted as repeated copies of themselves, like the result of `b = open("x.exe", "rb").read(); b = b * 100`.

bb4c7d48773b21c62885d0206c9414176c254e6104b4a0ffe730aa570b424948
21a479ce141d62b5920b3f76ece6d2b4a58c7f25afc8751e161a0fdf44a0197f
f5c9c598101a49a5c60f174fe7e7946c3f73c7c51b39ae5120560494d80db168

This is not an elegant fix for many reasons:

It will hopefully stop us from OOMing, but it's not a critical fix.
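
For illustration only, the general idea of deduplicating memory regions by content hash can be sketched as below. This is not the code from this PR and does not use malduck's API; `dedupe_by_sha256`, `payload`, and `regions` are hypothetical names, and the regions are assumed to be plain `bytes` objects.

```python
import hashlib
from typing import Iterable, Iterator


def dedupe_by_sha256(buffers: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each distinct buffer once, keeping first-seen order."""
    seen = set()
    for buf in buffers:
        digest = hashlib.sha256(buf).digest()
        if digest in seen:
            continue  # identical content was already yielded
        seen.add(digest)
        yield buf


# Hypothetical stand-in for 100 identical carved regions plus one distinct one.
payload = b"MZ" + b"\x00" * 62
regions = [payload] * 100 + [b"something else"]
unique = list(dedupe_by_sha256(regions))
print(len(unique))
```

Only the 32-byte digests are retained in the `seen` set, so the extra memory stays small even when the regions themselves are large. Note that this catches byte-identical regions only.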

yankovs commented 7 months ago

Hey! :) I just stumbled upon this PR and wanted to share that I've recently been experimenting with dumps from emulation and came across what I assume is the same behavior: identical samples whose only difference is the image base. I'll be happy to share hashes of such samples if you need them.

psrok1 commented 6 months ago

It looks like it's not really b*100; rather, the binaries were not correctly carved and m was pointing 100 times at the beginning of the buffer. I'm trying to fix this in #122.