hillu / go-yara

Go bindings for YARA
BSD 2-Clause "Simplified" License

ScanFile() increased memory by 300MB after scanning a large file and did not release #156

Open xlango opened 3 months ago

xlango commented 3 months ago

s, err := yara.NewScanner(yaraRules)
if err != nil {
    return matchRuleTypes, err
}

err = s.SetCallback(&matchRuleTypes).ScanFile(path)
if err != nil {
    return matchRuleTypes, err
}

I scanned an 80 MB dockerd binary. After the scan, the process memory had increased by 300 MB and was never released. May I ask why?

xlango commented 3 months ago

I've found that this happens whenever the elf module is used:

import "elf"

rule golang {
    meta:
        description = "test"
    strings:
        $s1 = "gccgo" fullword
    condition:
        (elf.type == elf.ET_EXEC or elf.type == elf.ET_DYN) and all of ($s*)
}

hillu commented 2 months ago

If this goes away when you don't import the elf module in your ruleset, I'd suggest that this is an issue in YARA itself. Is there anything specific about the file you are scanning, or does the same memory leak happen if you scan 300 MB of zeroes? Can you share a file (or point me to a public file) that can be used to demonstrate the issue?

ozanh commented 2 months ago

Can you also share the YARA version you compiled with, @xlango?

xlango commented 2 months ago

I'm on Ubuntu 22.04 with kernel version 5.19. I compiled YARA 4.4.0 and YARA 4.3.2, and both have this problem. The file I scanned is the binary /usr/bin/dockerd. Version information: Docker version 20.10.21, build 20.10.21-0Ubuntu1 ~22.04.3

ozanh commented 2 months ago

I've tested your rule against the dockerd binary on its own, and also against all the files under /usr/bin on Ubuntu 22.04 arm64, with our product that uses YARA 4.5.1 and go-yara@latest.

I didn't see any memory issue. Maybe you didn't call the scanner's Destroy method explicitly, or it is specific to YARA 4.4, but I didn't see anything related to the elf module in the release notes.

Please try calling Destroy() and/or runtime.GC() after scanning to see whether the leak is really that big.
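One way to check this is to compare the process's resident set size with the Go heap before the scan, after it, and after a forced collection. The sketch below is Linux-only and uses only the standard library. The helper names rssKiB, report, and measure are made up for illustration, and the workload in main is a placeholder for the real NewScanner / ScanFile / Destroy sequence. Memory allocated by libyara through cgo is not counted in the Go heap figure, so RSS that stays high while the Go heap shrinks would point at the native side.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"runtime/debug"
	"strconv"
	"strings"
)

// rssKiB reads the VmRSS value (in KiB) from /proc/self/status. Linux only.
func rssKiB() (int64, error) {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		return 0, err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			return strconv.ParseInt(fields[1], 10, 64)
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("VmRSS not found")
}

// report prints the process RSS next to the Go heap in use.
func report(label string) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	rss, err := rssKiB()
	if err != nil {
		fmt.Fprintln(os.Stderr, "rss:", err)
	}
	fmt.Printf("%-12s RSS=%d KiB GoHeapInuse=%d KiB\n", label, rss, m.HeapInuse/1024)
}

// measure runs scan and reports memory before it, after it, and after a
// forced garbage collection that also returns freed Go memory to the OS.
func measure(scan func() error) error {
	report("before")
	err := scan()
	report("after scan")
	runtime.GC()
	debug.FreeOSMemory()
	report("after GC")
	return err
}

func main() {
	// Placeholder workload: in a real test this closure would create the
	// scanner, call ScanFile, and call Destroy before returning.
	err := measure(func() error {
		buf := make([]byte, 64<<20)
		for i := range buf {
			buf[i] = 1 // touch every page so it counts toward RSS
		}
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

If RSS drops back after Destroy() and the forced collection, the growth was memory that had not yet been reclaimed and not a leak.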