VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License
8.33k stars 1.45k forks source link

feat: handle SIGBUS / SIGSEGV only when originating from YARA's memory blocks #2020

Closed secDre4mer closed 11 months ago

secDre4mer commented 11 months ago

YARA currently registers its own signal handler unless instructed otherwise with SCAN_FLAGS_NO_TRYCATCH. This is dangerous for several reasons:

This PR tries to improve this situation by adding several safety mechanisms when handling signals:

secDre4mer commented 11 months ago

This is related to https://github.com/hillu/go-yara/pull/137, which tries to deal with these issues with an original signal handler (installed by the Golang runtime) versus YARA's signal handler. Thanks to @plusvic for initially identifying the problems with a process wide signal handler, and to @hillu for discussing possible solutions.

vthib commented 11 months ago

Looks nice! Would have allowed us to catch https://github.com/VirusTotal/yara/pull/2016 much more easily.

Not entirely related to your changes, but I wonder if there is actually a reason to catch SIGSEGV?

From my understanding, this signal handler is to handle issues that can arise when reading from an mmap'ed memory region. But afaik, those issues can only trigger SIGBUS signals (for example, mmap'ing a file, shortening it, then reading past the new end of the file), and are not supposed to trigger SIGSEGV. Are there expected scenarios where reading from a mmap region raises a SIGSEGV signal?

secDre4mer commented 11 months ago

Looks nice! Would have allowed us to catch #2016 much more easily.

Not entirely related to your changes, but I wonder if there is actually a reason to catch SIGSEGV?

From my understanding, this signal handler is to handle issues that can arise when reading from an mmap'ed memory region. But afaik, those issues can only trigger SIGBUS signals (for example, mmap'ing a file, shortening it, then reading past the new end of the file), and are not supposed to trigger SIGSEGV. Are there expected scenarios where reading from a mmap region raises a SIGSEGV signal?

SIGSEGV is caught, I believe, because it's used on FreeBSD / OpenBSD instead of SIGBUS in the scenario you described, see issue #551. I agree though that catching SIGSEGV on other POSIX systems is unnecessary. Maybe we can add a differentiation by OS so we only catch the signal that is necessary. Similarly, I don't think catching EXCEPTION_ACCESS_VIOLATION (which is the SIGSEGV equivalent) on windows is necessary - EXCEPTION_IN_PAGE_ERROR (the SIGBUS equivalent) should be sufficient.