VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License

Sporadic crash when scanning file with libyara on AIX #1199

Open Codehardt opened 4 years ago

Codehardt commented 4 years ago

Hi,

when scanning a file with libyara 3.11.0 on AIX 7.2 PPC64 POWER7 I sporadically get one of the following two assertion failures:

Assertion failed: __EX, file  exec.c, line 1331

or

Assertion failed: __EX, file  object.c, line 410

The assertions can be found here:
https://github.com/VirusTotal/yara/blob/v3.11.0/libyara/exec.c#L1331
https://github.com/VirusTotal/yara/blob/v3.11.0/libyara/object.c#L410

It seems to be completely random. I would say the first failure is raised roughly every 30th run, the second roughly every 5th run.

I can't even say which rule triggers that assertion failure. The same ruleset is working fine on Windows, Linux and macOS.

Anyone here who knows what causes the above assertions to fail?

Libyara was compiled with:

CFLAGS="-maix64" OBJECT_MODE=64 ./configure --disable-shared --disable-magic --disable-cuckoo --enable-dotnet

OpenSSL Version 1.0.2r

ulimits:

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) unlimited
pipe size            (512 bytes, -p) 64
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited

Available memory: svmon -G -O unit=MB

Unit: MB
--------------------------------------------------------------------------------------
               size       inuse        free         pin     virtual  available  mmode
memory      7648,00     1876,96     5771,04     1571,56     1641,38    5756,04     Ded
pg space     512,00        8,19

               work        pers        clnt       other
pin         1062,41           0        16,3      492,81
in use      1641,38           0      235,59

Kind regards,

Marcel

plusvic commented 4 years ago

This is really weird because the two assertions are unrelated, so my guess is that some kind of memory corruption is happening. A memory dump could be helpful in this case, if you can get one. There are a few things we need to look at:

1) The value of the ip pointer (https://github.com/VirusTotal/yara/blob/v3.11.0/libyara/exec.c#L263). Is it a stray pointer, or is it pointing to actual VM code?

2) If ip seems to be pointing to actual VM code, what's the value of the opcode?
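
The first check above amounts to a range test. A minimal sketch in C, assuming the compiled VM code occupies one contiguous buffer whose start and size are known; `ip_points_into_code`, `code_start` and `code_size` are hypothetical names for illustration, not libyara identifiers:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libyara: returns true when ip lies
   inside the half-open range [code_start, code_start + code_size). */
static bool ip_points_into_code(
    const uint8_t* ip, const uint8_t* code_start, size_t code_size)
{
  /* Compare as integers: relational comparison of pointers into
     different objects is undefined, and a stray ip may point anywhere. */
  uintptr_t p = (uintptr_t) ip;
  uintptr_t lo = (uintptr_t) code_start;

  return p >= lo && p - lo < code_size;
}
```

If the test passes, the second check is simply to read the byte at `*ip` and compare it with the known opcode values.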

Codehardt commented 4 years ago

A memory dump could be helpful in this case, if you can get one.

I recompiled YARA with the following two additional flags:

--enable-debug --disable-optimization

Then I used yarac to compile this free signature set (https://github.com/Neo23x0/signature-base):

yarac -d filename="" -d filepath="" -d extension="" -d filetype="" /opt/signature-base/yara/*.yar all.yac

And used all.yac to scan a file:

yara -C all.yac c57cb8bb5996c484a4001625217e02ec
Assertion failed: __EX, file  object.c, line 410
IOT/Abort trap (core dumped)

This created a core dump that I was able to open with gdb.

I haven't worked much with gdb yet, so please let me know if you need more output:

(gdb) backtrace
#0  0x09000000005c5884 in pthread_kill () from /usr/lib/libpthreads.a(shr_xpg5_64.o)
#1  0x09000000005c50c8 in _p_raise () from /usr/lib/libpthreads.a(shr_xpg5_64.o)
#2  0x090000000003faac in raise () from /usr/lib/libc.a(shr_64.o)
#3  0x090000000005e4dc in abort () from /usr/lib/libc.a(shr_64.o)
#4  0x09000000000f3334 in __assert_c99 () from /usr/lib/libc.a(shr_64.o)
#5  0x00000001000109d4 in yr_object_lookup_field (object=0x113443d30, field_name=0x113004149 "serial") at object.c:410
#6  0x000000010006d9c0 in yr_execute_code (context=0x110025690) at exec.c:576
#7  0x0000000100066288 in yr_scanner_scan_mem_blocks (scanner=0x110025690, iterator=0xffffffffffff350) at scanner.c:441
#8  0x0000000100066760 in yr_scanner_scan_mem (scanner=0x110025690, buffer=0xa00000000000000 <error: Cannot access memory at address 0xa00000000000000>, buffer_size=369880) at scanner.c:563
#9  0x0000000100066804 in yr_scanner_scan_file (scanner=0x110025690, filename=0xffffffffffffbf0 "c57cb8bb5996c484a4001625217e02ec") at scanner.c:577
#10 0x0000000100003528 in main (argc=2, argv=0xffffffffffffae0) at yara.c:1296

I then moved up the stack to frame #5:

(gdb) frame
#0  0x09000000005c5884 in pthread_kill () from /usr/lib/libpthreads.a(shr_xpg5_64.o)
(gdb) up
#1  0x09000000005c50c8 in _p_raise () from /usr/lib/libpthreads.a(shr_xpg5_64.o)
(gdb) up
#2  0x090000000003faac in raise () from /usr/lib/libc.a(shr_64.o)
(gdb) up
#3  0x090000000005e4dc in abort () from /usr/lib/libc.a(shr_64.o)
(gdb) up
#4  0x09000000000f3334 in __assert_c99 () from /usr/lib/libc.a(shr_64.o)
(gdb) up
#5  0x00000001000109d4 in yr_object_lookup_field (object=0x113443d30, field_name=0x113004149 "serial") at object.c:410

And printed the object:

(gdb) p object
$1 = (YR_OBJECT *) 0x113443d30
(gdb) p *object
$4 = {canary = 0, type = 0 '\000', identifier = 0x0, parent = 0x0, data = 0x0, value = {i = 0, d = 0, p = 0x0, o = 0x0, s = 0x0, ss = 0x0, re = 0x0}}
(gdb) up
#6  0x000000010006d9c0 in yr_execute_code (context=0x110025690) at exec.c:576
(gdb) p r1
$5 = {i = 4618206512, d = 2.2816971829795305e-314, p = 0x113443d30, o = 0x113443d30, s = 0x113443d30, ss = 0x113443d30, re = 0x113443d30}

The other assertion failure (Assertion failed: __EX, file exec.c, line 1331) wasn't reproducible, but maybe both assertions are triggered by the same cause.

Thanks.

Codehardt commented 4 years ago

I was able to find a rule that causes that assertion failure:

yara APT_APT41_RevokedCert_Aug19_1.yar c57cb8bb5996c484a4001625217e02ec
Assertion failed: __EX, file  object.c, line 410
IOT/Abort trap (core dumped)

APT_APT41_RevokedCert_Aug19_1.yar:

import "pe"

rule APT_APT41_RevokedCert_Aug19_1 {
   meta:
      description = "Detects revoked certificates used by APT41 group"
      author = "Florian Roth"
      reference = "https://www.fireeye.com/blog/threat-research/2019/08/apt41-dual-espionage-and-cyber-crime-operation.html"
      date = "2019-08-07"
      score = 60
   condition:
      uint16(0) == 0x5a4d and
      for any i in (0 .. pe.number_of_signatures) : (
         pe.signatures[i].serial == "0b:72:79:06:8b:eb:15:ff:e8:06:0d:2c:56:15:3c:35" or
         pe.signatures[i].serial == "63:66:a9:ac:97:df:4d:e1:73:66:94:3c:9b:29:1a:aa" or
         pe.signatures[i].serial == "01:00:00:00:00:01:30:73:85:f7:02" or
         pe.signatures[i].serial == "14:0d:2c:51:5e:8e:e9:73:9b:b5:f1:b2:63:7d:c4:78" or
         pe.signatures[i].serial == "7b:d5:58:18:c5:97:1b:63:dc:45:cf:57:cb:eb:95:0b" or
         pe.signatures[i].serial == "53:0c:e1:4c:81:f3:62:10:a1:68:2a:ff:17:9e:25:80" or
         pe.signatures[i].serial == "54:c6:c1:40:6f:b4:ac:b5:d2:06:74:e9:93:92:c6:3e" or
         pe.signatures[i].serial == "fd:f2:83:7d:ac:12:b7:bb:30:ad:05:8f:99:9e:cf:00" or
         pe.signatures[i].serial == "18:63:79:57:5a:31:46:e2:6b:ef:c9:0a:58:0d:1b:d2" or
         pe.signatures[i].serial == "5c:2f:97:a3:1a:bc:32:b0:8c:ac:01:00:59:8f:32:f6" or
         pe.signatures[i].serial == "4c:0b:2e:9d:2e:f9:09:d1:52:70:d4:dd:7f:a5:a4:a5" or
         pe.signatures[i].serial == "58:01:5a:cd:50:1f:c9:c3:44:26:4e:ac:e2:ce:57:30" or
         pe.signatures[i].serial == "47:6b:f2:4a:4b:1e:9f:4b:c2:a6:1b:15:21:15:e1:fe" or
         pe.signatures[i].serial == "30:d3:c1:67:26:5b:52:0c:b8:7f:25:84:4f:95:cb:04" or
         pe.signatures[i].serial == "1e:52:bb:f5:c9:0e:c1:64:d0:5b:e0:e4:16:61:52:5f" or
         pe.signatures[i].serial == "25:f8:78:22:de:56:d3:98:21:59:28:73:ea:09:ca:37" or
         pe.signatures[i].serial == "67:24:34:0d:db:c7:25:2f:7f:b7:14:b8:12:a5:c0:4d"
      )
}

plusvic commented 4 years ago

$4 = {canary = 0, type = 0 '\000', identifier = 0x0, parent = 0x0, data = 0x0, value = {i = 0, d = 0, p = 0x0, o = 0x0, s = 0x0, ss = 0x0, re = 0x0}} indicates that the pointer to the object is wrong; it seems to be pointing to a memory area full of zeroes.

My guess is that somehow, during the execution of YARA's VM code, the ip pointer ends up with a wrong value. This can lead either to a wrong opcode (which would trigger the assertion in exec.c) or to a wrong object pointer (which would trigger the assertion in object.c).
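
To illustrate why an object pointer into zero-filled memory fails a sanity check, here is a toy model in C. The `toy_object` struct, the tag values and `toy_lookup_field` are hypothetical and only mimic the idea of a type-tagged object; they are not the libyara definitions. The sketch returns NULL instead of asserting so that the outcome can be observed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tags: 0 is deliberately not a valid type, so an object
   read from zero-filled memory can never pass the check below. */
enum toy_type { TOY_INVALID = 0, TOY_STRUCTURE = 1, TOY_STRING = 2 };

typedef struct toy_object {
  enum toy_type type;
  const char* identifier;
  struct toy_object* fields[4]; /* children, used by TOY_STRUCTURE only */
  size_t num_fields;
} toy_object;

/* Returns the named child, or NULL when object is not a structure or
   has no such field. */
static toy_object* toy_lookup_field(toy_object* object, const char* name)
{
  if (object == NULL || object->type != TOY_STRUCTURE)
    return NULL; /* a debug build might assert here instead */

  for (size_t i = 0; i < object->num_fields; i++)
    if (strcmp(object->fields[i]->identifier, name) == 0)
      return object->fields[i];

  return NULL;
}
```

In this model, a lookup of "serial" on a correctly built structure succeeds, while the same lookup on a zeroed object is rejected by the type check, which is the kind of failure the dump of *object above suggests.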

Can you navigate to stack frame #6 and dump the value of the ip and identifiers variables?

plusvic commented 4 years ago

Another question, with...

yara APT_APT41_RevokedCert_Aug19_1.yar c57cb8bb5996c484a4001625217e02ec

...does the error always occur, or does it still happen randomly?

Codehardt commented 4 years ago

Can you navigate to stack frame #6 and dump the value of the ip and identifiers variables?

(gdb) p ip
$1 = (const uint8_t *) 0x11002e293 "\021\r"
(gdb) p *ip
$3 = 17 '\021'
(gdb) p identifier
$2 = 0x113004149 "serial"

Another question, with...

yara APT_APT41_RevokedCert_Aug19_1.yar c57cb8bb5996c484a4001625217e02ec

...does the error always occur, or does it still happen randomly?

Yes, I forgot to mention that this command always triggers the assertion failure.

It was sporadic when using the cgo bindings for Golang (https://github.com/hillu/go-yara), but I think we can ignore this for now.

plusvic commented 4 years ago

The ip pointer seems to be right, and the opcode is correct. Can you share the core dump with me? I'm not sure if I'm going to be able to read it with gdb on my machine, but let's try.

Codehardt commented 4 years ago

Let's try: http://codehardt.de/core.gz

plusvic commented 4 years ago

I forgot that I need the executable binary for loading the core dump in gdb. Can you share the yara binary as well?

Codehardt commented 4 years ago

Of course, you're right: http://codehardt.de/yara.gz

plusvic commented 4 years ago

No luck, gdb complains with:

Symbol format aix5coff64-rs6000 unknown

This is going to be painful 😣

plusvic commented 4 years ago

My first recommendation is to start editing the rule and reduce it to the bare minimum that reproduces the issue. For example, start by removing the "uint16(0) == 0x5a4d and" line, and then remove lines from the condition inside the loop. The idea is to reduce the VM code as much as possible.
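
The reduction described above can also be done mechanically. A minimal sketch in C of a greedy, one-line-at-a-time reduction, assuming a caller-supplied predicate that reports whether a candidate subset of lines still reproduces the failure (in practice that would mean writing the candidate rule to disk and running yara on it). `reduce_lines`, `still_fails_fn` and the demo predicate `fails_while_line_kept` are hypothetical names, not part of any YARA tool:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate: keep[i] says whether line i is still part of
   the candidate rule. Must return false for candidates that no longer
   compile, so that such a removal is undone. */
typedef bool (*still_fails_fn)(const bool* keep, size_t n, void* ctx);

/* Greedy reduction: drop one line at a time and keep the drop only if
   the failure still reproduces. Repeats until a full pass removes
   nothing. Returns the number of lines left. */
static size_t reduce_lines(
    bool* keep, size_t n, still_fails_fn still_fails, void* ctx)
{
  bool changed = true;

  while (changed) {
    changed = false;
    for (size_t i = 0; i < n; i++) {
      if (!keep[i])
        continue;
      keep[i] = false;
      if (still_fails(keep, n, ctx))
        changed = true;   /* line i was not needed */
      else
        keep[i] = true;   /* line i is needed, restore it */
    }
  }

  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    kept += keep[i] ? 1 : 0;
  return kept;
}

/* Demo predicate only: pretends the failure reproduces exactly while
   the line whose index is stored in ctx is kept. */
static bool fails_while_line_kept(const bool* keep, size_t n, void* ctx)
{
  size_t culprit = *(const size_t*) ctx;
  return culprit < n && keep[culprit];
}
```

With five candidate lines and the demo predicate pointing at line 3, the loop leaves only that line. A real predicate is slower and noisier, so a sporadic failure would need several runs per candidate before a line is declared unnecessary.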

plusvic commented 4 years ago

Another useful test is trying with the most recent code, and with one or two prior versions.

Codehardt commented 4 years ago

My first recommendation is to start editing the rule and reduce it to the bare minimum that reproduces the issue. For example, start by removing the "uint16(0) == 0x5a4d and" line, and then remove lines from the condition inside the loop. The idea is to reduce the VM code as much as possible.

The most minimal rule that I was able to create is:

import "pe"

rule PEAlgo {
   condition:
      pe.signatures[0].algorithm == ""
}

With another rule I was able to find out that pe.number_of_signatures equals 1.

Another useful test is trying with the most recent code, and with one or two prior versions.

Same behavior with YARA 3.9.0, YARA 3.10.0 and YARA Master (1eb1ff7).

plusvic commented 4 years ago

The next step is to make sure that the pointer returned by yr_object_array_get_item in https://github.com/VirusTotal/yara/blob/b9f925bb4e2b998bd6bb2f2e3cc2087c62fdd5b9/libyara/exec.c#L630

is the same one you get in https://github.com/VirusTotal/yara/blob/b9f925bb4e2b998bd6bb2f2e3cc2087c62fdd5b9/libyara/exec.c#L576

We want to verify that the index lookup pe.signatures[0] returns the same object that later fails when we try to access its ".algorithm" member.

You can simply add a printf after yr_object_array_get_item.
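
A minimal sketch of such a trace line in C. `format_ptr` and `trace_ptr` are hypothetical helpers, not libyara functions. They print the address through `uintptr_t` because the exact text produced by `%p` is implementation-defined and may differ between AIX and other platforms.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes "label = <hex address>" into buf and
   returns what snprintf returns. */
static int format_ptr(
    char* buf, size_t size, const char* label, const void* ptr)
{
  return snprintf(buf, size, "%s = %" PRIxPTR, label, (uintptr_t) ptr);
}

/* Hypothetical helper: prints one trace line to stderr, which is not
   fully buffered, so the line is more likely to be visible if the
   process aborts shortly afterwards. */
static void trace_ptr(const char* label, const void* ptr)
{
  char line[128];

  format_ptr(line, sizeof line, label, ptr);
  fprintf(stderr, "%s\n", line);
}
```

One call placed right after the array access and one at the start of the field lookup give two lines that can be compared directly.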

Codehardt commented 4 years ago

yr_object_lookup_field = 11004ed70
yr_object_array_get_item = 11005e630

Seems like they are not addressing the same object 😕

plusvic commented 4 years ago

Notice that yr_object_lookup_field should be called twice for this rule:

rule PEAlgo {
   condition:
      pe.signatures[0].algorithm == ""
}

The second call is the interesting one. Can you confirm that the second call to yr_object_lookup_field receives the pointer returned by yr_object_array_get_item? If that's the case, the problem is stranger than expected.
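
To make the expected call sequence concrete, here is a toy model in C of evaluating pe.signatures[0].algorithm. All types and functions (`node`, `lookup_field`, `array_get_item`, `eval`) are hypothetical stand-ins, not libyara code. The point is only that the first field lookup receives the module object, while the second should receive whatever the array access returned.

```c
#include <stddef.h>
#include <string.h>

typedef struct node {
  const char* name;
  struct node* children[2];
  size_t num_children;
} node;

/* Records the object argument of every field lookup, in call order. */
static const node* lookup_args[4];
static size_t lookup_calls = 0;

static node* lookup_field(node* object, const char* name)
{
  if (lookup_calls < 4)
    lookup_args[lookup_calls] = object;
  lookup_calls++;

  for (size_t i = 0; i < object->num_children; i++)
    if (strcmp(object->children[i]->name, name) == 0)
      return object->children[i];

  return NULL;
}

static node* array_get_item(node* array, size_t index)
{
  return index < array->num_children ? array->children[index] : NULL;
}

/* Evaluates the toy equivalent of pe.signatures[0].algorithm and
   reports the element returned by the array access through item_out. */
static node* eval(node* pe, node** item_out)
{
  node* signatures = lookup_field(pe, "signatures"); /* call 1 */
  if (signatures == NULL)
    return NULL;

  node* item = array_get_item(signatures, 0);
  *item_out = item;
  if (item == NULL)
    return NULL;

  return lookup_field(item, "algorithm"); /* call 2 */
}
```

In this model the first recorded argument is the module object and the second equals the array item. If the single yr_object_lookup_field value printed earlier came from the first call, a mismatch with yr_object_array_get_item would be expected and would not by itself indicate corruption.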

plusvic commented 4 years ago

@Codehardt any new findings?