tklengyel / drakvuf

DRAKVUF Black-box Binary Analysis
https://drakvuf.com

[LIBHOOK] makes the xen virtual machine hang #1740

Open carttam opened 8 months ago

carttam commented 8 months ago

Hi, I ran DRAKVUF with the procmon and apimon plugins on a Windows 7 SP1 virtual machine, using a malware sample that I found on MalwareBazaar. A long while after the default browser (IE) opened, the Xen virtual machine hung and froze, and even the xl destroy command did not work, so I had to kill the QEMU process to force it to stop. xl list result:

Name                                        ID   Mem VCPUs  State   Time(s)
Domain-0                                     0  8191     4     r-----    4045.7
(null)                                      21    20     2     --p--d     218.6
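
For readers less familiar with the State column, here is a minimal Python sketch that decodes it. The flag letters are the ones documented in the xl(1) manual page; the function names and the `SAMPLE` string are illustrative only and are not part of xl or DRAKVUF.

```python
# Flag letters as documented in the xl(1) manual page for the State column.
XL_STATE_FLAGS = {
    "r": "running",
    "b": "blocked",
    "p": "paused",
    "s": "shutdown",
    "c": "crashed",
    "d": "dying",
}


def decode_xl_state(state):
    """Translate an `xl list` State string such as '--p--d' into flag names."""
    return [XL_STATE_FLAGS[ch] for ch in state if ch in XL_STATE_FLAGS]


def parse_xl_list(text):
    """Parse `xl list` output into one dict per domain (header line skipped)."""
    domains = []
    for line in text.strip().splitlines()[1:]:
        # Split from the right so the five trailing columns are always found.
        name, domid, mem, vcpus, state, seconds = line.rsplit(None, 5)
        domains.append({
            "name": name,
            "id": int(domid),
            "mem": int(mem),
            "vcpus": int(vcpus),
            "flags": decode_xl_state(state),
            "time": float(seconds),
        })
    return domains


# The output pasted above, used here as sample input.
SAMPLE = """\
Name                                        ID   Mem VCPUs  State   Time(s)
Domain-0                                     0  8191     4     r-----    4045.7
(null)                                      21    20     2     --p--d     218.6
"""

for dom in parse_xl_list(SAMPLE):
    print(dom["name"], dom["id"], dom["flags"])
```

For domain 21 this reports the flags paused and dying, which suggests the domain was being torn down but the teardown had not completed.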

Here are the stderr logs from around the time the malware was executed, for both runs: trace1 trace2

1699362400.234058 [LIBHOOK] creating return hook
1699362400.234193 Breakpoint VA 0x7fef4fb5075 -> PA 0x6d454075
1699362400.234300 Copied trapped page to new location
1699362400.234318 Activating remapped gfns in the altp2m views!
1699362400.234395       Trap added @ PA 0x6d454075 RPA 0xff373075 Page 447572 for GetSystemMetrics.
1699362400.234416 [LIBHOOK] return hook OK
1699362400.234431 Switching altp2m and to singlestep on vcpu 1
1699362400.234593       Trap added @ PA 0x27059d3 RPA 0xff00e9d3 Page 9989 for NtProtectVirtualMemory ret.
1699362400.234649 Switching altp2m and to singlestep on vcpu 2
1699362400.234826 [LIBHOOK] destroying return hook...
1699362400.234839 Removing breakpoint trap from 0x6d454075.
1699362400.234919 Removed memtrap for GFN 0x6d454 in altp2m view 1
1699362400.234932 Removed memtrap for GFN 0xff373 in altp2m view 1
1699362400.235305 [USERHOOK] DLL 928!7feff000000 is already hooked
1699362400.235328 Removing breakpoint trap from 0x27059d3.
1699362400.238128 [LIBHOOK] creating return hook
1699362400.238208 Breakpoint VA 0x7fefd7333d0 -> PA 0x2911c3d0
1699362400.238363 Copied trapped page to new location
1699362400.238395 Activating remapped gfns in the altp2m views!
1699362400.238511       Trap added @ PA 0x2911c3d0 RPA 0xff1f73d0 Page 168220 for LdrGetProcedureAddress.
1699362400.238549 [LIBHOOK] return hook OK
1699362400.238569 Switching altp2m and to singlestep on vcpu 2
1699362400.238701 Pre mem cb with vCPU 2 @ 0x2becb0c4 in view 1: r--
1699362400.238735 Switching to altp2m view 0 on vCPU 2 and waiting for post_mem cb
1699362400.238876 Post mem cb @ 0x2becb0c4 vCPU 2 altp2m 0
1699362400.239177 [LIBHOOK] destroying return hook...
1699362400.239261 Removing breakpoint trap from 0x2911c3d0.
1699362400.239352 Removed memtrap for GFN 0x2911c in altp2m view 1
1699362400.239376 Removed memtrap for GFN 0xff1f7 in altp2m view 1
1699362400.241604       Trap added @ PA 0x27059d3 RPA 0xff00e9d3 Page 9989 for NtProtectVirtualMemory ret.
1699362400.241659 Switching altp2m and to singlestep on vcpu 2
1699362400.241762 Switching altp2m and to singlestep on vcpu 1
1699362400.242005 [USERHOOK] DLL 928!7feff000000 is already hooked
1699362400.242029 Removing breakpoint trap from 0x27059d3.
1699362400.242275 [LIBHOOK] creating return hook
1699362400.242380 Breakpoint VA 0x7fef6f8b8ca -> PA 0x130c18ca
1699362400.242444 Copied trapped page to new location
1699362400.242458 Activating remapped gfns in the altp2m views!
1699362400.242534       Trap added @ PA 0x130c18ca RPA 0xff29c8ca Page 78017 for RegOpenKeyExA.
1699362400.242564 [LIBHOOK] return hook OK
1699362400.242577 Switching altp2m and to singlestep on vcpu 2
1699362400.242706 [LIBHOOK] creating return hook
1699362400.242807 Breakpoint VA 0x7feff61d6c3 -> PA 0x214416c3
1699362400.242878       Trap added @ PA 0x214416c3 RPA 0xff2016c3 Page 136257 for RegOpenKeyExA.
1699362400.242907 [LIBHOOK] return hook OK
1699362400.242916 [LIBHOOK] destroying return hook...
1699362400.242928 Removing breakpoint trap from 0x130c18ca.
1699362400.242985 Removed memtrap for GFN 0x130c1 in altp2m view 1
1699362400.243007 Removed memtrap for GFN 0xff29c in altp2m view 1
1699362400.243025 Switching altp2m and to singlestep on vcpu 2
1699362400.243056 [LIBHOOK] creating return hook
1699362400.243103 Breakpoint VA 0x7fef6f8b98e -> PA 0x130c198e
1699362400.243153 Copied trapped page to new location
1699362400.243171 Activating remapped gfns in the altp2m views!
1699362400.243240       Trap added @ PA 0x130c198e RPA 0xff29c98e Page 78017 for RegOpenKeyExW.
1699362400.243260 [LIBHOOK] return hook OK
1699362400.243271 Switching altp2m and to singlestep on vcpu 1
1699362400.244567 Pre mem cb with vCPU 2 @ 0x2becb110 in view 1: r--
1699362400.244583 Switching to altp2m view 0 on vCPU 2 and waiting for post_mem cb
1699362400.244711 Post mem cb @ 0x2becb110 vCPU 2 altp2m 0
1699362400.245242 [LIBHOOK] destroying return hook...
1699362400.245269 Removing breakpoint trap from 0x214416c3.
1699362400.245401 [LIBHOOK] creating return hook
1699362400.245481 Breakpoint VA 0x7feff042bc5 -> PA 0x175c7bc5
1699362400.245537 Copied trapped page to new location
1699362400.245554 Activating remapped gfns in the altp2m views!
1699362400.245626       Trap added @ PA 0x175c7bc5 RPA 0xff308bc5 Page 95687 for RegQueryValueExA.
1699362400.245655 [LIBHOOK] return hook OK
1699362400.245668 Switching altp2m and to singlestep on vcpu 2
1699362400.245783 [LIBHOOK] creating return hook
1699362400.245846 Breakpoint VA 0x7feff62407d -> PA 0x2853807d
1699362400.245929       Trap added @ PA 0x2853807d RPA 0xff20207d Page 165176 for RegQueryValueExA.
1699362400.245952 [LIBHOOK] return hook OK
1699362400.245960 [LIBHOOK] destroying return hook...
1699362400.245971 Removing breakpoint trap from 0x175c7bc5.
1699362400.246024 Removed memtrap for GFN 0x175c7 in altp2m view 1
1699362400.246039 Removed memtrap for GFN 0xff308 in altp2m view 1
1699362400.246063 Switching altp2m and to singlestep on vcpu 2
1699367808.333386 [LIBHOOK] creating return hook
1699367808.333455 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.333520 Copied trapped page to new location
1699367808.333547 Activating remapped gfns in the altp2m views!
1699367808.333615       Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.333654 [LIBHOOK] return hook OK
1699367808.333675 Switching altp2m and to singlestep on vcpu 0
1699367808.333684 Pre mem cb with vCPU 1 @ 0x42812404 in view 1: r--
1699367808.333699 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.333822 [LIBHOOK] destroying return hook...
1699367808.333852 Removing breakpoint trap from 0x749bb7fe.
1699367808.333927 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.333958 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.334010 Post mem cb @ 0x42812404 vCPU 1 altp2m 0
1699367808.334727 [LIBHOOK] creating return hook
1699367808.334834 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.334925 Copied trapped page to new location
1699367808.334953 Activating remapped gfns in the altp2m views!
1699367808.335023       Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.335055 [LIBHOOK] return hook OK
1699367808.335079 Switching altp2m and to singlestep on vcpu 0
1699367808.335088 Pre mem cb with vCPU 1 @ 0x42812408 in view 1: r--
1699367808.335105 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.335219 [LIBHOOK] destroying return hook...
1699367808.335241 Removing breakpoint trap from 0x749bb7fe.
1699367808.335318 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.335341 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.335384 Post mem cb @ 0x42812408 vCPU 1 altp2m 0
1699367808.336017 [LIBHOOK] creating return hook
1699367808.336130 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.336221 Copied trapped page to new location
1699367808.336254 Activating remapped gfns in the altp2m views!
1699367808.336338       Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.336378 [LIBHOOK] return hook OK
1699367808.336424 Switching altp2m and to singlestep on vcpu 0
1699367808.336610 [LIBHOOK] destroying return hook...
1699367808.336647 Removing breakpoint trap from 0x749bb7fe.
1699367808.336720 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.336753 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.337614 [LIBHOOK] creating return hook
1699367808.337686 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.337756 Copied trapped page to new location
1699367808.337784 Activating remapped gfns in the altp2m views!
1699367808.337854       Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.337895 [LIBHOOK] return hook OK
1699367808.337908 Switching altp2m and to singlestep on vcpu 0
1699367808.337917 Pre mem cb with vCPU 1 @ 0x4281240c in view 1: r--
1699367808.337934 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.338123 [LIBHOOK] destroying return hook...
1699367808.338150 Removing breakpoint trap from 0x749bb7fe.
1699367808.338228 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.338255 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.338295 Post mem cb @ 0x4281240c vCPU 1 altp2m 0
1699367808.339208 [LIBHOOK] creating return hook
1699367808.339279 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.339346 Copied trapped page to new location
1699367808.339373 Activating remapped gfns in the altp2m views!
1699367808.339452       Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.339483 [LIBHOOK] return hook OK
1699367808.339495 Switching altp2m and to singlestep on vcpu 0
1699367808.339621 [LIBHOOK] destroying return hook...
1699367808.339653 Removing breakpoint trap from 0x749bb7fe.
1699367808.339716 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.339747 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.340169 [LIBHOOK] creating return hook
1699367808.340238 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.340288 Copied trapped page to new location
1699367808.340296 Activating remapped gfns in the altp2m views!
1699367808.340366       Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.340396 [LIBHOOK] return hook OK
1699367808.340407 Switching altp2m and to singlestep on vcpu 0
1699367808.340430 Pre mem cb with vCPU 1 @ 0x42812410 in view 1: r--
1699367808.340446 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.340558 [LIBHOOK] destroying return hook...
1699367808.340589 Removing breakpoint trap from 0x749bb7fe.
1699367808.340657 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.340688 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.340722 Post mem cb @ 0x42812410 vCPU 1 altp2m 0
1699367808.341184 [LIBHOOK] creating return hook
1699367808.341252 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.341313 Copied trapped page to new location
1699367808.341340 Activating remapped gfns in the altp2m views!
1699367808.341425       Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.341455 [LIBHOOK] return hook OK
1699367808.341466 Switching altp2m and to singlestep on vcpu 0
1699367808.341475 Pre mem cb with vCPU 1 @ 0x42812414 in view 1: r--
1699367808.341491 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.341620 Post mem cb @ 0x42812414 vCPU 1 altp2m 0
1699367808.341895 Pre mem cb with vCPU 1 @ 0x42812418 in view 1: r--
1699367808.341930 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342036 Post mem cb @ 0x42812418 vCPU 1 altp2m 0
1699367808.342155 Pre mem cb with vCPU 1 @ 0x4281241c in view 1: r--
1699367808.342190 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342303 Post mem cb @ 0x4281241c vCPU 1 altp2m 0
1699367808.342422 Pre mem cb with vCPU 1 @ 0x42812420 in view 1: r--
1699367808.342456 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342558 Post mem cb @ 0x42812420 vCPU 1 altp2m 0
1699367808.342690 Pre mem cb with vCPU 1 @ 0x42812424 in view 1: r--
1699367808.342725 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342778 [LIBHOOK] destroying return hook...
1699367808.342808 Removing breakpoint trap from 0x749bb7fe.
1699367808.342880 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.342907 Removed memtrap for GFN 0xff0c2 in altp2m view 1
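
One thing that can be checked mechanically in a log like this is whether every "Pre mem cb" line is followed by a matching "Post mem cb" line for the same vCPU and address. The Python sketch below does that. It assumes the debug line formats are exactly as they appear in the excerpt above (they are debug prints, not a stable interface), and the function name is made up for illustration. An unmatched entry may also just mean the excerpt was cut off at that point, so treat it as a hint rather than a diagnosis.

```python
import re

# Patterns copied from the debug lines in the excerpt above.
PRE_RE = re.compile(r"Pre mem cb with vCPU (\d+) @ (0x[0-9a-f]+)")
POST_RE = re.compile(r"Post mem cb @ (0x[0-9a-f]+) vCPU (\d+)")


def unmatched_pre_mem_cbs(lines):
    """Return {vcpu: address} for each 'Pre mem cb' with no later 'Post mem cb'."""
    pending = {}
    for line in lines:
        pre = PRE_RE.search(line)
        if pre:
            pending[int(pre.group(1))] = pre.group(2)
            continue
        post = POST_RE.search(line)
        if post:
            vcpu, addr = int(post.group(2)), post.group(1)
            # Only clear the entry if the post callback matches the pending address.
            if pending.get(vcpu) == addr:
                del pending[vcpu]
    return pending


# The last few memory-callback lines of the second trace, as sample input.
TAIL = """\
1699367808.342422 Pre mem cb with vCPU 1 @ 0x42812420 in view 1: r--
1699367808.342456 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342558 Post mem cb @ 0x42812420 vCPU 1 altp2m 0
1699367808.342690 Pre mem cb with vCPU 1 @ 0x42812424 in view 1: r--
1699367808.342725 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342778 [LIBHOOK] destroying return hook...
""".splitlines()

print(unmatched_pre_mem_cbs(TAIL))
```

On the tail of the second trace this leaves vCPU 1 waiting at 0x42812424: the pre callback is logged, the return hook is destroyed, and no post callback appears before the log ends.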

ubersandro commented 7 months ago

Hello, I am writing here because I was about to open a new issue, but we may have the same problem. I am experiencing domain freezes while running Codemon to monitor the whole userspace on Windows 10 20H1, and my output looks very similar to the one above. It happens only sometimes and, at the moment, I cannot reproduce the error reliably. I suppose there is some trouble in event management. My guess is that some event is not handled correctly because of a lack of atomicity when removing/adding events, and that the domain is suspended during singlestepping, but I have no idea how to verify that this is the case. Thanks in advance for the help, Alessandro
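
A cheap first check for an add/remove imbalance is to pair each "Trap added @ PA ..." line in a debug log like the one above with a later "Removing breakpoint trap from ..." line for the same physical address, and list what is still live when the log ends. The Python sketch below is only that: it assumes the line formats shown in the log above and at most one trap per physical address at a time, and the function name is hypothetical. A trap that is still live at the end is not necessarily a bug, since hooks are expected to be outstanding while the guest runs.

```python
import re

# Patterns copied from the debug lines in the log above.
ADDED_RE = re.compile(
    r"Trap added @ PA (0x[0-9a-f]+) RPA 0x[0-9a-f]+ Page \d+ for (.*?)\.?\s*$"
)
REMOVED_RE = re.compile(r"Removing breakpoint trap from (0x[0-9a-f]+)\.")


def live_traps(lines):
    """Return {physical address: hook name} for traps added but never removed."""
    live = {}
    for line in lines:
        added = ADDED_RE.search(line)
        if added:
            live[added.group(1)] = added.group(2)
            continue
        removed = REMOVED_RE.search(line)
        if removed:
            # Ignore removals for traps added before the excerpt starts.
            live.pop(removed.group(1), None)
    return live


# A few lines from the end of the first trace, as sample input.
EXCERPT = """\
1699362400.243240       Trap added @ PA 0x130c198e RPA 0xff29c98e Page 78017 for RegOpenKeyExW.
1699362400.245269 Removing breakpoint trap from 0x214416c3.
1699362400.245626       Trap added @ PA 0x175c7bc5 RPA 0xff308bc5 Page 95687 for RegQueryValueExA.
1699362400.245929       Trap added @ PA 0x2853807d RPA 0xff20207d Page 165176 for RegQueryValueExA.
1699362400.245971 Removing breakpoint trap from 0x175c7bc5.
""".splitlines()

for pa, name in live_traps(EXCERPT).items():
    print(pa, name)
```

Running the same function over a full trace, and comparing runs that hang with runs that do not, may help narrow down which hooks are outstanding at the moment of the freeze.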

tklengyel commented 7 months ago

Debugging that type of error is really difficult. What may help is to verify whether this is a new issue or whether you had the same problem with older versions. If it's an issue that only happens with a newer version, then some recent change might have broken the logic, which should be easier to fix. If it's happening with older versions as well, then the logic was already broken and it's much harder to figure out why.

ubersandro commented 7 months ago

OK, I would like to try to debug it, but I am not yet very proficient with Xen. As I was saying, my suspicion is that event management is somehow broken. Maybe by going through the vm_event interface I could figure out what makes my domU hang, by dumping events and checking which one is not handled by the libvmi+drakvuf+codemon stack. Alternatively, I could try to write a more concise stress test for memaccess events to understand what is wrong. Do you have any advice for me, Tamas?

carttam commented 7 months ago

> Debugging that type of error is really difficult. What may help is to verify whether this is a new issue or whether you had the same problem with older versions. If it's an issue that only happens with a newer version, then some recent change might have broken the logic, which should be easier to fix. If it's happening with older versions as well, then the logic was already broken and it's much harder to figure out why.

I tested version 1.0 and the problem was present there as well. With PRINT_DEBUG set, the output for the event callback and the event struct looked the same as on the previous times it was called. Despite many tests, I could not find any condition under which this error occurs. I only noticed that if a previous run froze after a ReturnHook in, for example, the chrome.exe process, and I then filter on that process alone (running Drakvuf with -C --context-process chrome.exe), Xen does not freeze. I also tried to provoke the same state by breaking the code, for example by changing the event output value, event->interrupt_event.reinject, or drakvuf->in_callback, but all of these crashed Drakvuf itself and Xen did not freeze. Has such a problem happened before? Do you know what could cause it? Thank you for your great project, I hope this can be solved.

carttam commented 7 months ago

At last, I was able to make Xen freeze right at the beginning of execution by commenting out these parts of the code:

https://github.com/tklengyel/drakvuf/blob/67477d0db574c78327be598501992f402ae8f67f/src/libdrakvuf/vmi.c#L1144-L1145

https://github.com/tklengyel/drakvuf/blob/67477d0db574c78327be598501992f402ae8f67f/src/libdrakvuf/vmi.c#L1515-L1525

Amnpardaz-Hypervisor commented 7 months ago

Hello, after many tests I realized that the problem arises from the vmi_slat_change_gfn call that changes the GFN to 0; I still don't know why this happens. In any case, using the vmi_set_mem_event function to change the access level to VMI_MEMACCESS_N instead solved the problem.

https://github.com/tklengyel/drakvuf/blob/1859dc9657e5ccab5ce925fe60980378544f2f88/src/libdrakvuf/vmi.c#L1184-L1198

https://github.com/tklengyel/drakvuf/blob/1859dc9657e5ccab5ce925fe60980378544f2f88/src/libdrakvuf/vmi.c#L1216-L1229

For example: vmi_set_mem_event(vmi, container->memaccess.gfn, VMI_MEMACCESS_N, drakvuf->altp2m_idx)

tklengyel commented 7 months ago

Yea, don't do that. That disables the core functionality of DRAKVUF and it makes the breakpoints detectable by the guest.

yuno-x commented 3 months ago

I always encounter the same problem when I use DRAKVUF's apimon plugin. The qemu-xen log shows the following memory-related error, and qemu-xen hangs. This happens with every recent version.

$ cat /var/log/xen/qemu-dm-*.log
VNC server running on :::5900
Locked DMA mapping while invalidating mapcache! 0000000000000eff -> 0x7f42f34f72e0 is present
qemu-system-i386: terminating on signal 1 from pid 24521 (xl)
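
To check whether other hangs show the same message, the qemu-dm logs can be scanned for it. The Python sketch below is a minimal example that matches only the exact wording shown above and returns the two hexadecimal values as integers. The function name is made up for illustration, and the sketch does not interpret what the two values mean.

```python
import re

# Matches the warning exactly as it appears in the log above.
MAPCACHE_RE = re.compile(
    r"Locked DMA mapping while invalidating mapcache! "
    r"([0-9a-fA-F]+) -> (0x[0-9a-fA-F]+) is present"
)


def find_locked_mappings(log_text):
    """Return (first value, second value) as ints for each warning in the log."""
    return [(int(a, 16), int(b, 16)) for a, b in MAPCACHE_RE.findall(log_text)]


# The log pasted above, used here as sample input.
LOG = """\
VNC server running on :::5900
Locked DMA mapping while invalidating mapcache! 0000000000000eff -> 0x7f42f34f72e0 is present
qemu-system-i386: terminating on signal 1 from pid 24521 (xl)
"""

for first, second in find_locked_mappings(LOG):
    print(hex(first), hex(second))
```

Collecting these pairs across several hung runs would show whether the same mapping is involved each time.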