IntelLabs / kAFL

A fuzzer for full VM kernel/driver targets
https://intellabs.github.io/kAFL/
MIT License

Memory Leak(?) problem with usermode harness #271

Open hyjun0407 opened 5 months ago

hyjun0407 commented 5 months ago

Hi, I looked at the kAFL usermode harness example and used the code below to fuzz Defender. After some time, QEMU gets shut down because the host runs out of memory (64 GB of physical RAM plus a 50 GB swap file).

This is quite weird. As far as I know, the fuzzer restores QEMU to the state it was in when the LOCK hypercall was issued and then runs again, so I assume it is not affected by calls such as freeing memory or FreeLibrary, because everything is restored to the point where I called the LOCK hypercall. Also, no payload uses more than 500 MB of memory when run on the host (my real computer).

Why is this happening? My code is below. It's strange because I did not see this memory problem when I tested the Windows driver just as an example. Am I doing something wrong?

My parameters were: kafl fuzz -workdir ./work --seed-dir ./seed --redqueen -p 24

I don't know why, but if I paste my code directly into this issue, the indentation gets messed up, so I pasted it to Pastebin.

Wenzel commented 5 months ago

Hi @hyjun0407 and thanks for your bug report,

I've never experienced a memory leak before, but I haven't used the harness extensively on Windows either.

Can you run QEMU under valgrind and see if you can locate the memory leak in the code?

hyjun0407 commented 5 months ago

Could you tell me what valgrind is and how I can use it to find the memory leak?

hyjun0407 commented 5 months ago

Also, there are a lot of memory leaks in the harness in the first place. For example, the payload buffer is never freed by the harness:

    kAFL_payload* payload_buffer = (kAFL_payload*)VirtualAlloc(0, host_config.payload_buffer_size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

The same goes for the range buffer:

    kAFL_ranges* range_buffer = (kAFL_ranges*)VirtualAlloc(NULL, 0x1000, MEM_COMMIT, PAGE_READWRITE);
    memset(range_buffer, 0xff, 0x1000);

Beyond that, the harness never unlocks the tracked section after locking it with VirtualLock, and so on; there are a lot of problems with the code. However, kAFL restores QEMU to the snapshot taken when LOCK was called, so I don't think it matters if I don't free or unlock memory. Am I wrong? After release, when it goes back to the snapshot, does the memory allocation stay the same?

hyjun0407 commented 4 months ago

I upgraded the RAM to 128 GB, but each QEMU instance still ends up with 15 GB+ of allocated memory.

[Screenshots: KakaoTalk_20240208_035519339, KakaoTalk_20240208_035556282]

I have read up on valgrind and was going to compile QEMU-Nyx for valgrind and try it, but since I need to check for leaks during fuzzing, I am wondering how I can run QEMU through valgrind when QEMU is launched by kAFL.

The DLL we are currently fuzzing is about 11 MB in size (I honestly don't think the size of the DLL is the cause), and we lock the .text section of that DLL with VirtualLock. On top of that, the command line of a running QEMU instance shows -m 4096 (4 GB of guest memory), so I'm not sure how usage can go up to 20 GB. (A scenario that would make sense is that memory is allocated up to 4 GB and then SetWorkingSetPage fails, rather than more than 4 GB being allocated.)

By the way, I was going to delete my code snippet once this issue is resolved, but I didn't realize you would post it separately. Could you please delete my code snippet from your reply?
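To put a number on the gap between -m 4096 and what each instance actually holds, the resident set size of a QEMU process can be read on the Linux host from /proc/<pid>/status. The following is only a minimal sketch under that assumption; the function names (parse_vmrss_kb, read_rss_kb, rss_exceeds_guest_ram) are made up for illustration and are not part of kAFL or QEMU-Nyx. Some overhead above the -m value is normal for any QEMU process, so the comparison is only a rough indicator.

```c
#include <stdio.h>
#include <string.h>

/* Parse the "VmRSS:" line (value in kB) out of the text of /proc/<pid>/status.
 * Returns the resident set size in kB, or -1 if no such line is found. */
long parse_vmrss_kb(const char *status_text)
{
    const char *p = status_text;
    while (p && *p) {
        long kb;
        if (strncmp(p, "VmRSS:", 6) == 0 && sscanf(p + 6, "%ld", &kb) == 1)
            return kb;
        p = strchr(p, '\n');   /* advance to the next line, if any */
        if (p)
            p++;
    }
    return -1;
}

/* Read /proc/<pid>/status (Linux only) and return VmRSS in kB, or -1 on error. */
long read_rss_kb(long pid)
{
    char path[64];
    char buf[8192];
    snprintf(path, sizeof path, "/proc/%ld/status", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_vmrss_kb(buf);
}

/* Nonzero if the measured RSS is larger than the guest RAM given to -m (in MiB). */
int rss_exceeds_guest_ram(long rss_kb, long guest_ram_mib)
{
    return rss_kb > guest_ram_mib * 1024L;
}
```

Calling read_rss_kb with the PID of one QEMU worker at intervals during a campaign, and passing the result to rss_exceeds_guest_ram with 4096, would show whether the growth is steady or happens in jumps.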

hyjun0407 commented 4 months ago

> Hi @hyjun0407 and thanks for your bug report,
>
> I've never experienced a memory leak before, but I haven't used the harness extensively on Windows either.
>
> > I don't know why, but if I paste my code directly into this issue, the indentation gets messed up, so I pasted it to Pastebin.
>
> Just put your code in a markdown code block, it will keep indentation and have syntax highlighting:
>
> Your snippet
>
> Can you run QEMU under valgrind and see if you can locate the memory leak in the code?

I would appreciate it if you could delete the snippet from this part.

hyjun0407 commented 4 months ago

https://github.com/IntelLabs/kAFL/issues/243#issuecomment-1757657680

That issue is also fundamentally caused by RAM being allocated well beyond the -m 10G chosen as the maximum, which makes slaves die once memory spills into the swap file. When memory utilization reaches 98%, memory gets paged (swapped) out and slaves are killed one by one until usage drops to 80%; then usage climbs back to 98% and another slave is killed. This loops until all workers are stalled.
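The memory pressure described in this thread can be checked with simple arithmetic. The sketch below uses figures reported earlier in the thread (-p 24 workers, -m 4096 per guest, roughly 15 GB observed per instance) and assumes a 128 GB host, which is one reading of the RAM upgrade mentioned above. The helper names are hypothetical and the units are treated loosely as GiB.

```c
#include <stdio.h>

/* Total host memory (GiB) needed if every worker holds per_instance_gib. */
unsigned long total_gib(unsigned workers, unsigned long per_instance_gib)
{
    return (unsigned long)workers * per_instance_gib;
}

/* Largest number of workers that fits into host RAM without touching swap. */
unsigned max_workers(unsigned long host_ram_gib, unsigned long per_instance_gib)
{
    if (per_instance_gib == 0)
        return 0;
    return (unsigned)(host_ram_gib / per_instance_gib);
}
```

With these numbers, total_gib(24, 4) gives 96, which fits in 128 GB, while total_gib(24, 15) gives 360, which exceeds RAM and the 50 GB swap file combined. max_workers(128, 15) gives 8, so lowering -p would only be a workaround until the growth itself is explained.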

Wenzel commented 4 months ago

@hyjun0407 I've removed your code snippet from my comment, but there is still code present in the multiple edits of your first message. If you don't want to publish your code, just say it upfront, no need to use a weird indentation excuse.

hyjun0407 commented 4 months ago

> @hyjun0407 I've removed your code snippet from my comment, but there is still code present in the multiple edits of your first message. If you don't want to publish your code, just say it upfront, no need to use a weird indentation excuse.

Whatever the reason, I apologize for the hassle. If you look at the first revision, you'll notice that it is not highlighted and is strangely indented. I didn't write the code on my own; I just posted it with my quick thoughts, and a coworker asked me to remove it. If it came across as rude, that was not my intention. I'm so sorry about it.

p0w1 commented 2 months ago

@hyjun0407 I had a similar problem with the memory consumption of certain VMs. Could you upload the included header files of your code to test it or send it privately if you prefer?