nuta / resea

A microkernel-based hackable operating system.
Other
522 stars 29 forks source link

Introduce Kernel Address Sanitizer #29

Open nuta opened 4 years ago

nuta commented 4 years ago

Kernel Address Sanitizer (KASAN) is a runtime memory error (e.g. use-after-free) checker. While it is "kernel" address sanitizer, we can use it in the userspace.

Briefly speaking, when KASAN is enabled, the compiler inserts code to call hook functions (__asan_store8_noabort) before each memory access (e.g. *ptr = 1;). KASAN runtime (what we need to implement) is responsible for tracking how each memory bytes are valid.

You don't need to implement as described in the paper. Just use it as memory access hooks.

Enabling KASAN

Add the following compiler options to $CFLAGS:

--target=x86_64-pc-linux-elf
-fsanitize=undefined,kernel-address
-mllvm -asan-instrumentation-with-call-threshold=0
-mllvm -asan-globals=false
-mllvm -asan-stack=false
-mllvm -asan-use-after-return=false
-mllvm -asan-use-after-scope=false

Implementation Plan

Implement the KASAN runtime in libs/common.

// Stores the current state of the each memory bytes.
uint8_t shadow[NUM_BYTES /* .bss size + .data size + heap size */];

#ifdef KERNEL
// We don't support KASan in kernel space for now.
void __asan_load8_noabort(vaddr_t addr) {
}
#else
void __asan_load8_noabort(vaddr_t addr) {
    if (!shadow[addr]) {
        PANIC("ASan: detected an invalid access to %p", addr);
    }
}
#endif

Furthermore, you need to update shadow in malloc, free, and functions written in assembly like memcpy.

Good References

yashrajkakkad commented 4 years ago

Thanks for the wonderful explanation! I went through the paper and I came up with the following roadmap:

Changing malloc and free would take care of sanitizing the heap part. What about the stack?

nuta commented 4 years ago

Changing malloc and free would take care of sanitizing the heap part. What about the stack?

Good point. We need to update shadow for already initialized memory areas (.text, .rodata, stack, ...) before enabling the KASAN runtime.

FYI, the stack area can be located by __stack and __stack_end symbols defined in the linker script and they can be used as global variables:

extern char __stack[];
extern char __stack_end[];
nuta commented 4 years ago

Regarding NUM_BYTES, use a large enough hard-coded value (say 512KiB) for now.

yashrajkakkad commented 3 years ago

I started off with trying to write a simple implementation for the heap.

void shadow_chunk(struct malloc_chunk *chunk)
{
//Let's assume this function will take a chunk and mark appropriate places in shadow. For example - 
        paddr_t ptr_next = chunk->next;
        for(size_t i = 0; i < 4; i++, ptr_next++)
        {
            shadow[ptr_next>>3] = SHADOW_UNADDRESSABLE;
        }

}

Will a chunk's address be a perfect multiple of 8 bytes? If that is not the case, I have to be very careful with updating the values of shadow[addr>>3].

If it is so, every variable in the structure will have addresses multiples of 8 (obviously) and we can have different shadow values corresponding to different variables.

#define SHADOW_NEXT_PTR -1
#define SHADOW_CAPACITY -2
#define SHADOW_SIZE -3
#define SHADOW_MAGIC -4
#define SHADOW_UNDERFLOW_REDZONE -5
#define SHADOW_DATA -6
nuta commented 3 years ago

Will a chunk's address be a perfect multiple of 8 bytes? If that is not the case, I have to be very careful with updating the values of shadow[addr>>3].

Yes. It shall be aligned to 16 bytes since SSE instructions require the alignment.

By the way, how about stop doing >> 3? I haven't checked but I believe typical servers and apps on Resea consume only 1MiB or less so I think we don't need to save memory.

yashrajkakkad commented 3 years ago

As we're having bits in every shadow array element, they're more than sufficient to store all the possible states of the corresponding 8 bytes, including our custom defined invalid-address states if required. I believe this shouldn't add much to the complexity/simplicity.

Whatever you agree with, do tell me and I'll proceed that way

nuta commented 3 years ago

I've reread the ASan wiki and finally I understood how we can have multiple states.

By the way, how about stop doing >> 3? I haven't checked but I believe typical servers and apps on Resea consume only 1MiB or less so I think we don't need to save memory.

Please ignore this my comment. malloc guarantees that addresses are 16-bytes aligned so your proposal looks feasible to me.

yashrajkakkad commented 3 years ago

Thanks for the green signal :)

yashrajkakkad commented 3 years ago

I have a couple of points -

  1. Introducing these flags in the Makefile is causing build errors. Please tell me if I did something wrong here.

  2. I think we will need > 512 KB space for shadow, unless we smartly introduce a relative offset. From what I see right now, memory addresses of chunks are in the order of 6 x 10^7. Dividing it by 8 is roughly 8 x 10^6. So should we go for around 8MB for a start? This is huge right? Seems like I am thinking in the wrong direction

yashrajkakkad commented 3 years ago

Perhaps we need to map the regions of the heap, stack etc. appropriately into shadow memory. I went through libs/resea/arch and from what I understand the starting position of heap region in the memory is dependent on the system architecture. Does it mean that we need different implementations for different architectures?

yashrajkakkad commented 3 years ago

Please ignore all the messages above except for the flags point (#1). I figured out that using __heap and __heap_end will do the job.

nuta commented 3 years ago

Does it mean that we need different implementations for different architectures?

No we don't have to. The starting address of the heap region, etc. are arch-dependent as you noticed, but they're defined in the linker script (example). Thus you can determine the addresses as an extern symbol:

extern char __heap[];
extern char __heap_end[];

vaddr_t heap_start(void) {
    return (vaddr_t) __heap;
}
yashrajkakkad commented 3 years ago

Yes, @nuta. I figured that out and wrote a simple implementation for the heap as well. It was fairly straightforward as everything was properly aligned to 8-bytes. I wrote these functions in malloc.c. These functions shadow_malloc and shadow_free are invoked in malloc and free respectively. I did check memcpy but I am not sure what exactly is to be done there - Should I copy the shadow value of the src address to dst address? Are these also aligned to 8-bytes?

Also, it'd really help if you can throw some light on the stack part, or redirect me to the right files to go through and understand :smile:

nuta commented 3 years ago

I did check memcpy but I am not sure what exactly is to be done there - Should I copy the shadow value of the src address to dst address? Are these also aligned to 8-bytes?

No. A good example is a string:

a = "sample string";
b = malloc(8);
memcpy(&b[1], a, 1);

I guess only b[1] become accessible while the current shadow memory implementation cannot handle this situation, right? Hmm, I have no idea for now. We need to take a look at existing ASan runtime implementations.

Also, it'd really help if you can throw some light on the stack part, or redirect me to the right files to go through and understand 😄

As the first step, let's ignore stack variable accesses unless it causes a problem for now (IIRC we can turn it off by compiler options).

yashrajkakkad commented 3 years ago

I guess only b[1] become accessible while the current shadow memory implementation cannot handle this situation, right? Hmm, I have no idea for now. We need to take a look at existing ASan runtime implementations.

As of the current implementation, only the data region will be accessible. What is the memory region succeeded by? We do not have an overflow red-zone in malloc right?

Should we initialize shadow with -1 on startup? This might help handle the situation.

yashrajkakkad commented 3 years ago

I took help from this page and the OP-TEE's library to implement the hook functions. I have integrated it and everything seemed good until one of the assertions started to fail -

[kernel] kernel/task.c:343: ASSERTION FAILURE: CURRENT->pager != NULL
[kernel] WARN: Backtrace:
[kernel] WARN:     #0: ffff800000304848 handle_page_fault()+0x508
[kernel] WARN:     #1: ffff80000030c741 x64_handle_interrupt()+0x1d1
[kernel] WARN:     #2: ffff80000030e360 interrupt_common()+0x30
[kernel] WARN:     #3: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN:     #4: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN:     #5: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN:     #6: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN:     #7: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN:     #8: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN:     #9: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN:     #10: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN:     #11: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN:     #12: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN:     #13: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN:     #14: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN:     #15: 000000000100ca2e long_mode_in_low_address()+0xf0c8be

I have realized that the problem occurs when I try to access addr variable in check_address() function. Is some page fault detection mechanism interfering?

nuta commented 3 years ago

As of the current implementation, only the data region will be accessible. What is the memory region succeeded by? We do not have an overflow red-zone in malloc right?

We have overflow redzones in malloc (underflow_redzone and overflow_redzone).

Should we initialize shadow with -1 on startup? This might help handle the situation.

What does -1 stands for in your implementation? SHADOW_NEXT_PTR?

I have realized that the problem occurs when I try to access addr variable in check_address() function. Is some page fault detection mechanism interfering?

That's caused by accessing an invalid memory access in the first task (i.e. servers/vm). By the way, where is check_address?

yashrajkakkad commented 3 years ago

We have overflow redzones in malloc (underflow_redzone and overflow_redzone).

overflow_redzone is commented out when I saw last

What does -1 stands for in your implementation? SHADOW_NEXT_PTR?

A negative value means that it is unaddressable. -1 just means unaddressable. I've set SHADOW_NEXT_PTR and others from -2 onwards.

That's caused by accessing an invalid memory access in the first task (i.e. servers/vm). By the way, where is check_address?

I was referring to my implementation. Do take a glance if possible. I have also updated malloc accordingly.

nuta commented 3 years ago

Thanks for sharing. I'll have a look at later!

overflow_redzone is commented out when I saw last

It is commented out because the length of data[] is arbitrary. The overflow redzone is filled with MALLOC_REDZONE_OVRFLOW_MARKER.

yashrajkakkad commented 3 years ago

Thanks for your time :). I'll try to still figure out myself why this paging error occurs.

nuta commented 3 years ago

FWIW, here's a tip: if I were you, I'd debug unexpected page faults occurred in vm server in following steps:

  1. Print IP and the inaccessible memory address in kernel/task.c:343.
  2. Run addr2line -e build/vm.debug.elf <IP>. It should print the source location where the page fault occurred.
  3. If needed, disassemble the ELF file using radare2 or objdump to investigate the root cause.

Hope it help.

yashrajkakkad commented 3 years ago

Thanks for the tip! I did exactly that and this reveals the following -

It didn't make a lot of sense to me, at least not yet. check_address is not performing memory access at all. It just checks shadow. (I removed all DBG/PANIC statements also but didn't help)

The strange thing is that this error occurs even if I don't access addr. Just to check, I wrote one DBG statement and nothing else in check_address and still, the same error occurs.

I am investigating further. If something clicks after reading above, please do suggest.

nuta commented 3 years ago

Might be not related to that bug, I noticed some pitfalls in your source code:

yashrajkakkad commented 3 years ago
  • In kasan.h, shadow is declared as a global variable. It should be extern.

Thanks for pointing. I fixed that for the time being by changing user.ld for x86 like:

    . = 0x04000000;
    __zeroed_pages = .;
    .bss : ALIGN(0x1000) {
        __heap = .;
        . += 0x100000;
        __heap_end = .;

        . = ALIGN(0x1000);
        __shadow = .;
        . += 0x24000;
        __shadow_end = .;

        . = ALIGN(0x1000);
        __bss = .;
        *(.bss);
        *(.bss.*);
        __bss_end = .;

    } :bss

This didn't fix the problem though. After lots of experimentation, I feel this is due to something external as even keeping ONLY a DBG statement is inducing a page fault.

  • What if __asan_* is called when it accesses the shadow memory? I guess you need a boolean variable (something like bool kasan_temporarily_disabled) to ignore such memory accesses.

From what I could understand, the paper suggests marking the shadow memory region as "bad". I am not fully sure why they do that. Will the compiler also insert hook functions for shadow memory accesses?

nuta commented 3 years ago

Will the compiler also insert hook functions for shadow memory accesses?

I checked the OP-TEE's implementation but it seems we don't have to care about that since they do nothing for that, like using a function attribute. Just ignore that comment.