Open nuta opened 4 years ago
Thanks for the wonderful explanation! I went through the paper and I came up with the following roadmap:
NUM_BYTES
shadow
elements as unallocated
shadow
in malloc
, free
and memcpy
(and maybe some other function in string.h
) corresponding to the state of that 8-byte sequence. Changing malloc
and free
would take care of sanitizing the heap part. What about the stack?
Changing malloc and free would take care of sanitizing the heap part. What about the stack?
Good point. We need to update shadow
for already initialized memory areas (.text, .rodata, stack, ...) before enabling the KASAN runtime.
FYI, the stack area can be located by __stack
and __stack_end
symbols defined in the linker script and they can be used as global variables:
extern char __stack[];
extern char __stack_end[];
Regarding NUM_BYTES
, use a large enough hard-coded value (say 512KiB) for now.
I started off with trying to write a simple implementation for the heap.
void shadow_chunk(struct malloc_chunk *chunk)
{
//Let's assume this function will take a chunk and mark appropriate places in shadow. For example -
paddr_t ptr_next = chunk->next;
for(size_t i = 0; i < 4; i++, ptr_next++)
{
shadow[ptr_next>>3] = SHADOW_UNADDRESSABLE;
}
}
Will a chunk's address be a perfect multiple of 8 bytes? If that is not the case, I have to be very careful with updating the values of shadow[addr>>3]
.
If it is so, every variable in the structure will have addresses multiples of 8 (obviously) and we can have different shadow values corresponding to different variables.
#define SHADOW_NEXT_PTR -1
#define SHADOW_CAPACITY -2
#define SHADOW_SIZE -3
#define SHADOW_MAGIC -4
#define SHADOW_UNDERFLOW_REDZONE -5
#define SHADOW_DATA -6
Will a chunk's address be a perfect multiple of 8 bytes? If that is not the case, I have to be very careful with updating the values of shadow[addr>>3].
Yes. It shall be aligned to 16 bytes since SSE instructions require the alignment.
By the way, how about stop doing >> 3
? I haven't checked but I believe typical servers and apps on Resea consume only 1MiB or less so I think we don't need to save memory.
As we're having bits in every shadow array element, they're more than sufficient to store all the possible states of the corresponding 8 bytes, including our custom defined invalid-address states if required. I believe this shouldn't add much to the complexity/simplicity.
Whatever you agree with, do tell me and I'll proceed that way
I've reread the ASan wiki and finally I understood how we can have multiple states.
By the way, how about stop doing >> 3? I haven't checked but I believe typical servers and apps on Resea consume only 1MiB or less so I think we don't need to save memory.
Please ignore this my comment. malloc
guarantees that addresses are 16-bytes aligned so your proposal looks feasible to me.
Thanks for the green signal :)
I have a couple of points -
Introducing these flags in the Makefile is causing build errors. Please tell me if I did something wrong here.
I think we will need > 512 KB space for shadow
, unless we smartly introduce a relative offset. From what I see right now, memory addresses of chunks are in the order of 6 x 10^7. Dividing it by 8 is roughly 8 x 10^6. So should we go for around 8MB for a start? This is huge right? Seems like I am thinking in the wrong direction
Perhaps we need to map the regions of the heap, stack etc. appropriately into shadow memory. I went through libs/resea/arch and from what I understand the starting position of heap region in the memory is dependent on the system architecture. Does it mean that we need different implementations for different architectures?
Please ignore all the messages above except for the flags point (#1). I figured out that using __heap
and __heap_end
will do the job.
Does it mean that we need different implementations for different architectures?
No we don't have to. The starting address of the heap region, etc. are arch-dependent as you noticed, but they're defined in the linker script (example). Thus you can determine the addresses as an extern symbol:
extern char __heap[];
extern char __heap_end[];
vaddr_t heap_start(void) {
return (vaddr_t) __heap;
}
Yes, @nuta. I figured that out and wrote a simple implementation for the heap as well. It was fairly straightforward as everything was properly aligned to 8-bytes. I wrote these functions in malloc.c. These functions shadow_malloc
and shadow_free
are invoked in malloc
and free
respectively. I did check memcpy
but I am not sure what exactly is to be done there - Should I copy the shadow value of the src
address to dst
address? Are these also aligned to 8-bytes?
Also, it'd really help if you can throw some light on the stack part, or redirect me to the right files to go through and understand :smile:
I did check
memcpy
but I am not sure what exactly is to be done there - Should I copy the shadow value of thesrc
address todst
address? Are these also aligned to 8-bytes?
No. A good example is a string:
a = "sample string";
b = malloc(8);
memcpy(&b[1], a, 1);
I guess only b[1]
become accessible while the current shadow memory implementation cannot handle this situation, right? Hmm, I have no idea for now. We need to take a look at existing ASan runtime implementations.
Also, it'd really help if you can throw some light on the stack part, or redirect me to the right files to go through and understand 😄
As the first step, let's ignore stack variable accesses unless it causes a problem for now (IIRC we can turn it off by compiler options).
I guess only
b[1]
become accessible while the current shadow memory implementation cannot handle this situation, right? Hmm, I have no idea for now. We need to take a look at existing ASan runtime implementations.
As of the current implementation, only the data region will be accessible. What is the memory region succeeded by? We do not have an overflow red-zone in malloc
right?
Should we initialize shadow
with -1 on startup? This might help handle the situation.
I took help from this page and the OP-TEE's library to implement the hook functions. I have integrated it and everything seemed good until one of the assertions started to fail -
[kernel] kernel/task.c:343: ASSERTION FAILURE: CURRENT->pager != NULL
[kernel] WARN: Backtrace:
[kernel] WARN: #0: ffff800000304848 handle_page_fault()+0x508
[kernel] WARN: #1: ffff80000030c741 x64_handle_interrupt()+0x1d1
[kernel] WARN: #2: ffff80000030e360 interrupt_common()+0x30
[kernel] WARN: #3: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN: #4: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN: #5: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN: #6: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN: #7: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN: #8: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN: #9: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN: #10: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN: #11: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN: #12: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
[kernel] WARN: #13: 000000000100a49f long_mode_in_low_address()+0xf0a32f
[kernel] WARN: #14: 000000000100d178 long_mode_in_low_address()+0xf0d008
[kernel] WARN: #15: 000000000100ca2e long_mode_in_low_address()+0xf0c8be
I have realized that the problem occurs when I try to access addr
variable in check_address()
function. Is some page fault detection mechanism interfering?
As of the current implementation, only the data region will be accessible. What is the memory region succeeded by? We do not have an overflow red-zone in malloc right?
We have overflow redzones in malloc (underflow_redzone
and overflow_redzone
).
Should we initialize shadow with -1 on startup? This might help handle the situation.
What does -1
stands for in your implementation? SHADOW_NEXT_PTR
?
I have realized that the problem occurs when I try to access addr variable in check_address() function. Is some page fault detection mechanism interfering?
That's caused by accessing an invalid memory access in the first task (i.e. servers/vm
). By the way, where is check_address
?
We have overflow redzones in malloc (
underflow_redzone
andoverflow_redzone
).
overflow_redzone
is commented out when I saw last
What does
-1
stands for in your implementation?SHADOW_NEXT_PTR
?
A negative value means that it is unaddressable. -1 just means unaddressable. I've set SHADOW_NEXT_PTR
and others from -2 onwards.
That's caused by accessing an invalid memory access in the first task (i.e.
servers/vm
). By the way, where ischeck_address
?
I was referring to my implementation. Do take a glance if possible. I have also updated malloc accordingly.
Thanks for sharing. I'll have a look at later!
overflow_redzone is commented out when I saw last
It is commented out because the length of data[]
is arbitrary. The overflow redzone is filled with MALLOC_REDZONE_OVRFLOW_MARKER
.
Thanks for your time :). I'll try to still figure out myself why this paging error occurs.
FWIW, here's a tip: if I were you, I'd debug unexpected page faults occurred in vm server in following steps:
kernel/task.c:343
.addr2line -e build/vm.debug.elf <IP>
. It should print the source location where the page fault occurred.Hope it help.
Thanks for the tip! I did exactly that and this reveals the following -
printf
/ vprintf
, depending on what I write in check_address
mov
instructions like - mov %rdx,-0x58(%rbp)
and push %r13
(I think this is the instruction where the function variables are pushed to the stack)It didn't make a lot of sense to me, at least not yet. check_address
is not performing memory access at all. It just checks shadow
. (I removed all DBG
/PANIC
statements also but didn't help)
The strange thing is that this error occurs even if I don't access addr
. Just to check, I wrote one DBG
statement and nothing else in check_address
and still, the same error occurs.
I am investigating further. If something clicks after reading above, please do suggest.
Might be not related to that bug, I noticed some pitfalls in your source code:
shadow
is declared as a global variable. It should be extern
.__asan_*
is called when it accesses the shadow memory? I guess you need a boolean variable (something like bool kasan_temporarily_disabled
) to ignore such memory accesses.
- In kasan.h,
shadow
is declared as a global variable. It should beextern
.
Thanks for pointing. I fixed that for the time being by changing user.ld
for x86 like:
. = 0x04000000;
__zeroed_pages = .;
.bss : ALIGN(0x1000) {
__heap = .;
. += 0x100000;
__heap_end = .;
. = ALIGN(0x1000);
__shadow = .;
. += 0x24000;
__shadow_end = .;
. = ALIGN(0x1000);
__bss = .;
*(.bss);
*(.bss.*);
__bss_end = .;
} :bss
This didn't fix the problem though. After lots of experimentation, I feel this is due to something external as even keeping ONLY a DBG
statement is inducing a page fault.
- What if
__asan_*
is called when it accesses the shadow memory? I guess you need a boolean variable (something likebool kasan_temporarily_disabled
) to ignore such memory accesses.
From what I could understand, the paper suggests marking the shadow memory region as "bad". I am not fully sure why they do that. Will the compiler also insert hook functions for shadow memory accesses?
Will the compiler also insert hook functions for shadow memory accesses?
I checked the OP-TEE's implementation but it seems we don't have to care about that since they do nothing for that, like using a function attribute. Just ignore that comment.
Kernel Address Sanitizer (KASAN) is a runtime memory error (e.g. use-after-free) checker. While it is "kernel" address sanitizer, we can use it in the userspace.
Briefly speaking, when KASAN is enabled, the compiler inserts code to call hook functions (
__asan_store8_noabort
) before each memory access (e.g.*ptr = 1;
). KASAN runtime (what we need to implement) is responsible for tracking how each memory bytes are valid.You don't need to implement as described in the paper. Just use it as memory access hooks.
Enabling KASAN
Add the following compiler options to
$CFLAGS
:Implementation Plan
Implement the KASAN runtime in
libs/common
.Furthermore, you need to update
shadow
inmalloc
,free
, and functions written in assembly likememcpy
.Good References