tannal / ohmywork


SegFault, page table, os memory management #25

Open tannal opened 3 months ago

tannal commented 3 months ago

Do it by hand

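A simple program that triggers a segmentation fault:

int main() {
    return *(int*)(0);
}

Run it under gdb, stop at main, and look up the pid:

gdb ./a.out
start
info proc

In a second terminal, attach stackcount to that pid (16780 in this session) to count the kernel stacks that reach the *sig_fault* functions:

sudo stackcount-bpfcc -i 2 -p 16780 --debug "*sig_fault*"

Then enter c in gdb to continue into the fault.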
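The same fault can also be observed from user space. The sketch below is my own illustration, not part of the original experiment: it installs a SIGSEGV handler with SA_SIGINFO and reads the faulting address from si_addr. The names probe_read and on_segv are made up for this example, and it assumes a POSIX system such as Linux.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf recover_point;        /* where to resume after a fault */
static void *volatile last_fault_addr;  /* address reported by the kernel */

/* SIGSEGV handler: remember the faulting address, then jump back. */
static void on_segv(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    last_fault_addr = info->si_addr;
    siglongjmp(recover_point, 1);
}

/* Tries to read *p.
 * Returns 0 if the read succeeded, 1 if it faulted (the faulting address
 * is stored in *fault_addr), or -1 if the handler could not be installed. */
int probe_read(const volatile int *p, void **fault_addr)
{
    struct sigaction sa, old;
    int faulted = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(recover_point, 1) == 0) {
        (void)*p;                       /* volatile read: may fault */
    } else {
        faulted = 1;                    /* we arrived here from on_segv */
        *fault_addr = last_fault_addr;
    }

    sigaction(SIGSEGV, &old, NULL);     /* restore the previous handler */
    return faulted;
}
```

Calling probe_read on a null pointer should return 1 with the fault address set to 0, which is the address the program above dereferences. Jumping out of a signal handler is only reasonable in a small experiment like this one.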


tannal commented 3 months ago

Page Table is a Tree

The kernel writes the page table; the MMU reads it.

How does the kernel map a virtual address?

  1. Use the virtual address to find the page table entry (pte is a pointer to that entry in the page table).
  2. Set *pte = physical | other bits, where physical is the page-aligned physical address and the other bits are flags such as valid and writable.

On a system with an MMU, when the CPU accesses an address, the MMU walks the page table to find the physical address. If the page table entry is invalid, the MMU raises a page fault.
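The two roles can be sketched as a toy two-level page table. This is a simulation under my own assumptions, not kernel code: a 32-bit virtual address split 10/10/12, and made-up names (pagedir_t, walk, map_page, translate, PTE_VALID). map_page plays the kernel (writer) and translate plays the MMU (reader).

```c
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12
#define ENTRIES    1024        /* 10 index bits per level */
#define PTE_VALID  0x1u        /* hypothetical "valid" bit */
#define PTE_WRITE  0x2u        /* hypothetical "writable" bit */

typedef uint32_t pte_t;

/* Root of the tree: each slot points to a second-level table, or is NULL. */
typedef struct {
    pte_t *tables[ENTRIES];
} pagedir_t;

/* Step 1: use the virtual address to find its PTE.
 * If create is nonzero, a missing second-level table is allocated. */
static pte_t *walk(pagedir_t *dir, uint32_t va, int create)
{
    uint32_t top = va >> 22;
    uint32_t mid = (va >> PAGE_SHIFT) & (ENTRIES - 1);

    if (dir->tables[top] == NULL) {
        if (!create)
            return NULL;
        dir->tables[top] = calloc(ENTRIES, sizeof(pte_t));
        if (dir->tables[top] == NULL)
            return NULL;
    }
    return &dir->tables[top][mid];
}

/* Kernel side (writer). Step 2: *pte = physical | other bits. */
int map_page(pagedir_t *dir, uint32_t va, uint32_t pa, uint32_t flags)
{
    pte_t *pte = walk(dir, va, 1);
    if (pte == NULL)
        return -1;
    *pte = (pa & ~0xFFFu) | flags | PTE_VALID;
    return 0;
}

/* MMU side (reader). Returns 0 and fills *pa on success,
 * or -1 to model a page fault on a missing or invalid entry. */
int translate(pagedir_t *dir, uint32_t va, uint32_t *pa)
{
    pte_t *pte = walk(dir, va, 0);
    if (pte == NULL || !(*pte & PTE_VALID))
        return -1;
    *pa = (*pte & ~0xFFFu) | (va & 0xFFFu);
    return 0;
}
```

With a zeroed pagedir_t, translate fails for every address, address 0 included. After map_page(dir, 0x00400000, 0x0009A000, PTE_WRITE), translating 0x00400123 yields 0x0009A123.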

static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
                   struct pt_regs *regs)

Here far is the value of the Fault Address Register and esr is the value of the Exception Syndrome Register (both are arm64 registers). far holds the virtual address that faulted, and esr encodes the reason for the fault.
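To make "the reason for the fault" concrete, here is a minimal sketch of pulling the relevant fields out of an esr value. It assumes the arm64 ESR_ELx layout (exception class in bits 31:26 and, for data aborts, the write-not-read flag in bit 6 and the fault status code in bits 5:0). The struct and function names are my own and are not the kernel's.

```c
#include <stdint.h>

/* Fields of interest in an arm64 ESR value (names are illustrative). */
typedef struct {
    unsigned ec;        /* exception class, bits 31:26 */
    unsigned is_write;  /* WnR, bit 6: 1 = write, 0 = read (data aborts) */
    unsigned fsc;       /* fault status code, bits 5:0 */
} esr_fields;

esr_fields decode_esr(uint64_t esr)
{
    esr_fields f;
    f.ec       = (unsigned)((esr >> 26) & 0x3Fu);
    f.is_write = (unsigned)((esr >> 6) & 0x1u);
    f.fsc      = (unsigned)(esr & 0x3Fu);
    return f;
}

/* EC 0x24: data abort from a lower exception level (user space).
 * EC 0x25: data abort taken without a change of exception level. */
int is_data_abort(unsigned ec)
{
    return ec == 0x24u || ec == 0x25u;
}

/* Translation faults (no valid entry at some level of the walk) are
 * encoded as 0b0001LL, where LL is the page table level. */
int is_translation_fault(unsigned fsc)
{
    return (fsc & 0x3Cu) == 0x04u;
}
```

For example, a user-space write to an unmapped page whose walk fails at level 3 would decode as ec = 0x24, is_write = 1, fsc = 0x07.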

tannal commented 3 months ago

Kernel Page Table Isolation.

Switching page tables is expensive. Context switches are expensive. System calls are expensive.
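One rough way to see the system call cost is to time a trivial call in a loop. The sketch below assumes Linux and uses syscall(SYS_getpid) so that every iteration really enters the kernel. The function name ns_per_getpid is mine, and the result depends heavily on the CPU, the kernel version, and whether page table isolation is enabled.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Returns the average wall-clock cost, in nanoseconds, of one getpid
 * system call, measured over `iterations` calls (must be > 0). */
double ns_per_getpid(long iterations)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getpid);           /* bypass any libc-level caching */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double ns = (double)(end.tv_sec - start.tv_sec) * 1e9
              + (double)(end.tv_nsec - start.tv_nsec);
    return ns / (double)iterations;
}
```

Comparing the number from a normal boot with one from a boot that has page table isolation disabled gives a feel for how much the extra page table switch costs on a given machine.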

sparc, m68k

tannal commented 3 months ago

While the kernel is booting, paging is initially disabled. Before enabling it, the kernel must first map the start of memory one to one (virtual address equal to physical address). Otherwise the CPU faults as soon as paging is turned on, because the code it is executing has no mapping to physical memory.

The kernel then maps the rest of itself after the user memory and does a long jump into the kernel's virtual address range.

Take Linux as an example:


static struct addr_marker address_markers[] = {
#ifdef CONFIG_KASAN
    { KASAN_SHADOW_START,   "Kasan shadow start"},
    { KASAN_SHADOW_END, "Kasan shadow end"},
#endif
    { MODULES_VADDR,    "Modules" },
    { PAGE_OFFSET,      "Kernel Mapping" },
    { 0,            "vmalloc() Area" },
    { FDT_FIXED_BASE,   "FDT Area" },
    { FIXADDR_START,    "Fixmap Area" },
    { VECTORS_BASE, "Vectors" },
    { VECTORS_BASE + PAGE_SIZE * 2, "Vectors End" },
    { -1,           NULL },
};


tannal commented 3 months ago

Memory Allocator

void *alloc(size_t size)

void free(void *ptr)

Cost of finding a free block: bitmap O(n), linked list (free list) O(1)
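The difference between the two costs shows up in a small page allocator sketch. Everything here is illustrative (a fixed pool of 16 pages, made-up function names): the bitmap version scans for a clear bit, while the free-list version pops the head of a list threaded through the free pages themselves, in the style of xv6's kalloc.

```c
#include <stddef.h>
#include <stdint.h>

#define NPAGES    16
#define PAGE_SIZE 4096

/* Bitmap allocator: one bit per page, 1 = in use. */
static uint8_t bitmap[NPAGES / 8];

/* O(n): scan for the first clear bit. Returns a page index or -1. */
int bitmap_alloc(void)
{
    for (int i = 0; i < NPAGES; i++) {
        if (!(bitmap[i / 8] & (1u << (i % 8)))) {
            bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            return i;
        }
    }
    return -1;
}

void bitmap_free(int i)
{
    bitmap[i / 8] &= (uint8_t)~(1u << (i % 8));
}

/* Free-list allocator: each free page stores the link to the next one. */
struct run {
    struct run *next;
};

static struct run *freelist;
static _Alignas(16) unsigned char pool[NPAGES][PAGE_SIZE];

/* O(1): push the page onto the head of the list. */
void freelist_free(void *page)
{
    struct run *r = page;
    r->next = freelist;
    freelist = r;
}

/* O(1): pop the head of the list. Returns NULL when out of pages. */
void *freelist_alloc(void)
{
    struct run *r = freelist;
    if (r != NULL)
        freelist = r->next;
    return r;
}

/* Hand every page in the pool to the free list. */
void freelist_init(void)
{
    for (int i = 0; i < NPAGES; i++)
        freelist_free(pool[i]);
}
```

Neither version is safe to call from several CPUs at once. A real allocator needs a lock around these operations or separate per-CPU lists.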


parallel?

Real-world memory allocators

jemalloc, originally written by Jason Evans for FreeBSD and later developed at Facebook

bmalloc for WebKit by Filip Pizlo and others.

tannal commented 3 months ago

How does the kernel create new processes?

exec("/bin/ls", argv)
  1. load executable to memory
  2. map the kernel pages to the new process address space
  3. map the user pages to the new process address space
  4. allocate program stack
  5. push program arguments on the stack
  6. switch page tables
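Step 5 is the least obvious of these, so here is a minimal sketch of it, loosely modeled on what xv6's exec does. It works on a plain byte array instead of a real user stack and stores offsets into that array instead of user-space pointers. The name push_args and the limit MAX_ARGS are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ARGS 16

/* Copies the argv strings to the top of `stack` (the stack grows down),
 * then pushes a table of their offsets terminated by 0.
 * Returns the final stack offset ("sp"), which is where the offset table
 * starts, or -1 if there are too many arguments or too little room. */
long push_args(uint8_t *stack, size_t stack_size,
               char *const argv[], int *argc_out)
{
    size_t sp = stack_size;
    size_t offsets[MAX_ARGS + 1];
    int argc = 0;

    for (; argv[argc] != NULL; argc++) {
        if (argc >= MAX_ARGS)
            return -1;
        size_t len = strlen(argv[argc]) + 1;   /* include the '\0' */
        if (len > sp)
            return -1;
        sp = (sp - len) & ~(size_t)7;          /* keep 8-byte alignment */
        memcpy(stack + sp, argv[argc], len);
        offsets[argc] = sp;
    }
    offsets[argc] = 0;                         /* stands in for NULL */

    size_t table = (size_t)(argc + 1) * sizeof(size_t);
    if (table > sp)
        return -1;
    sp -= table;
    memcpy(stack + sp, offsets, table);

    *argc_out = argc;
    return (long)sp;
}
```

For argv = {"ls", "-l", NULL}, the strings end up at the top of the buffer and the returned offset points at a three-entry table: the location of "ls", the location of "-l", and the terminating 0.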


xv6

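// Allocate page tables and physical memory to grow a process from oldsz to newsz.
int
allocuvm(pde_t *pgdir, uint oldsz, uint newsz)

// Load a program segment from inode ip into pgdir; the pages at addr must already be mapped.
int
loaduvm(pde_t *pgdir, char *addr, struct inode *ip, uint offset, uint sz)

tannal commented 2 months ago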

There is an excellent tool called pagemon, made by a kernel developer: https://github.com/ColinIanKing/pagemon/ With root permission you can inspect all the pages of a process and watch their bits change in real time, which is awesome.
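On Linux, per-page information of this kind is exposed through /proc/<pid>/pagemap, which holds one 64-bit entry per virtual page with bit 63 meaning "present in RAM". I am assuming that is the interface such a tool builds on. The sketch below only checks the calling process and uses a made-up function name, page_present.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the page containing addr is present in RAM, 0 if it is not,
 * or -1 if /proc/self/pagemap could not be read. */
int page_present(const void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    uint64_t entry = 0;

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    off_t offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size)
                 * (off_t)sizeof(entry);
    ssize_t n = pread(fd, &entry, sizeof(entry), offset);
    close(fd);

    if (n != (ssize_t)sizeof(entry))
        return -1;
    return (int)((entry >> 63) & 1u);   /* bit 63: page present */
}
```

Reading your own pagemap works without root, but the physical frame number in the low bits is zeroed for unprivileged readers, which is one reason inspecting pages in full needs root.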