mikaku / Fiwix

A UNIX-like kernel for the i386 architecture
https://www.fiwix.org
Other
407 stars 32 forks source link

Fault on accessing minimal user stack with syscall #22

Closed rick-masters closed 1 year ago

rick-masters commented 1 year ago

If a user process tries to access a stack page for the first time using a system call, it may cause a page fault.

This problem was seen with the make program while building linux but the setup to reproduce the problem would be impractical, so I created a small test program. It took quite a bit of investigation and testing to isolate the specific problem.

The following code can be used to trigger the problem. Admittedly, this test is not something anyone would run into casually, but I do believe it this is "valid" code that should work and represents what I think make was doing when it faulted.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <errno.h>

int main() {
        char filename[3670] = "stacktest";
        struct stat statinfo;

        /*
        puts("hello");
        __asm__ (
                        "mov $0x6A,%eax\t\n"
                        "mov $0xbffff00a,%ebx\t\n"
                        "mov $0xbfffefc8,%ecx\t\n"
                        "int $0x80"
                );
        printf("my size is %u\n", statinfo.st_size);
        */
        printf("filename is 0x%08x\n", (unsigned int) filename);
        printf("statinfo is 0x%08x\n", (unsigned int) &statinfo);
}

You may need to do perform some manual adjustments to make this work. First, compile and run the program as is:

cc stacktest.c -o stacktest
./stacktest
filename is 0xbffff00a
statinfo is 0xbfffefc8

Note the address of filename and statinfo. It is important that filename and statinfo are in different memory pages. You may need to adjust the size of filename in order to make this happen.

Now change the assembly code (if necessary) to match those addresses for setting %ebx and %ecx, respectively, and uncomment that code.

You should be able to run the program now and get the correct size of the file:

cc stacktest.c -o stacktest
./stacktest
hello
my size is 586769
filename is 0xbffff00a
statinfo is 0xbfffefc8

The assembly code is executing a stat system call and is the equivalent of stat(filename, &statinfo);.

Now, edit the code and remove the line puts("hello");. Compile and run again. For me, this produces a segfault.

The issue is that the stack page for statinfo is not mapped in. Normally, any access of an unmapped stack page from user land would cause the page to be mapped in by a page fault. The puts("hello"); call triggers this mechanism when it pushes a call frame onto the stack. However, when that is removed, the program is using a system call to write to that memory for the first time. In this case, the fault occurs in the kernel and it appears the normal mechanism for growing the stack does not trigger and a page fault terminates the program instead.

It seems there are a couple of ways to resolve this. At first, I thought the problem was simply that a lack of sufficient stack space was being mapped for the user program. I developed a solution with that in mind. I wrote a change in which the user program is mapped a significant chunk of free stack from the start of execution. (See attached PR). This indeed resolves the problem.

After trying to create a test case I came to understand the current mechanism for growing the user stack dynamically on a fault, which examines the user process %esp register. It might be possible to create a similar mechanism for growing the stack when a fault occurs during a system call. On the other hand, I don't see a downside to just setting up more stack memory in the vma to begin with.

mikaku commented 1 year ago

I've tried your example program with several values but I'm unable to reproduce the segfault. Can you, please, paste all the kernel output (registers, etc.) from the segmentation fault?

rick-masters commented 1 year ago

Test program attached stacktest.gz

mikaku commented 1 year ago

I think I know what is happening here.

The function verify_address() doesn't do a stack expansion when the address being checked belongs to the user stack, which would save the inevitable page fault. Having the option CONFIG_LAZY_USER_ADDR_CHECK enabled or disabled doesn't make any change on this.

Solution:

The following patch belongs to the second point above. For now, you don't need the patch in verify_address() because your kernel has CONFIG_LAZY_USER_ADDR_CHECK enabled.

diff --git a/mm/fault.c b/mm/fault.c
index 79f624b..a336587 100644
--- a/mm/fault.c
+++ b/mm/fault.c
@@ -266,6 +266,16 @@ void do_page_fault(unsigned int trap, struct sigcontext *sc)

                /* in kernel mode */
                } else {
+                       struct sigcontext *usc;
+
+                       /* does it look like a user stack address? */
+                       usc = (unsigned int *)sc->esp + 34;     /* user sigcontext before the page fault */
+                       if(cr2 >= (usc->oldesp - 32) && cr2 < PAGE_OFFSET) {
+                               if((!page_not_present(vma, cr2, usc))) {
+                                       return;
+                               }
+                       }
+
                        dump_registers(trap, sc);
                        show_vma_regions(current);
                        do_exit(SIGTERM);

Please, let me know if it works for you. I'll submit a complete patch soon.

rick-masters commented 1 year ago

Yes that works. Thank you. I'd recommend this change to avoid a pointer conversion warning:

                        usc = (struct sigcontext *) ((unsigned int *) sc->esp + 34);    /* user sigcontext before the page fault */
mikaku commented 1 year ago

I'd recommend this change to avoid a pointer conversion warning:

Thanks, I'll take it into account when sending the definitive patch.