Stichting-MINIX-Research-Foundation / minix

Official MINIX sources - Automatically replicated from gerrit.minix3.org
Other
3.06k stars 994 forks source link

Bug in the alloc_contig() function. #86

Open lfogel opened 9 years ago

lfogel commented 9 years ago

I've been tracking a bug that randomly appears in the eMMC driver, and I think I've found where it is hiding: the alloc_contig() function in the libsys. The bug triggers a "start reading past drive size" warning message from the block driver when a user tries to read a partition. This function is used by the libblockdriver, which is linked to the ARM eMMC and SD (mmc) drivers, and maybe others. I don't know whether the bug appears in other drivers and plataforms (i.e. x86). Please, find attached a patch that modifies the "hello" driver. In order to reproduce the bug, load and unload the driver, repeatedly:

# while true; do
   service up /service/hello -dev /dev/hello
   cat /dev/hello
   echo
   service down hello
done

Occasionally, the hello_read() function won't show the "Hello, World!" message.

Subject: [PATCH] Reproduce a bug in alloc_contig().

Sometimes, the content of the memory pointed by alloc_contig() is lost.
This patch allows one to (occasionally) reproduce the problem.

---
 minix/drivers/examples/hello/hello.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/minix/drivers/examples/hello/hello.c b/minix/drivers/examples/hello/hello.c
index 791e20e..2b31d79 100644
--- a/minix/drivers/examples/hello/hello.c
+++ b/minix/drivers/examples/hello/hello.c
@@ -31,6 +31,7 @@ static struct chardriver hello_tab =
  * Note that this is not the regular type of open counter: it never decreases.
  */
 static int open_counter;
+static char *buf;

 static int hello_open(devminor_t UNUSED(minor), int UNUSED(access),
     endpoint_t UNUSED(user_endpt))
@@ -52,7 +53,6 @@ static ssize_t hello_read(devminor_t UNUSED(minor), u64_t position,
     u64_t dev_size;
     char *ptr;
     int ret;
-    char *buf = HELLO_MESSAGE;

     printf("hello_read()\n");

@@ -153,6 +153,11 @@ int main(void)
      * Perform initialization.
      */
     sef_local_startup();
+    buf = alloc_contig(2048, AC_ALIGN4K, NULL);
+    if (buf != NULL)
+       strcpy(buf, HELLO_MESSAGE);
+    else
+       printf("Out of memory.\n");

     /*
      * Run the main loop.
--
1.7.10.4
dcvmoole commented 9 years ago

This issue likely goes deeper than that. The alloc_contig(3) function is just a thin wrapper around mmap(2), and the actual issue most likely has something to do with the type of memory that VM allocates for these requests. I know that there have been issues with contiguously allocated memory on ARM, which have ultimately led to file systems no longer using this kind of memory for their buffer caches. I'm afraid I don't know any more details and I hope someone else can chime in here..