mit-pdos / xv6-riscv

Xv6 for RISC-V
Other
6.6k stars 2.38k forks source link

A bug in the FS logging layer of xv6? #123

Closed Kristoff-starling closed 2 years ago

Kristoff-starling commented 2 years ago

Hello, I'm recently studying the principle of xv6 and benefits a lot from the explicit comments and the textbook. I think that there may be a bug in the logging layer in the file system. I'll describe the details below:

This is the code of install_trans() in /kernel/log.c :

static void
install_trans(void)
{
  int tail;

  for (tail = 0; tail < log.lh.n; tail++) {
    struct buf *lbuf = bread(log.dev, log.start+tail+1); // read log block
    struct buf *dbuf = bread(log.dev, log.lh.block[tail]); // read dst
    memmove(dbuf->data, lbuf->data, BSIZE);  // copy block to dst
    bwrite(dbuf);  // write dst to disk
    bunpin(dbuf);
    brelse(lbuf);
    brelse(dbuf);
  }
}

the line bunpin(dbuf) unpins dbuf in the buffer cache so that this block can be evicted out of the memory after brelse(dbuf). The code works correctly if everything goes normally: there is a bpin() in log_write() which pins the block in the buffer cache by incrementing its reference count and the bunpin() here decrements it.

However, if a crash happened during the installation of a transaction, after xv6 reboots, it would call recover_from_log() in initlog() and call install_trans to write logging blocks into their home blocks. At this moment, there isn't a bpin() before, so the bunpin() here will decrement the reference count to zero and in brelse() the reference count will experience an underflow and become UINT_MAX since the refcnt is a uint type variable. As a result, the block will stay in the buffer cache "forever".

We can expose this bug by adding two lines in xv6:

After we fire up xv6 for the first time, we'll encounter a "crash", and after rebooting, xv6 will panic at panic("bug"). I made a disk image, fs-bug.img, which records a crashed xv6 using the first step. If we load this image, we'll directly get a buggy scene (and can be detected by the assertion in brelse()). The disk image is available here.

A possible solution for this problem is to add a bpin() in read_head() so that the bunpin() during the recover procedure will have corresponding bpin().

I hope that my suggestions will be helpful. Welcome for further discussions, thanks!

Kristoff-starling commented 2 years ago

I'm sorry that in the latest version of xv6, this bug has been fixed by an if statement

if (recovering == 0)
    bunpin(dbuf);

I refered to an old version (2020) of xv6 and mistakenly proposed the issue. I apologize for my carelessness.