NVSL / linux-nova

NOVA is a log-structured file system designed for byte-addressable non-volatile memories, developed at the University of California, San Diego.
http://nvsl.ucsd.edu/index.php?path=projects/nova
Other
421 stars 118 forks source link

nova error with some pwrite calls #102

Open hayley-leblanc opened 3 years ago

hayley-leblanc commented 3 years ago

Hi,

I've encountered some weird behavior in NOVA that can result in nova error messages popping up. Unlike the other issues I've raise recently, this is not crash related and appears when we just run NOVA normally. We are working on a fuzzer to generate tests for our crash consistency testing framework and came across this while using the fuzzer.

The root of the behavior seems to be calls to pwrite where the count parameter specifying the length is longer than the provided buffer. When we run a test with such a pwrite call, we usually see something like the following in dmesg after running the program:

[  263.238210] nova: do_nova_inplace_file_write ERROR!: 00000000badafff9, bytes 114823168, copied 36736
[  263.258257] nova: Current epoch id: 0
[  263.258299] nova error:
[  263.258302] ERROR: no inode log page available: 1 -28
[  263.258729] nova error:
[  263.258730] ERROR: no inode log page available: 0 -22

It's manifested in a couple different ways, some of which cause errors at unmount time or in subsequent system calls. Here's an example workload where I've been able to get it to manifest reliably on a 1-core VM. I am running this example on an empty instance of NOVA.

r0 = open(“foo”, O_WRONLY | O_CREAT)
pwrite(r0, “s”, 1, 0)
pwrite(r0, "f747a38753edd0b85e2d66fa8d16a00d0ea2a3d2becb2cfc704f7ae1d678e4b10296ed0bc82ac9ee8fb2c72d04ff1b51c0fe4f4ef64e14baa3b439649058464ca7f9a9fd1129353685b8c0e685", 0x29b8ff68, 0)

If I unmount the system immediately after running this, I see nova error appear on the terminal and get the following messages in dmesg:

[  263.238210] nova: do_nova_inplace_file_write ERROR!: 00000000badafff9, bytes 114823168, copied 36736
[  263.258257] nova: Current epoch id: 0
[  263.258299] nova error:
[  263.258302] ERROR: no inode log page available: 1 -28
[  263.258727] nova: Error saving inode list: -28
[  263.258729] nova error:
[  263.258730] ERROR: no inode log page available: 0 -22
[  263.259126] nova: Error saving blocknode mappings: -22

Remounting the system seems to go fine and foo is present with the expected contents, but we've noticed in the fuzzer that sometimes it seems to cause other issues, like not being able to create a directory following the pwrite call. We're working on finding some more examples that exhibit more clearly buggy behavior and I'll add them when we do.

Thanks!

hayley-leblanc commented 3 years ago

Here's another example we found that exhibits more clearly buggy behavior. Here's a zip file with two programs that reproduce it: programs.zip. fuzzer.c was generated by our fuzzer and simpler.c is a slightly simplified version of the same program.

Here's the contents of the programs in pseudocode. In the programs, buffer is 368 bytes long.

fd0 = open("file0")
fd1 = open("file0")
write(fd1, buffer, 0xffffffadul)
fd2 = open("file0")
fd3 = open("file0")
write(fd3, buffer, 0xffffffadul)

We tried removing some of the extra open calls but apparently they are all necessary to reproduce the bug.

After running this program on NOVA, running df says that the NOVA instance is 100% full, and trying to, e.g, create a directory fails. On a VM with one core, the write calls fail and say there is no space left on disk. For some reason, on more cores, the system calls all succeed but the file system is still full and we can't create new files/directories. We tried it on both 128MB and 1GB of emulated PM and it occurred on both.

Thanks!

Andiry commented 3 years ago

It looks to me that since the pwrite() specifies a large count, NOVA allocates the required NVMM data pages, and hence run out of log pages, as they share the NVMM space. Is that the case? Is write/pwrite with count larger than the buffer size an undefined behavior?

hayley-leblanc commented 3 years ago

Apologies for the delayed response. I believe what you describe is correct, where NOVA allocates the number of requested pages and runs out of space. It's not immediately clear to me whether calling write/pwrite with a count larger than the buffer size is undefined behavior here.

POSIX does define what should occur when a write call attempts to exceed the size of the physical medium here: "If a write() requests that more bytes be written than there is room for (for example, the file size limit of the process or the physical end of a medium), only as many bytes as there is room for shall be written." The man page for write basically says that writing beyond the maximum file size is implementation defined. I'm not sure whether the POSIX spec is referring to providing a buffer larger than the size of the storage device, or a greater count value, or both.

I did a bit more experimentation with the example and found that file0 ends up having a size that is the same size as the provided buffer, eg. using a buffer of 15000 bytes makes file0 have size 15000. This seems potentially incorrect based on the POSIX spec combined with the fact that the NVMM device doesn't have any free space. It would make sense to me if the file just ended up being huge and taking up most/all of the NVMM, but that doesn't seem to be what's happening - what's the expected behavior for NOVA in this case? Also, deleting the file does not free up the space; I still can't, e.g. create a new file even after file0 has been deleted.

Anyway, I'd be interested in hearing your thoughts on this if you get a chance. Even if this is undefined behavior and legal by POSIX/Linux standards, it seems undesirable.

Andiry commented 2 years ago

I have pushed a commit to cleanup the incomplete write in this case. Try if it helps.