mikaku / Fiwix

A UNIX-like kernel for the i386 architecture
https://www.fiwix.org
Other
407 stars 32 forks source link

lseek past end of file does not zero fill hole #24

Closed rick-masters closed 1 year ago

rick-masters commented 1 year ago

If you lseek past the end of a file and write then the hole that was skipped should be filled with zeros but is not.

Here is documentation of that expected behavior: https://www.gnu.org/software/libc/manual/html_node/File-Position-Primitive.html

You can set the file position past the current end of the file. 
This does not by itself make the file longer; lseek never changes the file. 
But subsequent output at that position will extend the file. 
Characters between the previous end of file and the new position are filled with zeros.

This test program illustrates the problem:

#include <unistd.h>
#include <fcntl.h>

int main() {
        char buf[] = "testdata";
        int myfile = open("mytest", O_WRONLY | O_CREAT);
        lseek (myfile, 16, SEEK_SET);
        write(myfile, buf, 8);
        close(myfile);
}
$ cc lseekhole.c -o lseekhole
$ ./lseekhole
$ od -tx1 -Ax mytest
000000 09 2e 66 69 6c 65 09 22 6c 73 65 65 6b 68 6f 6c
000000 74 65 73 74 64 61 74 61

Expected output (which linux/gcc produces):

000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000 74 65 73 74 64 61 74 61
rick-masters commented 1 year ago

Here is my somewhat ugly workaround:

diff --git a/kernel/syscalls/write.c b/kernel/syscalls/write.c
index 969f952..7468b3d 100644
--- a/kernel/syscalls/write.c
+++ b/kernel/syscalls/write.c
@@ -38,6 +38,18 @@ int sys_write(unsigned int ufd, const char *buf, int count)
                return -EINVAL;
        }
        i = fd_table[current->fd[ufd]].inode;
+       if (fd_table[current->fd[ufd]].offset > i->i_size) {
+               int extension = fd_table[current->fd[ufd]].offset - i->i_size;
+               fd_table[current->fd[ufd]].offset = i->i_size;
+               i->fsop->lseek(i, i->i_size);
+               int errno;
+               while (extension--) {
+                       errno = i->fsop->write(i, &fd_table[current->fd[ufd]], "", 1);
+                       if (errno != 1)
+                               return errno;
+               }
+       }
+
        if(i->fsop && i->fsop->write) {
                errno = i->fsop->write(i, &fd_table[current->fd[ufd]], buf, count);
 #ifdef __DEBUG__
rick-masters commented 1 year ago

lseekhole binary attached. lseekhole.gz

rick-masters commented 1 year ago

I should note that further testing of my workaround revealed a regression in another program (building tcc produced an invalid executable) so it looks like I will not be able to use it.

mikaku commented 1 year ago

Not reproducible here, even using the supplied binary.

[(root) ~]# cat lseekhole.c 
#include <unistd.h>
#include <fcntl.h>

int main() {
        char buf[] = "testdata";
        int myfile = open("mytest", O_WRONLY | O_CREAT);
        lseek (myfile, 16, SEEK_SET);
        write(myfile, buf, 8);
        close(myfile);
}

[(root) ~]# ./lseekhole 
[(root) ~]# od -tx1 -Ax mytest             
000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000010 74 65 73 74 64 61 74 61
000018
mikaku commented 1 year ago

Can you, please, paste here the complete QEMU command you are using? Perhaps I'll able to reproduce this problem if I use the same setup as you.

rick-masters commented 1 year ago
qemu-system-i386 \
    -monitor stdio \
    -debugcon file:debugcon.log \
    -vnc 10.0.1.171:1,password \
    -drive file=images/FiwixOS-3.1-i386.raw,format=raw,cache=writeback,index=0 \
    -kernel fiwix-git-latest/fiwix \
    -append "root=/dev/hda2" \
    -m size=256

It is qemu 4.2.1.

Perhaps something in libc works differently in FiwixOS 3.2. I can see if that makes a difference (although that wouldn't help me).

mikaku commented 1 year ago

It is qemu 4.2.1.

I don't see any special parameter that could led to this change in lseek() behavior. I'm, using QEMU emulator version 6.2.0, with the following setup:

qemu-system-i386 \
        -drive file=$IMGS_DIR/${FLOPPY_IMG},format=raw,if=floppy,index=0 \
        -drive file=$IMGS_DIR/fiwix-ext2-1GB.img,format=raw,if=ide,cache=writeback,index=0 \
        -boot a \
        -net none \
        -m 256 \
        -enable-kvm \
        -machine pc \
        -cpu 486 \
        -chardev pty,id=pciserial \
        -device pci-serial,chardev=pciserial \
        -serial pty \
        -debugcon stdio

Perhaps something in libc works differently in FiwixOS 3.2. I can see if that makes a difference (although that wouldn't help me).

FiwixOS 3.1 came with Newlib 3.2 and FiwixOS 3.2 uses Newlib 4.2. Sincerely, I don't know all the changes they made between these two releases.

rick-masters commented 1 year ago

I've tested with qemu 4.2.1 and 7.1.50. I've tested with qemu i386 and x86_64. I've tested with FiwixOS 3.1 and 3.2. I've built Fiwix with a new gcc and an old one. I've used kvm and no kvm. I've used graphics and serial. In nearly every case, the problem reproduces. One time (!) I got it to fill with zeros but I have been unable to get that to happen again with the same setup.

mikaku commented 1 year ago

This is really frustrating. OK, I'll continue trying with different setups until I can reproduce it. Thanks for the effort you put on it.

mikaku commented 1 year ago

Following your recommendation on IRC (#bootstrappable), I was able to reproduce the problem after applying the following modifications in the test program:

#include <unistd.h>
#include <fcntl.h>

int main() {
    int n;
    char buf[] = "testdata";
    char filename[20];

    for(n = 0; n < 1000; n++) {
        sprintf(filename, "mytest%d", n);
            int myfile = open(filename, O_WRONLY | O_CREAT);
            lseek (myfile, 16, SEEK_SET);
            write(myfile, buf, 8);
            close(myfile);
    }
}

I'll work on it. Thanks.

mikaku commented 1 year ago

Good news,

I've found what is happening and where is the problem. This is a serious bug because it affects the synchronization between the buffer cache (block contents) and the page cache (file contents).

Basically, all happens here:

https://github.com/mikaku/Fiwix/blob/f2d318dc9b6db5c6fab88f3a701612c57ee0f8d2/fs/ext2/file.c#L98-L115

What is happening

When you want to put something into a file, the kernel obtains the block where it will write it:

https://github.com/mikaku/Fiwix/blob/f2d318dc9b6db5c6fab88f3a701612c57ee0f8d2/fs/ext2/file.c#L100

If this block is new for the file (which is true in our case since the file was empty), the function ext2_bmap() correctly sets to zero the buffer that contains such block:

https://github.com/mikaku/Fiwix/blob/f2d318dc9b6db5c6fab88f3a701612c57ee0f8d2/fs/ext2/inode.c#L254

This means that, when buffer (that contains the string testdata) is copied into buf->data at offset 16, all bytes before this offset are all zeros:

https://github.com/mikaku/Fiwix/blob/f2d318dc9b6db5c6fab88f3a701612c57ee0f8d2/fs/ext2/file.c#L110

The problem

Once the contents of buffer are good, the kernel needs to synchronize this change in the page cache:

https://github.com/mikaku/Fiwix/blob/f2d318dc9b6db5c6fab88f3a701612c57ee0f8d2/fs/ext2/file.c#L111

The problem here is that the kernel finds a page that matches with the same inode and offset, belonging to another file what was used recently. This explains to me why I was unable to reproduce it right before booting the system. The only way to reproduce this issue was recompiling the program, since compilation creates some temporary files that are removed latter. Then, when the test program creates the file myfile the kernel assigns it a reused inode.

I hope to provide a fix soon.

mikaku commented 1 year ago

@rick-masters, please check this patch to see if it fixes this issue.

rick-masters commented 1 year ago

Yes, this fixes the issue. Thank you very much!

mikaku commented 1 year ago

Perfect!, I'll fix the same bug in the Minix filesystem (v1 and v2).

mikaku commented 1 year ago

Thank you.