Closed lundman closed 11 years ago
Well, who knew: you can make a file holey both with lseek()+write() and by using ftruncate(+offset) to "extend" it.
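For reference, the two ways of producing a holey file can be sketched as below. This is only an illustration: the helper names and the idea of returning the resulting size are assumptions made for the example, not anything from the ZFS code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: seek past EOF and write one byte, leaving a hole
 * in front of it. Returns the resulting file size, or -1 on error. */
off_t make_hole_with_lseek(const char *path, off_t offset)
{
    struct stat st;
    int fd;

    assert(offset >= 0);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1 ||
        write(fd, "x", 1) != 1 ||
        fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return st.st_size;
}

/* Hypothetical helper: "extend" an empty file with ftruncate() so the
 * whole file is a hole. Returns the resulting file size, or -1 on error. */
off_t extend_with_ftruncate(const char *path, off_t length)
{
    struct stat st;
    int fd;

    assert(length >= 0);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, length) != 0 || fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return st.st_size;
}
```

The first path goes through the write code, the second through the truncate/extend code, which is why the two can end up growing the blocksize differently.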
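Simple test code:
#include <stdio.h>
#include <stdlib.h>   /* exit(), rand() */
#include <string.h>   /* strcpy() */
#include <unistd.h>   /* unlink(), ftruncate(), write(), close() */
#include <fcntl.h>
int main(int argc, char **argv)
{
    int fd;
    char buffer[64*1024];
    int i;

    /* The file must be newly created, so remove any old copy first. */
    unlink("/BOOM/test");
    fd = open("/BOOM/test", O_RDWR|O_TRUNC|O_CREAT, 0600);
    if (fd < 0) {
        perror("open");
        exit(1);
    }
    strcpy(buffer, "Hello world\n");
    for (i = 1; i < 10; i++) {
        /* Resize to a random length, then write 64K at the current offset. */
        ftruncate(fd, (off_t)i*(rand()%10000));
        if (write(fd, buffer, sizeof(buffer)) != (ssize_t)sizeof(buffer))
            perror("write");
    }
    close(fd);
    exit(0);
}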
This will generally panic. Note that the unlink() before open() appears to be required: if open() only truncates an existing file (O_TRUNC), the bug does not appear to happen. So triggering it requires the file to be newly created.
Ok, buffer (blocksize) growth on OSX looks like:
512 10240 20480 30208 ...
and because those sizes are not powers of two, the dn_datablkshift variable is set to 0, and the code containing the panic is triggered.
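To make the shift behaviour concrete, here is a small sketch of how a block shift could be derived from a block size. The helper name datablkshift_for() and the local IS_POW2() macro are made up for the example. The rule itself (log2 of the size for powers of two, otherwise 0) is taken from the observation above and is not copied from the dnode code.

```c
#include <assert.h>
#include <stdint.h>

/* Local power-of-two test, the same idea as the ISP2() macro in ZFS. */
#define IS_POW2(x) (((x) & ((x) - 1)) == 0)

/* Hypothetical helper: log2(blksz) when blksz is a power of two,
 * otherwise 0, which is the value seen for the odd sizes above. */
int datablkshift_for(uint64_t blksz)
{
    int shift = 0;

    assert(blksz > 0);
    if (!IS_POW2(blksz))
        return 0;
    while ((blksz >>= 1) != 0)
        shift++;
    return shift;
}
```

With this rule 512 gives a shift of 9, while 10240, 20480 and 30208 all give 0, so any code that assumes a usable shift runs into trouble on those sizes.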
By comparison, on Linux (ZOL) the buffer grows like:
512 10240 65536 131072
You can see that only the first growth is to a non-power-of-two size.
This boils down to OSX making all of its calls to zfs_grow_blocksize() from zfs_extend(), and that blocksize growth algorithm is poor indeed. On Linux, only the first growth (10240) is triggered from zfs_extend(). All the others come from this part of zfs_write():
if (rl->r_len == UINT64_MAX) {
    uint64_t new_blksz;
    if (zp->z_blksz > max_blksz) {
        ASSERT(!ISP2(zp->z_blksz));
        new_blksz = MIN(end_size, SPA_MAXBLOCKSIZE);
    } else {
        new_blksz = MIN(end_size, max_blksz);
    }
}
For some reason, Linux's first call will have rl->r_len == UINT64_MAX, but on OSX this never triggers, as rl->r_len == 65536.
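The blocksize choice in that zfs_write() branch can be pulled out into a standalone function to see why the Linux sequence jumps to 65536 and then 131072. This is only a sketch: pick_new_blksz() is a hypothetical name, and the 128K value for SPA_MAXBLOCKSIZE is an assumption about the code base at the time.

```c
#include <assert.h>
#include <stdint.h>

#define IS_POW2(x) (((x) & ((x) - 1)) == 0)

/* Assumption: SPA_MAXBLOCKSIZE was 128K in the code discussed here. */
#define ASSUMED_SPA_MAXBLOCKSIZE (128ULL * 1024)

/* Hypothetical helper mirroring the quoted branch: the new blocksize is
 * the end of the write, capped by the recordsize (max_blksz), or by the
 * SPA maximum when the current blocksize already exceeds the recordsize. */
uint64_t pick_new_blksz(uint64_t cur_blksz, uint64_t end_size,
    uint64_t max_blksz)
{
    uint64_t limit;

    if (cur_blksz > max_blksz) {
        /* Only a non-power-of-two blocksize may exceed the recordsize. */
        assert(!IS_POW2(cur_blksz));
        limit = ASSUMED_SPA_MAXBLOCKSIZE;
    } else {
        limit = max_blksz;
    }
    return end_size < limit ? end_size : limit;
}
```

With a 64K write at offset 0, end_size is 65536, so a 10240 blocksize grows straight to 65536. The next 64K write ends at 131072, which gives the second step.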
Now, once the OSX problem has been tracked down, I would guess that with a carefully crafted recordsize (avoiding power of 2) and using extend, all ZFS versions can be made to panic, due to zfs_extend's poor blocksize growth logic.
In the end, zfs_rlock_write signals that a blocksize growth is required by setting r_len to UINT64_MAX. This was missing on OSX due to ZOL/ZSB usage. Once ported over, the problem goes away.
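The signalling can be sketched as a pure function that returns the length the write lock ends up with. Everything here is illustrative: write_lock_len() is a made-up name, and the exact condition is an assumption loosely modelled on the upstream range-lock writer, so check the real source before relying on it.

```c
#include <assert.h>
#include <stdint.h>

#define IS_POW2(x) (((x) & ((x) - 1)) == 0)

/* Hypothetical helper: returns UINT64_MAX when the write would need the
 * blocksize to grow (meaning "lock the whole file"), otherwise returns
 * the requested length unchanged. */
uint64_t write_lock_len(uint64_t off, uint64_t len, uint64_t file_size,
    uint64_t cur_blksz, uint64_t max_blksz)
{
    uint64_t end, end_size;

    assert(len <= UINT64_MAX - off);
    end = off + len;
    end_size = file_size > end ? file_size : end;

    /* Assumed condition: the file will outgrow its single block, and the
     * blocksize is either odd-sized or still below the recordsize. */
    if (end_size > cur_blksz &&
        (!IS_POW2(cur_blksz) || cur_blksz < max_blksz))
        return UINT64_MAX;
    return len;
}
```

A caller such as zfs_write() can then test for UINT64_MAX and grow the blocksize before writing, which is the step that never happened on OSX when r_len stayed at 65536.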
It is interesting to note that zfs_extend growth, which is not in powers of 2, is undesirable and can possibly still trigger the soft panic. Perhaps zfs_extend blocksize growth should be encouraged to be in powers of 2.
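One way the power-of-2 suggestion could look is to round the wanted size up to the next power of two and cap it at the recordsize. This is a sketch of the idea only, not a patch: both function names are hypothetical and nothing like this is claimed to exist in zfs_extend today.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next power of two (size itself if it already is). */
uint64_t round_up_pow2(uint64_t size)
{
    uint64_t p = 1;

    assert(size > 0 && size <= (1ULL << 63));
    while (p < size)
        p <<= 1;
    return p;
}

/* Hypothetical policy for an extend path: power-of-two growth, capped at
 * max_blksz (the recordsize). */
uint64_t extend_blksz_pow2(uint64_t wanted, uint64_t max_blksz)
{
    uint64_t blksz = round_up_pow2(wanted);

    return blksz < max_blksz ? blksz : max_blksz;
}
```

Under this policy the OSX sequence 10240, 20480, 30208 would become 16384, 32768, 32768, so the shift is always usable. Note that a non-power-of-two recordsize would still come through the cap unchanged.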
fsx and fstorture now work without panic (although mmap is still, at times, incorrect).
https://github.com/zfs-osx/zfs/commit/312697f32576697081bd3c6e1b5ae6784883e487
Cleanest run that produces said panic appears to be:
That run appears to create the file, extend it to 27812, then write 21111 bytes (size=28160 access=24136+21111). Unsure who/what seeks to 24136.
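Those three steps can be written as a small reproducer sketch. The function name and the path argument are hypothetical, and seeking to 24136 before the write is an assumption based on the access= value, since it is not clear where that offset comes from. As a side note, 28160 is 27812 rounded up to the next multiple of 512.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical reproducer: create a fresh file, extend it to 27812, then
 * write 21111 bytes at offset 24136. Returns the final size or -1. */
off_t extend_then_write(const char *path)
{
    static char buf[21111];   /* zero-filled payload */
    struct stat st;
    off_t result = -1;
    int fd;

    assert(path != NULL);
    unlink(path);             /* the file has to be newly created */
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, 27812) == 0 &&
        lseek(fd, 24136, SEEK_SET) != (off_t)-1 &&
        write(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf) &&
        fstat(fd, &st) == 0)
        result = st.st_size;
    close(fd);
    return result;
}
```

On a filesystem without the bug this simply returns 45247 (24136 + 21111). Whether it panics an affected pool as reliably as the original run has not been verified.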