btrfs / btrfs-todo

An issues only repo to organize our TODO items

fallocate(existing file, LARGE_AMOUNT) can fail, even if we would only allocate a small subset of LARGE_AMOUNT #7

Open josefbacik opened 4 years ago

josefbacik commented 4 years ago

@cmurf brought another awesome thing to my attention: sd-homed uses a file-backed loopback device with LUKS on top. I'm not entirely clear how that part works, but what they're doing is essentially preallocating a file and then mounting it as the home dir. When they unmount, they fstrim to release any excess space. The next time they mount, they fallocate the whole range again. This sometimes fails with ENOSPC, because our fallocate does

if (!(mode & FALLOC_FL_ZERO_RANGE)) {
        ret = btrfs_alloc_data_chunk_ondemand(BTRFS_I(inode), alloc_end - alloc_start);
        if (ret < 0)
                return ret;
}

so when we do fallocate(existing_file, 400gib), we try to make sure we have 400gib free to use on the fs.

The problem here is that we won't actually need 400gib, we'll only need a subset of that: the parts of the range that aren't already allocated. So we ENOSPC out prematurely because we don't have this giant amount that we may not need.

We need to push this reservation down into the loop where we do the qgroup reservation before we call btrfs_prealloc_file_range(), that way this can succeed when it should really have succeeded.
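To make the difference concrete, here is a minimal userspace sketch that models the two strategies. It is not kernel code: `struct extent`, `reserve()`, `fallocate_upfront()`, `fallocate_per_hole()` and the sample extent layout are all hypothetical names and data invented for illustration, and free space is modeled as a simple byte counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define GIB (1024ULL * 1024 * 1024)

/* Hypothetical model of one already-allocated extent: [start, end) in bytes. */
struct extent {
    uint64_t start;
    uint64_t end;
};

/* Hypothetical sample file: 400 GiB long, with holes at 150-200 and 380-400 GiB. */
const struct extent sample_extents[] = {
    { 0,         150 * GIB },
    { 200 * GIB, 380 * GIB },
};

/* Take bytes out of the modeled free space, or fail like the real path would. */
int reserve(uint64_t *free_bytes, uint64_t bytes)
{
    if (bytes > *free_bytes)
        return -ENOSPC;
    *free_bytes -= bytes;
    return 0;
}

/* Model of the current behaviour: demand the whole range before looking at the file. */
int fallocate_upfront(uint64_t free_bytes, uint64_t start, uint64_t end)
{
    assert(start <= end);
    return reserve(&free_bytes, end - start);
}

/*
 * Model of the proposed behaviour: walk the range and reserve only for
 * each hole. Assumes extents are sorted by start and do not overlap.
 */
int fallocate_per_hole(const struct extent *ext, size_t n,
                       uint64_t free_bytes, uint64_t start, uint64_t end)
{
    uint64_t pos = start;
    int ret;

    assert(start <= end);
    for (size_t i = 0; i < n && pos < end; i++) {
        if (ext[i].end <= pos)
            continue;               /* extent lies entirely before pos */
        if (ext[i].start >= end)
            break;                  /* extent lies entirely after the range */
        if (ext[i].start > pos) {
            /* hole between pos and this extent: reserve just that much */
            ret = reserve(&free_bytes, ext[i].start - pos);
            if (ret)
                return ret;
        }
        pos = ext[i].end;           /* skip over the allocated part */
    }
    if (pos < end)
        return reserve(&free_bytes, end - pos);   /* trailing hole */
    return 0;
}
```

With the sample layout and 100 GiB of modeled free space, `fallocate_upfront(100 * GIB, 0, 400 * GIB)` returns `-ENOSPC`, while `fallocate_per_hole(sample_extents, 2, 100 * GIB, 0, 400 * GIB)` returns 0, because the holes only add up to 70 GiB.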

cmurf commented 4 years ago

Goffredo Baroncelli points to this 2017 thread as likely related: https://lore.kernel.org/linux-btrfs/798a9077-bcbd-076c-a458-3403010ce8ac@libero.it/

arvidjaar commented 4 years ago

> we won't actually need 400gib, we'll only need a subset of that

That's not entirely true for a CoW filesystem. As long as the semantics of fallocate are "ensure we have space to store the requested amount of data", in general we do need 400gib. It could probably be relaxed in special cases (like files with the C attribute, though even that does not fully eliminate CoW).

Although, unless the allocated space is actually reserved exclusively for this file, it can go away at any time. In that case it does indeed look rather pointless.