Closed s3rj1k closed 5 years ago
I'll need to play with it a bit, but this isn't going to be a LXD bug, repquota shows we set the quota properly and the kernel shows it applied through df, so any issue with accounting/enforcement after that would be a kernel bug.
@stgraber LXD snap version works with this disk layout and configs.
And a container without a quota doesn't hit that error?
"File too large" sounds like you may be hitting a prlimit/ulimit rather than a quota issue. That would explain why you see something different between the snap and a manual build: one is started through systemd and our wrapper while the other isn't, possibly leading to different limits being applied.
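One quick way to test the ulimit theory is to read the file-size limit from inside the affected process, since exceeding RLIMIT_FSIZE is one of the ways a write can fail with "File too large" (EFBIG). A minimal sketch using Python's standard resource module; the helper name fsize_limit is made up for illustration:

```python
import resource


def fsize_limit():
    """Return (soft, hard) RLIMIT_FSIZE in bytes, with None meaning unlimited."""
    soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)

    def to_optional(value):
        return None if value == resource.RLIM_INFINITY else value

    return to_optional(soft), to_optional(hard)


if __name__ == "__main__":
    soft, hard = fsize_limit()
    print("soft:", "unlimited" if soft is None else soft)
    print("hard:", "unlimited" if hard is None else hard)
```

If both values come back unlimited inside the container, the limit is probably not the cause and the error is coming from somewhere else in the stack.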
Inside CT:
root@test:~# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7730
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1048576
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I am actually using a systemd unit to start LXD:
[Unit]
Description=LXD - main daemon
After=network-online.target openvswitch-switch.service lxcfs.service lxd.socket
Requires=network-online.target lxcfs.service lxd.socket
[Service]
EnvironmentFile=-/etc/environment
# ExecStartPre=/usr/lib/x86_64-linux-gnu/lxc/lxc-apparmor-load
ExecStart=/usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
ExecStartPost=/usr/bin/lxd waitready --timeout=600
KillMode=process
TimeoutStartSec=600s
TimeoutStopSec=30s
Restart=on-failure
LimitNOFILE=infinity
LimitNPROC=infinity
LimitFSIZE=infinity
TasksMax=infinity
[Install]
Also=lxd-containers.service lxd.socket
added LimitFSIZE=infinity, still seeing error
Actually yes, without quota still have this error, WTF?
setting
LimitAS=infinity
LimitCORE=infinity
LimitCPU=infinity
LimitDATA=infinity
LimitFSIZE=infinity
LimitLOCKS=infinity
LimitMEMLOCK=infinity
LimitMSGQUEUE=infinity
LimitNICE=infinity
LimitNOFILE=infinity
LimitNPROC=infinity
LimitRSS=infinity
LimitRTPRIO=infinity
LimitRTTIME=infinity
LimitSIGPENDING=infinity
LimitSTACK=infinity
TasksMax=infinity
does not help
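To confirm that the Limit* settings actually reached the running daemon, the limits of a live process can be read from /proc/PID/limits. A rough sketch of a parser for that file; the function name parse_limits is hypothetical, and it assumes the usual layout where columns are separated by runs of two or more spaces:

```python
import re
import sys


def parse_limits(text):
    """Parse /proc/<pid>/limits content into {limit name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Limit names contain single spaces, columns are padded with several.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) >= 3:
            limits[parts[0]] = (parts[1], parts[2])
    return limits


if __name__ == "__main__":
    # Pass the PID of the LXD daemon as the first argument; defaults to "self".
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open(f"/proc/{pid}/limits") as handle:
        soft, hard = parse_limits(handle.read())["Max file size"]
    print(f"Max file size: soft={soft} hard={hard}")
```

Run against the PID of the lxd process, this shows whether LimitFSIZE=infinity took effect, independently of what ulimit -a reports inside the container.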
disabled apparmor, still seeing error
@stgraber this is shiftfs related. Removing /lib/modules/5.3.0-19-generic/kernel/fs/shiftfs.ko fixed the large-file error above.
Now the question is how do I disable shiftfs properly; blacklisting the module did not help last time I tried.
Also there are no shiftfs settings in LXD itself.
@stgraber I can confirm that quota works as expected without shiftfs.
Why on earth is this enabled by default on all kernels and in LXD, but disabled in the snap?
LXD uses shiftfs whenever available, though due to kernel issues that are still being worked on, we have a patch in the snap that adds a knob to control it, disabling it by default and eventually enabling it if we're running on a known good kernel.
So sounds like this issue has nothing to do with project quotas but instead with handling of a particular filesystem operation. We'll need @brauner to look into that one and see if it's something we've fixed already in shiftfs or that needs extra fixing.
Closing as there isn't any action for us to pursue in LXD, but will put it on @brauner's todo to sort this out at the shiftfs level.
diff --git a/fs/shiftfs.c b/fs/shiftfs.c
index 55bb32b611f2..49b7777dde22 100644
--- a/fs/shiftfs.c
+++ b/fs/shiftfs.c
@@ -2045,6 +2045,7 @@ static int shiftfs_fill_super(struct super_block *sb, void *raw_data,
err = -EINVAL;
goto out_put_path;
}
+ sb->s_maxbytes = MAX_LFS_FILESIZE;
inode = new_inode(sb);
if (!inode) {
This should fix it.
@brauner thanks, that was quick :) Can you get a test kernel for @s3rj1k to validate that his stuff works fine? And do the usual sending of the fix through the usual Ubuntu channels :)
One of these days we'll actually have shiftfs behave the way we want it!
@stgraber @brauner Thanks )
I am willing to test this out as soon as I get test kernel, also hoping that the fix will be on both LTS and non LTS Ubuntu distributions.
I think having Kinderkrankheiten (teething problems) like this is pretty normal for something like shiftfs. I'm actually more and more happy since it also allowed us to find bugs in other filesystems. I'm pretty sure that our overlayfs changes should be upstreamed but there are only so many hours in a day. :)
@s3rj1k building a kernel atm.
With my fix:
Script started on 2019-10-22 20:37:32+0000
root@b3:~# fallocate --length 5g aaaa
root@b3:~# du -sh ./aaaa
5.0G ./aaaa
root@b3:~# rm aaaa
root@b3:~# exit
Script done on 2019-10-22 20:37:50+0000
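The same check can be scripted so that the different failure modes in this thread (file too large vs. quota exceeded) are told apart by errno. A minimal sketch, with one swap to note: it uses os.posix_fallocate rather than the fallocate(1) tool shown above, and the name try_allocate and the probe file name are made up for illustration:

```python
import errno
import os
import sys


def try_allocate(path, size):
    """Try to reserve `size` bytes at `path` and return a short outcome label."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
        return "ok"
    except OSError as exc:
        labels = {
            errno.EFBIG: "file too large",
            errno.EDQUOT: "quota exceeded",
            errno.ENOSPC: "no space left",
        }
        return labels.get(exc.errno, os.strerror(exc.errno))
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Usage: script.py DIRECTORY [SIZE_IN_BYTES]; defaults to 5 GiB as above.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    size = int(sys.argv[2]) if len(sys.argv) > 2 else 5 * 1024**3
    probe = os.path.join(directory, "fallocate-probe")
    try:
        print(try_allocate(probe, size))
    finally:
        if os.path.exists(probe):
            os.remove(probe)
```

Inside a container with a quota smaller than the requested size, the expected outcome on a correctly behaving kernel would be a quota or space error, not "ok" and not "file too large".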
Here is a kernel for you to test: https://drive.google.com/open?id=15pLG3FAE52h6njRCfzgkZnG3tLen4ct0
@s3rj1k ^^
@brauner the kernel fixes the "File too large" error but breaks quota enforcement: as seen in the screenshot, I can create a file larger than the available quota.
With the same kernel but without shiftfs, quota works fine.
@stgraber, any idea about the quota stuff for the underlay?
I can reproduce this on non-shiftfs as well though.
@brauner did you run tune2fs -O project,quota -Q prjquota /dev/sda1 on the fs before mounting it with prjquota?
Quota for ext4 works only when enabled by tune2fs and mounted with the proper mount option.
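The mount-option half of that requirement is easy to check from a script, in the same spirit as the mount | grep prjquota check from the original report. A small sketch that works on /proc/mounts-style text; the helper names mount_options and has_prjquota are hypothetical, and it does not verify the tune2fs feature flags:

```python
import sys


def mount_options(mounts_text, mount_point):
    """Return the option list for `mount_point`, or None if it is not mounted."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry shadows earlier ones
    return options


def has_prjquota(mounts_text, mount_point):
    """True if `mount_point` is mounted with the prjquota option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "prjquota" in options


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "/"
    with open("/proc/mounts") as handle:
        print(target, "prjquota:", has_prjquota(handle.read(), target))
```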
No, I don't think so. @stgraber knows the quota stuff better so I'll let him comment first.
Might be as simple as:
commit 69a28dcd8c22b3afb71ba867837f508e14424910 (HEAD -> shiftfs_fallocate, origin/shiftfs_fallocate)
Author: Christian Brauner <christian.brauner@ubuntu.com>
Date: Wed Oct 23 00:16:17 2019 +0200
shiftfs: drop CAP_SYS_RESOURCE to avoid overriding disk quotas
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
diff --git a/fs/shiftfs.c b/fs/shiftfs.c
index 49b7777dde22..81a73b87c395 100644
--- a/fs/shiftfs.c
+++ b/fs/shiftfs.c
@@ -2046,6 +2046,8 @@ static int shiftfs_fill_super(struct super_block *sb, void *raw_data,
goto out_put_path;
}
sb->s_maxbytes = MAX_LFS_FILESIZE;
+ /* Don't override disk quota limits or use reserved space. */
+ cap_lower(sbinfo->creator_cred->cap_effective, CAP_SYS_RESOURCE);
inode = new_inode(sb);
if (!inode) {
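For context on why dropping that capability matters: a task holding CAP_SYS_RESOURCE is allowed to override disk quota limits, so quota checks are skipped for it. The patch above lowers the bit in the shiftfs creator credentials inside the kernel. The sketch below only illustrates the same bit from userspace by decoding the CapEff line of /proc/self/status; it does not inspect the shiftfs credentials, and the helper names are made up:

```python
CAP_SYS_RESOURCE = 24  # bit index as defined in <linux/capability.h>


def effective_caps(status_text):
    """Return the effective capability mask from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap_sys_resource(status_text):
    """True if the CAP_SYS_RESOURCE bit is set in the effective set."""
    return bool((effective_caps(status_text) >> CAP_SYS_RESOURCE) & 1)


if __name__ == "__main__":
    with open("/proc/self/status") as handle:
        print("CAP_SYS_RESOURCE effective:", has_cap_sys_resource(handle.read()))
```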
@brauner I can check later today (in morning by CEST) in my setup if you prepare updated kernel.
@s3rj1k sure, will link to a new kernel here.
@brauner yeah, enabling project quotas needs some effort, the tooling for it is rather awful :) LXD dir containers on a filesystem that's had the tune2fs property set and is mounted with the prjquota option will normally do the right thing then.
My Ubuntu VM doesn't support it and I got pretty mad when I realized that adding prjquota as a mount option in /etc/fstab forced me to reboot into rescue mode and remount my rootfs rw, so I could edit my fstab...
@brauner Confirming that latest kernel works correctly with quotas and shiftfs. Yay))
When is this fix expected to be publicly available? (I'd like to double-check it on a public kernel.)
Excellent.
In a couple of weeks. I'll give you a link to the launchpad bug here.
@s3rj1k here are the launchpad bugs to track:
Once a kernel is proposed, it'll be mentioned in these bugs. You can subscribe to them to get notified when that happens.
@brauner Thanks, hoping to see this fix soon )
@stgraber Small unrelated question: how can I programmatically match a container with its project quota ID, to get usage statistics?
The ID is 10000 + the ID of the container, as you'd find it in lxd sql global "SELECT id, name FROM instances;"
Note that the used space is also reported through lxc info NAME.
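That mapping is simple enough to script. A minimal sketch, assuming the (id, name) rows from the query above have already been collected into a list; the function name project_ids is hypothetical:

```python
PROJECT_ID_OFFSET = 10000  # offset quoted in the reply above


def project_ids(instances):
    """Map container name -> project quota ID from (id, name) rows."""
    return {
        name: PROJECT_ID_OFFSET + int(instance_id)
        for instance_id, name in instances
    }


if __name__ == "__main__":
    # Example rows in the shape returned by "SELECT id, name FROM instances;"
    rows = [(1, "test"), (7, "web")]
    for name, project_id in sorted(project_ids(rows).items()):
        print(f"{name}: project {project_id}")
```

The resulting IDs can then be matched against the project column in the repquota -avugPs output from the original report.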
@stgraber Thanks, lxc info NAME sadly does not report correct usage.
Sometimes the "Disk usage:" paragraph disappears from the lxc info NAME output.
Other times it takes a long time for what lxc info NAME reports to catch up with the actual disk usage.
@stgraber Should I open a separate issue for the lxc info NAME bug?
Yeah, that'd be good to track down, I would have expected data to be returned just fine for dir backend.
Ubuntu 19.10
lxc info:
mount | grep prjquota:
tune2fs -l /dev/md1 | grep features:
repquota -avugPs:
lxc exec test -- sh -c "df -h":
Project quota on the dir storage backend, backed by an ext4 filesystem on a soft-RAID (mdadm) device, works incorrectly when this configuration is used with a non-snap build of LXD.
To reproduce:
Same issue can be reproduced in Ubuntu 18.04.3 with kernel 5.0.x in similar configuration
Any ideas how to fix this?