Open josefbacik opened 3 years ago
I have some prototypes to improve the heuristics, with tracking of the compression quality. The goal is to never need to use compress-force for mount and just compress + heuristics that will set the nocompress flag after more evaluation that happens now (and is know to be weak).
I'm kind of amused about the label 'good first issue', it takes a lot of thinking and evaluation to do it right, and there are always more ways to do it.
Yeah 'good first issue' may have been overstating it, but mechanically it's very simple, just need to get consensus on what the behavior is we expect, which is the harder part. But if we break it down to behaviors it should be straightforward, we want
This opens up a few questions that we need answered
Once we get those questions answered then the code is relatively simple. My opinion is
chattr
only sets COMPRESS or clears COMPRESS, it never sets NOCOMPRESS.
- chattr only sets COMPRESS or clears COMPRESS, it never sets NOCOMPRESS.
chattr (e2fsprogs) only got support for FS_NOCOMP_FL
six months ago (better late than never, it was added to ext2 in 1998). You want to remove that?
I'd prefer leaving the chattr bits as they are, not because they're an amazing model of user experience, but because they've had whatever meaning they have for decades now, changing them will break someone, and they aren't onerous to implement as they are. chattr supported the 'm' bit only recently, but people could have rolled their own chattr and used the bit on btrfs since 2011.
- An in memory only flag for "this file sucks at compressing" and stop using NOCOMPRESS for this.
I am in favor of that, though. The user-writable chattr bits are the bailiwick of the admin, and code shouldn't be flipping them unless explicitly asked to.
We have an xattr btrfs.compression
which takes a string argument, and is internally converted to bits. Can it be set to none
to mean "ignore mount option, don't compress this"? Deleting the attribute would set it to default, "do what the mount option says."
There's a lot of flexibility in the xattr, we can also specify compression level, whether to bypass the heuristics or force compression, or (insert future feature here). There are only two usable chattr bits, so they're not very expressive, and we've painted ourselves into a corner on what both of them mean.
Hello, the label of this issue is good first issue, so I want to apply a patch to it. The following are my thoughts.
chattr only sets COMPRESS or clears COMPRESS, it never sets NOCOMPRESS.
The fact is that chattr -c only clears the BTRFS_INODE_COMPRESS flag, it will not set BTRFS_INODE_NOCOMPRESS, chattr +m will set BTRFS_INODE_NOCOMPRESS as the inode flag, and the +m and +c flags cannot be set in the same inode at the same time.
The current situation is that if the file system is mounted with the compression-force option, it will compress the file anyway (unless the compressed size is larger than the original, but every time we write to the file, it will check whether it is It can be compressed. If there is any possibility that the file can be compressed, it will compress the data), but if you use the option compress instead of compress-force to mount the file system, chattr +m is valid and it will disable the compression of the file. I think this is reasonable, because usually the administrator will do the mounting work, other users use the file system, if the administrator explicitly uses the compression force option to mount the file system, all users should abide by the rules, but if the administrator mounts Only the compression option, the user can choose not to compress.
At present, if we use chattr +c set file to set compress, it will also add a compress xattr, but if we set chattr +m on the file, it will clear the compress attribute, it does not have xattr to indicate not to compress,
I think it is correct to distinguish between the BTRFS_INODE_NOCOMPRESS and "this file sucks at compressing", so we need to introduce another sign to indicate "this file sucks at compressing"
Now btrfs_inode→prop_compress only records which compression algorithm is used, not the compression level. I think we should also record the compression level, for example: prop_compress record algorithm in higher bit, and compression level lower bit.
The current heuristic uses the Shannon entropy of the original file to determine whether the file should be compressed. It any possible the further heuristics to tracking of the compression file Shannon entropy, I just wonder how to tracking of the compression quality.
The following is patch summary:
We add a btrfs property that explictly sets NOCOMPRESS on the file.
An in memory only flag for "this file sucks at compressing" and stop using NOCOMPRESS for this.
This sounds reasonable to me. I imagine we'll discuss it more on list, but make sure to lay the mechanics out in your cover letter for the patches so reviewers know what the plan is. I look forward to seeing the patches.
Hi, I have pushed a patch about this issue, [Related Patches] https://www.spinics.net/lists/linux-btrfs/msg120300.html
It contains 5 parts and a patch summary description ([0/5]), but I forgot to set each part of the patch to in-reply-to="[0/5]" when sending the email, I should Clear this error? Resend. Or just keep it as it is.
We have a way to disable compression per files in btrfs, but there's a weird interaction with -o compress-force. You can set no compression with
chattr
per file and per directory, but we specifically ignore this setting if you are mounted with -o compress-force. This makes sense in general, because we not only use this flag forchattr
, but also by the automatic heuristic code to mark an inode as not being a good candidate for compression.This muddies the water a bit, because for the case of
chattr
we likely really want to disable compression in all cases, even in -o compress-force, but we don't want to do it for things marked with BTRFS_INODE_NOCOMPRESS because it could have been done with heuristics and thus not be completely correct.We need to pick either
My preference is for 1, which would mean you need to
chattr -c
turns off compression if it's currently set, iechattr +c file; chattr -c file;
simply goes to the default, whereaschattr -c file;
gets BTRFS_INODE_NOCOMPRESS set. This may not actually be what we want to happen, and if that's the case we'll still need a btrfs property to flag BTRFS_INODE_NOCOMPRESS. This is the biggest part of this whole thing, you're going to need to solicit feedback from everybody on wether or not this is the behavior we want. The easiest way to do this would be to write a patch to change the behavior and see who complains.