Open · pascal-meunier opened this issue 6 years ago
@lacunapremise adding a module option sounds entirely reasonable to me. If you're comfortable doing so, please go ahead and open a PR against master with the change. If you need an example, take a look at how this is done for spa_slop_shift. You'll also need to add a description in man/man5/zfs-module-parameters.5 for the new option.
There is also a case for reducing the shift: when the physical block size is close to the volblocksize/recordsize, or when stripe widths are large in draid. The valid range for the shift is 1 to 15, though I expect the higher shifts to be unusual.
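To put that range in perspective: the threshold is d_len = s_len - (s_len >> shift), so a shift of s requires savings of 1/2^s of the block (an illustration of the existing formula with the shift varied, not separate code paths):

    d_len = s_len - (s_len >> 1);   /* shift 1:  50% savings required     */
    d_len = s_len - (s_len >> 3);   /* shift 3:  12.5%, the current value */
    d_len = s_len - (s_len >> 4);   /* shift 4:  6.25%                    */
    d_len = s_len - (s_len >> 15);  /* shift 15: ~0.003%                  */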
@lacunapremise I'll be happy to help and review
I think it would be more versatile if this were a pool-level or even dataset-level property, like being able to
zfs set compression=<algo>-always
@GregorKopka can you explain what you mean by "-always"?
The problem with compressors is that they don't always make the data smaller. So the current 12.5% threshold (1 in 8, or 512 bytes of a 4kB block) has the effect of discarding the compressed version and storing the data uncompressed whenever it doesn't compress by at least that much.
My comment was aimed purely at:
I wish there was a module parameter to ... disable the test and always compress.
and @behlendorf who said:
adding a module option sounds entirely reasonable to me.
I think it would be better if that could be controlled on a per-dataset basis (instead of a system-wide module option affecting all pools). As an idea of how it could be done: a suffix to the compression property value, similar to how one can set the level with gzip[-<level>], to disable the minimum efficiency check.
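For example (hypothetical syntax; an "-always" suffix is not an existing property value):

    zfs set compression=lz4-always tank/backups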
Ideally we shouldn't have to overload the existing property, add a new property, or add a module option to optimize this case.
As @richardelling pointed out above, the critical thing is that it's only worth keeping the compressed version when it's at least a full 2^ashift bytes smaller, which is what the shift is trying to approximate. Updating this function to be aware of the ashift restrictions would allow it to always do the right thing on a per-block basis without the need for any manual tuning.
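A minimal sketch of what such an ashift-aware check could look like (worth_compressing is a hypothetical helper written for illustration, not existing zio_compress.c code):

    /*
     * Keep the compressed copy only if it occupies at least one
     * fewer 2^ashift-sized sector than the uncompressed data.
     */
    static boolean_t
    worth_compressing(uint64_t s_len, uint64_t c_len, uint64_t ashift)
    {
            uint64_t sector = 1ULL << ashift;
            /* Round each length up to whole sectors. */
            uint64_t s_sectors = (s_len + sector - 1) >> ashift;
            uint64_t c_sectors = (c_len + sector - 1) >> ashift;

            return (c_sectors < s_sectors);
    }

This also covers the opposite case mentioned above: with ashift=12, a 4k block can never free a sector, so compression that passes the fixed 12.5% test today would be skipped as pointless in terms of allocated space.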
This was less of an issue when the maximum block size was limited to 128k. But with 16M blocks even a 1% savings (roughly 160k, i.e. dozens of 4k sectors) would be worth keeping.
On backup servers, compression is much more important than in the general ZFS use case. It would benefit backup servers to be able to lower the minimum compression criterion used to decide whether data is stored compressed from 12.5% (1/8) to 6.25% (1/16) or even less, or to always compress.
In zfs/zio_compress.c, the threshold is computed with a right shift of 3: d_len = s_len - (s_len >> 3); I'd like to have the option to shift by 4 instead: d_len = s_len - (s_len >> 4); which translates to a 6.25% compression test instead of 12.5%.
I wish there was a module parameter to change the shift amount, or to disable the test and always compress. At the moment I just change the 3 to a 4 and recompile, but doing that prevents backup servers from benefiting from packages.
(edited to correct math and preconception that lz4 was always used to test)
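To make the request concrete, a tunable version could look roughly like the following (a sketch only: zio_compress_shift is a hypothetical parameter name, and the declaration follows the generic Linux module_param pattern rather than actual ZFS code):

    #include <linux/module.h>

    /* Hypothetical tunable: 3 (the default) requires 12.5% savings, 4 requires 6.25%. */
    static unsigned int zio_compress_shift = 3;
    module_param(zio_compress_shift, uint, 0644);
    MODULE_PARM_DESC(zio_compress_shift,
            "Minimum compression savings as a fraction 1/2^shift of the block");

    /* The fixed shift in zio_compress.c would then become: */
    d_len = s_len - (s_len >> zio_compress_shift);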