
enable KSM for ARC cache #14279


devZer0 commented 1 year ago

Quoting @Sachiru:

For non-deduplicated datasets or filesystems, ARC cache retains full blocks in memory even if they are duplicates of something else.

KSM (Kernel Same Page Merging, http://en.wikipedia.org/wiki/Kernel_SamePage_Merging_(KSM)) is supposed to optimize memory usage, especially for memory-heavy applications. Although it is true that blocks have variable sizes, they are still allocated as 4k pages in memory (IIRC), which can then be examined and deduplicated.

Quoting @behlendorf:

This situation here will be considerably better in the 0.7.0 release. ARC buffers are now compressed in memory and the ARC is better about not keeping multiple copies of the same buffer.

I checked this, and it does not seem to apply to zfs-2.1.6.

I created 10 identical 100 MB files (test#.dat) with contents from /dev/urandom, dropped the caches with "echo 3 > /proc/sys/vm/drop_caches", and read the files back with "cat test*.dat >/dev/null" (sketched below).

Before reading, the ARC was below 200 MB; after reading, it was at 1.2 GB.
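A minimal sketch of that procedure, assuming the dataset is mounted at /tank/test and the Linux arcstats kstat is available (the paths are assumptions):

```sh
# Create one 100 MB random file, then nine identical copies of it.
dd if=/dev/urandom of=/tank/test/test0.dat bs=1M count=100
for i in $(seq 1 9); do cp /tank/test/test0.dat /tank/test/test$i.dat; done

# Note the ARC size, drop caches, read everything back, then compare.
awk '/^size/ {print $3}' /proc/spl/kstat/zfs/arcstats
echo 3 > /proc/sys/vm/drop_caches
cat /tank/test/test*.dat > /dev/null
awk '/^size/ {print $3}' /proc/spl/kstat/zfs/arcstats  # grows by ~1 GB, not ~100 MB
```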

That means the ARC is not able to detect that the files' contents are the same.

So I'm hereby reopening https://github.com/openzfs/zfs/issues/2772.

If the ARC has no internal deduplication, it should at least benefit from the kernel's standard memory-deduplication feature.

RAM is a precious resource, and most systems have plenty of unused CPU.

ryao commented 1 year ago

KSM is meant for anonymous process pages, i.e. memory that is not backed by files. As far as I know, the KSM code as designed cannot be applied to either the page cache or the ARC, so it is not something that can be enabled.

The deduplication that @behlendorf mentioned was for cases where ZFS already knows that the buffers are the same, such as when files occupy the same blocks on disk across snapshots and you are reading them through those snapshots. It does not apply to independently written identical files unless dedup=on was set when they were written (illustrated below).
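As a hedged illustration of that last point (the pool and dataset names, tank/dup, are assumptions):

```sh
# Identical blocks are shared only when dedup was enabled on the dataset
# before the data was written.
zfs set dedup=on tank/dup
dd if=/dev/urandom of=/tank/dup/a.dat bs=1M count=100
cp /tank/dup/a.dat /tank/dup/b.dat   # the copy is written through the DDT
zpool list tank                      # the DEDUP column rises above 1.00x
zdb -DD tank                         # dumps dedup table (DDT) statistics
```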

Offhand, the way I would expect KSM to work is that it periodically hashes anonymous pages and stores those hashes in a data structure. Upon getting a hit, it marks the page CoW in both places, verifies that the two are identical, and then has one point to the other while increasing the page's reference count. Implementing the same idea in the ARC would be non-trivial. I would expect getting it right to require significant effort spanning at least a year. :/
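For context, this is the interface KSM exposes today; it only scans anonymous regions that a process has opted in via madvise(addr, len, MADV_MERGEABLE), so nothing here can reach the ARC. The knob values below are illustrative:

```sh
echo 1 > /sys/kernel/mm/ksm/run               # start the ksmd scanner thread
echo 1000 > /sys/kernel/mm/ksm/pages_to_scan  # pages examined per wake-up
echo 200 > /sys/kernel/mm/ksm/sleep_millisecs # delay between scan batches
cat /sys/kernel/mm/ksm/pages_sharing          # how many pages are deduplicated
```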

Haravikk commented 1 year ago

I posted an issue suggesting "lightweight" deduplication, and I wonder if it would cover this case?

Basically in issue #13572 my proposed/preferred solution is to allow deduplication to be enabled only for the contents of the ARC (and L2ARC), rather than for the entire contents of a dataset, to massively reduce the RAM impact of dedup.

The intention of that issue is to enable dedup for file copying, since this usually involves reading records into ARC (and thus the "lightweight" dedup table) shortly before writing out the new copies. Since they'd be in the dedup table, they would instead be written out as cloned blocks (reflinks) rather than full copies.

If that issue were implemented first, then the same basic mechanism could be used to dedup the contents of the ARC, because there would already be a dedup table to use. If two identical records are loaded, they'd generate the same hash for the dedup table, and so one of them can be eliminated from the ARC (or even retroactively eliminated on disk).
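For a sense of what the cloned-block write in #13572 would look like in practice, here is a hedged sketch using the block cloning (reflink) support that landed in OpenZFS 2.2; the file names and pool setup are assumptions:

```sh
# With block cloning, the copy shares the source's blocks instead of
# rewriting them, so it completes quickly and consumes no new data blocks.
cp --reflink=always /tank/test/src.dat /tank/test/dst.dat
zpool get bcloneratio,bclonesaved tank   # block cloning statistics (OpenZFS 2.2+)
```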