nefelim4ag / systemd-swap

Script for creating hybrid swap space from zram swaps, swap files and swap partitions.
GNU General Public License v3.0

dm-cache for zram #71

Open Enelar opened 5 years ago

Enelar commented 5 years ago

I've been a huge fan of in-memory swap for several years and have tried different setups with different configs. IMO zram works better than anything else. However, it's limited, and when you use it together with disk swap it has a huge issue: once zram is full, it becomes a dead body sitting in your physical memory, and you hit heavy lag from disk IOPS.

That's why I had this idea of pushing the LRU pages to disk once zram is full. I know there is zswap, but it uses write-through (which is even worse: under load, your system is starved of file cache and free memory while hitting both RAM and the drive; anyway, I didn't experience any improvements on desktop systems).

Instead of write-through, it should be write-back in the background, and dm-cache seems to be a good candidate for it. If I do implement it, is there any chance it might get merged?

Summary:

Vision: There should be several levels of swap (a rough sketch of the idea follows the list):

- zram lz4
- zram deflate
- disk drive
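For illustration only, something like the following: a hedged sketch with made-up sizes and priorities, assuming a kernel that provides both the lz4 and deflate compressors for zram; device names are placeholders.

```sh
# Rough sketch of the tiered idea; sizes, priorities and /dev/sdXN are placeholders.
modprobe zram num_devices=2

echo lz4     > /sys/block/zram0/comp_algorithm   # fast, lighter compression
echo 2G      > /sys/block/zram0/disksize
echo deflate > /sys/block/zram1/comp_algorithm   # slower, denser compression
echo 4G      > /sys/block/zram1/disksize

mkswap /dev/zram0 && swapon -p 100 /dev/zram0    # filled first
mkswap /dev/zram1 && swapon -p 50  /dev/zram1    # filled when zram0 is full
swapon -p 10 /dev/sdXN                           # on-disk swap as the last level
```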

SSalekin commented 5 years ago

@Enelar Hi, I like your idea. However, I'm interested in this part:

There should be several levels of swap:

- zram lz4
- zram deflate
- disk drive

Can you please elaborate on why? I'm interested in its possible benefits.

nefelim4ag commented 5 years ago

@Enelar, zswap is write-back; it uses write-through only for incompressible data.


Yep, that can be merged.

polarathene commented 3 years ago

Just so you know, zram does have a write-back feature as well:

With CONFIG_ZRAM_WRITEBACK, zram can write idle/incompressible pages to backing storage rather than keeping them in memory. To use the feature, the admin should set up the backing device before setting disksize, via `echo /dev/sda5 > /sys/block/zramX/backing_dev`. It supports only a partition at this moment.

The feature appears to have arrived around Dec 2018. Besides the fact that your kernel must have the config option enabled and that only a partition can serve as backing storage, the writeback doesn't seem to be handled automatically. You have to mark all pages as idle and then periodically invoke the writeback yourself, since any page accessed in the meantime loses its idle marking; a rough sketch of that workflow is below.
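Something along these lines, assuming a kernel with CONFIG_ZRAM_WRITEBACK (and CONFIG_ZRAM_MEMORY_TRACKING for the idle marking); /dev/zram0, /dev/sdXN and the sizes are placeholders, not a tested recipe:

```sh
# Point zram0 at a backing partition *before* setting disksize.
echo /dev/sdXN > /sys/block/zram0/backing_dev
echo 8G        > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon -p 100 /dev/zram0

# Periodic job body (e.g. a systemd timer): write back pages that were marked
# idle on the previous run and not touched since, then mark the rest idle for
# the next run. Pages accessed in between lose the idle flag and stay in zram.
echo idle > /sys/block/zram0/writeback
echo all  > /sys/block/zram0/idle
```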

As for the dm-cache approach, someone describes doing this back in late 2013.

That article also seems to suggest the effectiveness of dm-cache will vary: since the data is written to the slower disk storage and only cached within zram once dm-cache decides it's cache-worthy, you may find the effectiveness is worse, especially as pages are updated/modified (pulled from swap, then invalidated when swapped back in due to modification, IIRC).

It excels as a read cache where frequently accessed files (hot-spots) can be promoted to the cache over a period of multiple accesses (slow fill cache).

It seems to have the same drawback as zram writeback, requiring a block device as the backing storage (technically the storage being cached), in addition to needing a second device for the cache metadata. You just don't have to manually manage/automate the LRU caching yourself; a hedged sketch of the manual wiring follows.
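Roughly like this, if anyone wants to experiment: an untested sketch assuming dm-cache (CONFIG_DM_CACHE) and the zram module are available, where /dev/sda5 stands in for the real on-disk swap partition and all sizes are arbitrary.

```sh
# Two zram block devices: one as the dm-cache data device, one for its metadata.
# (Cache metadata held in RAM is lost on reboot, which is acceptable for swap.)
modprobe zram num_devices=2
echo 4G  > /sys/block/zram0/disksize   # cache data
echo 64M > /sys/block/zram1/disksize   # cache metadata

ORIGIN=/dev/sda5                        # the slow on-disk swap partition
SECTORS=$(blockdev --getsz "$ORIGIN")

# dm-cache table:
#   <start> <length> cache <metadata dev> <cache dev> <origin dev> \
#   <block size (sectors)> <#feature args> <features> <policy> <#policy args>
dmsetup create swapcache --table \
  "0 $SECTORS cache /dev/zram1 /dev/zram0 $ORIGIN 512 1 writeback default 0"

mkswap /dev/mapper/swapcache
swapon /dev/mapper/swapcache
```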

polarathene commented 3 years ago

Additional notes.

With all that in mind, you could presumably also achieve the desired goal of avoiding LRU inversion in zram by adding zswap into the mix with a smaller memory cache (since zram will compress better). Once the zswap cache fills up, it'll decompress the zbud/z3fold pages and compress them again in zram, or, if that's full, send them to the disk-based backing swap.

LRU inversion is avoided in the sense that zswap keeps the frequently used pages cached, at the expense of extra copies and added compression/decompression overhead. With the dm-cache approach, by comparison, you'd also have duplicates in RAM, or writes hitting the disk before the dm-cache-on-zram caching kicks in, which is probably not desirable.

If you can have zram prioritized over zswap, it's probably the same result minus the drawback of using zram as a backing store for zswap? Granted, zram swap still never evicts its stale pages, so if you're using an excessive amount of disk swap, the dm-cache + zram route might work out better.

Or just use zswap on its own if the additional compression ratio potential isn't worthwhile. You may find you get more out of it because zswap sizes its cache by the compressed size in RAM, not by the uncompressed size that zram uses.
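For reference, sizing that zswap cache is just a few module parameters; a minimal sketch (the values are arbitrary examples, and the lz4/z3fold choices depend on what the kernel build provides):

```sh
# Enable zswap at runtime and cap its compressed pool as a percentage of RAM.
echo 1      > /sys/module/zswap/parameters/enabled
echo lz4    > /sys/module/zswap/parameters/compressor
echo z3fold > /sys/module/zswap/parameters/zpool
echo 20     > /sys/module/zswap/parameters/max_pool_percent

# Or persistently on the kernel command line:
#   zswap.enabled=1 zswap.compressor=lz4 zswap.zpool=z3fold zswap.max_pool_percent=20
```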