ThomasWaldmann opened this issue 9 years ago
so could these caches be turned into fixed-size LRU caches (sized relative to available RAM, for example)? in other words, are they really caches (which we can discard) or indexes (which we can't discard)?
So, the question now is: "what are the options to deal with larger amounts of data?".
Some ideas:
@anarcat they are caches in the sense that they cache information from the (possibly remote) repository. So you could kill them and they could be rebuilt from repo information (or from fs when creating the next archive).
LRU won't help: for the files cache, every entry is accessed only once per "attic create". For the chunks cache, there are sometimes multiple accesses, but not in a pattern where LRU would help.
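For illustration, here is a tiny standalone simulation (not attic code; the IDs, cache size and stored values are made up) of why an LRU cache gets a 0% hit rate when every key is looked up exactly once:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used key when full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = self.misses = 0

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)     # mark as recently used
            self.hits += 1
            return self.data[key]
        self.misses += 1
        return None

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry

# Files-cache-like access pattern: every file id is looked up exactly once.
cache = LRUCache(capacity=1000)
for file_id in range(100_000):
    if cache.get(file_id) is None:
        cache.put(file_id, "stat + chunk info")

print(cache.hits, cache.misses)   # 0 hits, 100000 misses
```

Every lookup misses, so an LRU policy would only add bookkeeping overhead here; the chunks cache does see some repeated accesses, but as noted above, not in a pattern LRU can exploit.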
ah right, so even if the caches were reused, it wouldn't gain much, because it's only for "within a filesystem" deduplication...
okay, so another strategy is needed, and you already seem to have a few ideas for that.. I guess the next step is benchmarks, as there is some fairly low-hanging fruit there (chunk size, for one..)
My 2 cents: the chunk size, and whether or not the cache should be kept in RAM, will depend on the particular circumstances attic is being applied to, since there are many use cases, variables and trade-offs to consider.
Therefore, my present assessment is that it makes sense to:
Regarding point 2, modern Linux kernels support per-cgroup resource limiting. So one way to get a seamless fallback from RAM to disk would be to put attic in a cgroup with whatever resource limits and swappiness suit the particular use case. However, this may be considered a bit of a hack and, of course, will not help Mac or Windows users.
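A minimal sketch of that idea, assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup, root privileges, and the memory controller enabled via the parent's cgroup.subtree_control; the group name, the 2 GiB cap and the attic command line are placeholders, not recommendations:

```python
import os
import subprocess

CGROUP = "/sys/fs/cgroup/attic-backup"   # hypothetical cgroup name
MEMORY_MAX = str(2 * 1024**3)            # illustrative hard cap: 2 GiB

# Creating a directory inside the cgroup filesystem creates the cgroup.
os.makedirs(CGROUP, exist_ok=True)

# Cap the group's memory; above this, the kernel reclaims/swaps or invokes the OOM killer.
with open(os.path.join(CGROUP, "memory.max"), "w") as f:
    f.write(MEMORY_MAX)

# Move this process into the cgroup; the attic child started below inherits it.
with open(os.path.join(CGROUP, "cgroup.procs"), "w") as f:
    f.write(str(os.getpid()))

subprocess.run(["attic", "create", "/path/to/repo::archive", "/data"], check=True)
```

On a cgroup v1 system (which is what most distributions shipped at the time of this thread) the corresponding knobs are memory.limit_in_bytes and memory.swappiness under the memory controller's mount point.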
@ThomasWaldmann as requested on #300, here is a bit more data from my setup: my media weighs in at 2.8 TB and currently comprises 6109 files. Attic's memory usage was usually ~11%, but towards the end it was mostly ~50%. Right before Attic died, the usage went up to ~70%. Let me know if you need more details.
@mathbr ~70% of 8 GiB is ~5.6 GiB. The formula computes 6.5 GiB (5.3 GiB if the repo is remote) of RAM usage for your backup data. As the formula does not cover all of attic's memory needs, just the repo index and the files/chunks cache, that seems to fit. If you had some other stuff running besides attic and your swap space wasn't very large, that may well have been all the memory you had.
Well, there were indeed a few apps running in parallel, most of the memory being claimed by Chromium and Plex Media Server; everything else is rather lightweight (I'm running Xfce as desktop).
My swap is at 2 GB, which is not much, but with 8 GB of RAM I actually shouldn't need it at all. ;-)
Has anyone tried again with that latest change yet? I'd like to know in advance how this fares before giving it another try. ;-) Just noticed that this change was from July 2014, never mind.
To accelerate operations, attic keeps some information in RAM:
In this section (and also the paragraph above it), there are some [not completely clear] numbers about memory usage: https://github.com/attic/merge/blob/merge/docs/internals.rst#indexes-memory-usage
So, if I understand correctly, this would be an estimate for the RAM usage (for a local repo):
E.g. backing up a total count of 1 Mi files with a total size of 1 TiB:
So, this will need 3 GiB of RAM just for attic. If you run attic on a NAS device (or another device with limited RAM), this might already be beyond the RAM you have available and will lead to paging (assuming you have enough swap space) and a slowdown. If you don't have enough RAM+swap, attic will run into "malloc failed" or get killed by the OOM Killer.
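To make that arithmetic reproducible, here is a rough back-of-the-envelope estimator; the per-entry byte counts below are assumptions for illustration (the authoritative numbers are in the internals.rst section linked above), so treat the output as an order-of-magnitude figure only:

```python
GiB = 1024 ** 3

def estimate_ram_gib(total_size_bytes, file_count,
                     avg_chunk_size=64 * 1024,    # assumed average chunk size (~64 KiB)
                     repo_index_per_chunk=40,     # assumed bytes per chunk in the repo index
                     chunks_cache_per_chunk=44,   # assumed bytes per chunk in the chunks cache
                     files_cache_per_file=240,    # assumed bytes per file in the files cache
                     files_cache_per_chunk=80,    # assumed bytes per referenced chunk in the files cache
                     local_repo=True):
    """Very rough estimate of attic's index/cache RAM usage; all constants are assumptions."""
    chunk_count = total_size_bytes / avg_chunk_size
    usage = chunk_count * (chunks_cache_per_chunk + files_cache_per_chunk)
    usage += file_count * files_cache_per_file
    if local_repo:
        # For a local repo, the repository index is also held in RAM on the same machine.
        usage += chunk_count * repo_index_per_chunk
    return usage / GiB

# The example above: 1 Mi files, 1 TiB of data -> roughly 3 GiB.
print(round(estimate_ram_gib(1024 ** 4, 2 ** 20), 1))   # ~2.8
```

With the same assumed constants, the 2.8 TB / 6109-file setup mentioned earlier also lands near the 6.5 GiB local-repo figure quoted above, but again, the real formula and constants live in the linked docs.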
For bigger servers, the problem will just appear a bit later: