borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.

https://www.borgbackup.org/


cache coherency issues leading to a slowdown #8503

Closed: ThomasWaldmann closed this 1 week ago

ThomasWaldmann commented 3 weeks ago

From https://github.com/borgbackup/borg/discussions/8451#discussioncomment-11066110 :

borg 2.0.0b12

There's an issue with the ChunkIndex cache if a connection breaks down:

borg2 recently started using a ChunkIndex cache, stored in repository/cache/chunks (with a checksum of it in chunks_hash).
If `borg create` does not complete, it does not update that cache.
In that case there is still a valid cache, but it represents the repository contents as of the end of the last successful backup and does not know about the chunks transmitted during the interrupted run.
Workaround: if one deletes that cache, borg rebuilds it by listing all objects in the repository (slow, but without much traffic), and it then represents all chunks currently present in the repo.
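To make the staleness concrete, here is a minimal toy model, not borg's actual code. It assumes the repository is a plain dict of chunk id to data and the cached index is a set of chunk ids; `rebuild_index` and all variable names are hypothetical.

```python
# Toy model only: a "repository" is a dict of chunk id -> data,
# and the cached chunk index is a set of chunk ids.

def rebuild_index(repo_objects):
    """Rebuild the index by listing every object in the repository (the slow path)."""
    return set(repo_objects)

repo = {"c1": b"a", "c2": b"b"}
cached_index = set(repo)  # cache as written at the end of the last complete backup

# An interrupted backup uploads c3 but never gets to update the cache.
repo["c3"] = b"c"

# Chunks that exist in the repo but are unknown to the stale cache.
stale_missing = set(repo) - cached_index

# The workaround: drop the cache and rebuild it from a full listing.
rebuilt_index = rebuild_index(repo)
```

In this model the stale cache is still valid for `c1` and `c2`; it just lacks `c3`, so a later run would treat `c3` as new and transmit it again.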

ThomasWaldmann commented 2 weeks ago

There can be similar issues if multiple `borg create` runs execute in parallel: the last one to update the cache wins, and knowledge about existing chunks (which were added by the other `borg create` runs) might be lost, leading to a future slowdown.

It never leads to corruption though, because only borg compact removes chunks (and it uses an exclusive lock while working).
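The parallel case can be sketched the same way. This is a simplified illustration under assumed semantics (each run loads the cache once and overwrites it when done); `run_create` is a hypothetical helper, not a borg function.

```python
def run_create(loaded_index, new_chunks):
    """Return the index a run would write back: what it loaded plus what it added."""
    return set(loaded_index) | set(new_chunks)

cache = {"c1"}

# Two runs start in parallel and both load the same cache.
loaded_by_a = set(cache)
loaded_by_b = set(cache)

index_a = run_create(loaded_by_a, {"a1"})
index_b = run_create(loaded_by_b, {"b1"})

cache = index_a  # run A finishes first and writes its index
cache = index_b  # run B finishes last and overwrites it

# Everything is still stored in the repository; only the knowledge is gone.
repo_chunks = {"c1", "a1", "b1"}
forgotten = repo_chunks - cache
```

No chunk is deleted here, which matches the point about corruption: the chunk `a1` is merely forgotten by the cache and may be uploaded again later.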

Without the cache, all of this would be much simpler. It's a pity that listing all repo objects takes so long that we need a cache.

ThomasWaldmann commented 2 weeks ago

Some ideas about how to solve this:

After loading the main chunks cache from the repository, chunks.* are merged into the in-memory ChunkIndex, written back to chunks, and then chunks.<SAME> are removed.
Before new chunks are created, existing chunks are marked with F_CLEAN in the in-memory ChunkIndex. After that, new index entries are dirty because they do not have the clean flag.
`borg create` should periodically write new/dirty chunk index entries to repository/cache/chunks.<RANDOM_OR_HASH> and afterwards mark them with F_CLEAN in memory.
`borg compact` builds a new chunk index from scratch, so it must remove all old cached chunk indexes and write an up-to-date chunks.
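A rough sketch of the clean/dirty bookkeeping described above, under stated assumptions: the flag value, the `ToyChunkIndex` class, and the dict used as a stand-in for repository/cache are all hypothetical and do not reflect borg's real ChunkIndex.

```python
F_CLEAN = 1  # hypothetical flag bit; the real value is not given in this thread


class ToyChunkIndex:
    """In-memory index mapping chunk id -> flags."""

    def __init__(self, entries=None):
        self.flags = dict(entries or {})

    def mark_all_clean(self):
        # Entries loaded from the cache are already persisted, so mark them clean.
        for chunk_id in self.flags:
            self.flags[chunk_id] |= F_CLEAN

    def add(self, chunk_id):
        # New entries start without F_CLEAN (dirty); known entries are untouched.
        self.flags.setdefault(chunk_id, 0)

    def dirty_ids(self):
        return sorted(cid for cid, f in self.flags.items() if not f & F_CLEAN)

    def flush_dirty(self, store, name):
        """Write dirty entries to store[name], then mark them clean in memory."""
        dirty = self.dirty_ids()
        if dirty:
            store[name] = list(dirty)
            for chunk_id in dirty:
                self.flags[chunk_id] |= F_CLEAN
        return dirty


index = ToyChunkIndex({"c1": 0, "c2": 0})
index.mark_all_clean()   # done once, right after loading the cache
index.add("c3")          # new chunk -> dirty
index.add("c1")          # already known -> stays clean

store = {}               # stand-in for repository/cache
written = index.flush_dirty(store, "chunks.example")
```

Only the delta (`c3`) is written, which keeps each periodic write small compared to rewriting the whole index.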

ThomasWaldmann commented 2 weeks ago

Even simpler than the previous post:

Give up the distinction between a main chunks cache and chunks.* caches; just always store chunk index data as chunks.*.
If .* means .HASH, we do not need the extra chunks_hash object anymore.
Just merge all cached chunk indexes together when building one from the cache.
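One way to picture this simplification, as a sketch only: each index fragment is stored under a name derived from a hash of its own content, so the name doubles as the checksum, and loading is a merge of all valid fragments. SHA-256, JSON serialization, and the function names are assumptions made for illustration; borg's real on-disk format differs.

```python
import hashlib
import json


def store_index(cache_dir, chunk_ids):
    """Store one index fragment as chunks.<HASH of its content>."""
    payload = json.dumps(sorted(chunk_ids)).encode()
    name = "chunks." + hashlib.sha256(payload).hexdigest()
    cache_dir[name] = payload
    return name


def load_merged_index(cache_dir):
    """Merge all chunks.* fragments whose content matches the hash in their name."""
    merged = set()
    for name, payload in cache_dir.items():
        if not name.startswith("chunks."):
            continue
        if name != "chunks." + hashlib.sha256(payload).hexdigest():
            continue  # name and content disagree: skip the damaged fragment
        merged.update(json.loads(payload))
    return merged


cache_dir = {}  # stand-in for repository/cache
store_index(cache_dir, {"c1", "c2"})   # e.g. written by one borg create
store_index(cache_dir, {"c2", "c3"})   # e.g. written by a parallel one
merged = load_merged_index(cache_dir)
```

Because fragments never overwrite each other, parallel writers cannot erase each other's knowledge; the merge is a plain set union.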

ThomasWaldmann commented 1 week ago

#8531 solves the mentioned issues when running multiple `borg create` in parallel.

ThomasWaldmann commented 1 week ago

#8541 saves new chunk index entries to repo/cache/chunks.* every 10 minutes, so progress won't be lost if the connection breaks down or borg is interrupted with Ctrl-C.

Note: this applies only to the chunk index, so borg will "know" which chunks are in the repo.

The files cache is currently only saved at the end, so that can still be a problem.
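The periodic save could look roughly like the following sketch. It is not taken from the actual change; `PeriodicSaver`, its parameters, and the injectable clock are hypothetical, and only the 10-minute interval comes from the comment above.

```python
import time


class PeriodicSaver:
    """Call save() at most once per interval; meant to be polled from the main loop."""

    def __init__(self, save, interval=600.0, clock=time.monotonic):
        self.save = save
        self.interval = interval  # 600 s = 10 minutes
        self.clock = clock        # injectable so the logic can be tested without waiting
        self.last = clock()

    def maybe_save(self):
        now = self.clock()
        if now - self.last >= self.interval:
            self.save()
            self.last = now
            return True
        return False


# Demonstration with a fake clock instead of real waiting.
fake_now = [0.0]
saves = []
saver = PeriodicSaver(
    save=lambda: saves.append(fake_now[0]),
    interval=600.0,
    clock=lambda: fake_now[0],
)

results = []
for t in (100.0, 600.0, 900.0, 1200.0):
    fake_now[0] = t
    results.append(saver.maybe_save())
```

With this shape, an interruption loses at most one interval's worth of new index entries rather than the whole run.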
