jborg / attic

Deduplicating backup program

attic crashed with "malloc failed" #353

Open wkoszek opened 9 years ago

wkoszek commented 9 years ago

I run attic on a Synology DS214play, which has 1GB of RAM. The build came from the https://attic-backup.org/downloads/releases/0.16/Attic-0.16-linux-i686.tar.gz archive. The missing library libacl.so.1 came from the ubuntu/trusty32 Vagrant image. My unit has 2x1TB disks in a mirrored configuration. My /volume1/homes/wkoszek holds 876G of data. I have a 1TB WD disk connected via USB (/volumeUSB2/usbshare). I started from an empty USB disk:

time attic create --stats backup-all.attic::20150926 /volume1/homes/wkoszek

Attic seems very slow. It took 140hr to get:

/dev/sdt1       932G  383G  549G  42% /volumeUSB2/usbshare

of data archived from my NAS to the USB disk, and I don't think attic succeeded.

After ~6 days, Attic failed with an error:

/volumeUSB2/usbshare # time attic create --stats backup-all.attic::20150926 /volume1/homes/wkoszek
hashindex: malloc failed
Traceback (most recent call last):
  File "/root/build/prefix/lib/python3.4/site-packages/cx_Freeze/initscripts/Console.py", line 27, in <module>
  File "prefix/bin/attic", line 3, in <module>
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 730, in main
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 720, in run
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 129, in do_create
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 178, in _process
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 178, in _process
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 178, in _process
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 164, in _process
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archive.py", line 416, in process_file
  File "/root/build/prefix/lib/python3.4/site-packages/attic/cache.py", line 230, in add_chunk
  File "attic/hashindex.pyx", line 162, in attic.hashindex.ChunkIndex.__setitem__ (attic/hashindex.c:3113)
Exception: hashindex_set failed
Command exited with non-zero status 1
real    140h 47m 28s
user    34h 51m 36s
sys     7h 8m 32s

I'd be interested in hearing whether people use Attic on archives of ~1TB size.

ThomasWaldmann commented 9 years ago

@wkoszek attic's codebase currently doesn't cope well with configurations that have little CPU/memory AND a big data set to back up.

The problem is that attic keeps the chunk and file indexes in memory, and these grow with the number of chunks (and files). As attic uses small chunks (approx. 64KB on average), there will be a lot of chunks if you have a lot of data.
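To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. It is not attic code: the 44 bytes per index entry (a 32-byte chunk ID plus three 32-bit counters) and the 0.75 hash table load factor are assumptions made for illustration, and the function name is hypothetical. It also assumes no chunk is duplicated, so it gives an upper bound for the given data size.

```python
def estimate_chunk_index(data_bytes, avg_chunk_bytes=64 * 1024,
                         bytes_per_entry=44, load_factor=0.75):
    """Estimate chunk count and in-memory index size for a backup.

    bytes_per_entry and load_factor are ASSUMED values for illustration,
    not figures measured from attic.
    """
    # Number of chunks if every chunk has the average size and none repeat.
    chunks = data_bytes // avg_chunk_bytes
    # A hash table is never completely full, so divide by the load factor.
    table_bytes = int(chunks * bytes_per_entry / load_factor)
    return chunks, table_bytes


if __name__ == "__main__":
    data = 876 * 1024**3  # the 876G from the report, read as GiB
    chunks, table_bytes = estimate_chunk_index(data)
    print("chunks:", chunks)
    print("index size (MiB):", table_bytes / 1024**2)
```

Under these assumptions, 876G of unique data works out to about 14.4 million chunks and an index of roughly 800 MiB, which is already close to the 1GB of RAM on the NAS before counting the files index or anything else.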

Adding swap space might solve the memory issue, but it will make the backup even slower, so it isn't a good solution. Adding memory (RAM) would really help.

In my repo, I implemented changes to make the chunker configurable, so it can create fewer, larger chunks. You can also use a different compression method, which might help with the speed.
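The effect of a larger target chunk size on the number of chunks can be sketched as follows. This is only arithmetic, not the chunker itself: the helper name and the idea of expressing the average chunk size as a power of two (`chunk_bits`) are assumptions for illustration, and real counts depend on the data and on deduplication.

```python
def chunk_count(data_bytes, chunk_bits):
    """Approximate chunk count for an average chunk size of 2**chunk_bits bytes.

    Hypothetical helper for illustration; assumes no duplicate chunks.
    """
    return data_bytes >> chunk_bits


if __name__ == "__main__":
    data = 876 * 1024**3  # the 876G from the report, read as GiB
    for bits in (16, 18, 20, 22):  # 64 KiB, 256 KiB, 1 MiB, 4 MiB
        print(f"avg chunk 2**{bits} bytes -> {chunk_count(data, bits):,} chunks")
```

Each extra bit halves the number of chunks, and with it the number of index entries that have to fit in memory. The trade-off is that larger chunks generally deduplicate less finely.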