Open tt12 opened 10 years ago
Hey, really nice program you have created. I have just one problem: I back up to an external hard disk but only get speeds of around 12 MB/s max. I use encryption for the backup, but I also tried a repo without encryption, and even tried backing up from one internal drive to another. I still get the same speed. My CPU usage is around 50%, so I don't think that's the problem either.
Appreciate any feedback.
Encryption is usually not the bottleneck; data compression is. How many CPU cores does your system have? If you have two cores, "50% CPU usage" might mean that Attic is using 100% of one core.
Future versions of Attic will hopefully be able to utilize more than one CPU core and also allow disabling data compression.
Btw, for comparison, what kind of throughput do you get on your system if you back up your data with zip instead of Attic?
Just ran a test with zip instead and got about 20-25 MB/s with 100% CPU usage (on one core out of two), whereas Attic gave about 12 MB/s with 50% CPU usage (on one core out of two).
It would be nice to have an option to use a different compression algorithm (or none), because most of my files are already compressed. Or maybe it would be possible to check whether a file compresses well and skip compression if it doesn't. That way we would have the best of both worlds.
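Something like this could work as the heuristic (just an illustrative sketch in Python, not anything Attic actually does; the 64 KiB probe and the threshold value are arbitrary choices): compress a small sample of the file at a fast level and only compress the rest if the sample shrinks enough.

import os
import zlib

def looks_compressible(sample: bytes, threshold: float = 0.9) -> bool:
    # Compress a small probe at a fast level; already-compressed data
    # (jpeg, mp4, zip, ...) barely shrinks, so skip compression if the
    # probe stays above `threshold` of its original size.
    if not sample:
        return False
    return len(zlib.compress(sample, 1)) < threshold * len(sample)

# Text shrinks a lot, random bytes (like already-compressed files) do not:
print(looks_compressible(b"hello world " * 1000))  # True
print(looks_compressible(os.urandom(64 * 1024)))   # False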
Ok, so the cpu is not the bottleneck...
Are you able to try the git version, which includes some performance improvements?
I'll give it a try but it won't be for a few days.
Thanks for the fast replies :)
I have just tested with the git version as of now (Attic 0.13_8_g21e03af-py3.2-linux-x86_64.egg) and it doesn't look like much has changed. Attic now writes at about 13 MB/s with 60% CPU usage.
Weird, I can't really understand what's stopping Attic from using 100% of the cpu when we know that the disk io is fast enough (at least when using zip).
Here's a small patch you could try which disables xattr and acl metadata collection. But that's a bit of a long shot: https://gist.github.com/jborg/e8f8d7b3205eee93f613
Btw, are you able to re-run your zip benchmark and this time make sure the page cache is empty by running the following command before you start:
sync; echo 3 > /proc/sys/vm/drop_caches
Okay, just tested with your patch applied and it doesn't look like it has made a difference. Write speed is still the same, and CPU usage is about the same as in the very first test. I have now also tried running the 'drop caches' command before running zip and got the same results as before (20-25 MB/s, 100% CPU usage, on one core).
I have tried a test on another computer with an SSD: write speed was about 20 MB/s with 70% CPU usage on one core. I also ran a test on the same computer backing up to a ramdisk and got 15-25 MB/s write and about 90% CPU usage on one core. So even with a ramdisk it doesn't max out the CPU the way zip does.
I see very similar results. Around 12MB/s through Attic, and zip is around 25-35MB/s. CPU usage on dual core is around 35%.
Not maxing out one CPU core can only mean the source or destination medium makes the CPU wait too much (I/O wait), e.g. due to necessary HDD seeks and rotational latency. HDDs are usually maxed out at ~100 IOPS. Try an SSD?
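To put a very rough number on that (the per-seek transfer size is just a guess): 100 seeks/s x ~128 KiB of useful data per seek is only about 12-13 MB/s, which is suspiciously close to the ~12 MB/s reported above. An SSD with thousands of IOPS would remove that ceiling.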
Just as a performance note:
My experimental merge-all branch code does 64MB/s with encryption (no compression) and 75MB/s without encryption (and also no compression).
On my laptop (i5-4200, SSD).
So there's no fundamental performance problem in attic; the parameters just need to be set right for the target environment. I could imagine it being faster, though - especially if the CPU is not maxed out and there is not much latency (due to SSD usage).
So, I have been playing around with attic to see how it deals with large repositories, but doing so I noticed terrible performance:
$ git rev-parse @
64e8ea72ac26f1c0fdbae8cf652b78e23564fbbc
$ attic init /mnt/attic/jborg
$ /usr/bin/time -v attic create --stats /mnt/attic/jborg::first $BIGFOLDERS
This system has a Core i7-2600K and 16 GB of RAM, and the disks I'm testing both to and from are two RAID 5 arrays in the same LVM volume group. The layout as it is right now means that data is read from one physical RAID and written to the other. The folders I'm trying to back up are almost 10 TB and contain around 150k files.
Looking at htop I can see that the attic process uses 50% CPU (of one core). Using iotop I can see that 40% of IO is used by attic, and dstat -tdD md126,md127 shows me reading 20 MB/s from one array and writing 8 MB/s to the other.
Running simple performance benchmarks I get:
$ dd if=/dev/zero of=testfile bs=1G count=30 oflag=direct
32212254720 bytes (32 GB) copied, 149.231 s, 216 MB/s
$ dd if=/a/bigfile of=/dev/null
150000000 bytes (150 MB) copied, 0.749974 s, 200 MB/s
Thus, neither my CPU nor my disks are fully utilized, yet the performance is very slow.
Doing the same benchmarks on attic/merge-all yields 20 MB/s both read and write and 20-25% CPU usage.
I'm more than happy to run more benchmarks if required.
Hope it helps.
I guess with a single-threaded / single-process attic, there is some I/O wait time when it really just waits for I/O to complete and does nothing else (this happens for read latency as well as for writes when doing an fsync).
Also, there is no dispatching of compute intensive stuff to different cores. And it also won't start some I/O while computing some hash, compressing something or encrypting it.
I am currently trying to make it multi-threaded to see how fast it gets.
Of course Python's GIL limits this a bit, but maybe not as much as one would think: I/O operations and computations in C extensions that release the GIL are not a problem, so we might be lucky...
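A very rough sketch of the idea (made-up names, nothing like the real attic code): one thread reads chunks while another hashes and compresses them, so the CPU work overlaps with the I/O wait. Blocking reads release the GIL, and hashlib/zlib release it for large buffers, so the two threads really do run in parallel.

import hashlib
import queue
import threading
import zlib

def reader(path, q, chunk_size=1024 * 1024):
    # Producer: blocking file reads release the GIL, so the worker below
    # can hash/compress the previous chunk while we wait on the disk.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            q.put(chunk)
    q.put(None)  # sentinel: no more data

def worker(q, results):
    # Consumer: hashlib and zlib release the GIL for large buffers.
    while True:
        chunk = q.get()
        if chunk is None:
            break
        results.append((hashlib.sha256(chunk).hexdigest(), zlib.compress(chunk)))

def process(path):
    q, results = queue.Queue(maxsize=4), []
    t = threading.Thread(target=worker, args=(q, results))
    t.start()
    reader(path, q)
    t.join()
    return results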
Using attic 0.16 I can achieve 16 MB/s, using 75% CPU.
Copying between the same source and destination with rsync, I can achieve 100 MB/s.
I would like to back up 50-100 TB of data, so speed is of course very important.
@MartinAyla you should probably check out https://github.com/attic/merge/issues/4 (the problem is more severe on ordinary attic). 50-100TB of data might require quite a bit of RAM.
Are people seeing IO-wait (wa) in top? That would explain the confusion about not maxing out cpu usage, because pure IO-wait is not treated as cpu usage.
When attic is cpu-limited, I have an improvement that should avoid IO-wait caused by writes. Or rather, by fsync(). It can save quite a few percentage points. I will submit it some time when I'm not about to go to bed :).
fsync() after each segment write is suboptimal! It means you stop (cpu) processing to wait for the physical disk write. And the default segment size is 5MB. (I noticed bup avoids this issue by writing pack files of 1GB by default :).
Reads should also cause some IO-wait. I think they could be prefetched, although since the backed-up files vary in size it won't be quite as nice to implement.
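For illustration only (a toy sketch, not the actual segment-writing code): with fsync_every=1 the loop below behaves like fsync-after-each-segment and stalls on every physical write; a larger value batches those stalls, which is roughly what bigger segments or 1GB pack files buy you.

import os

def write_segments(dirpath, segments, fsync_every=1):
    # fsync_every=1 mimics "fsync after every segment": the process blocks
    # on the physical write each time. Larger values batch the stalls.
    pending = []
    for i, data in enumerate(segments):
        fd = os.open(os.path.join(dirpath, "segment.%04d" % i),
                     os.O_WRONLY | os.O_CREAT, 0o600)
        os.write(fd, data)
        pending.append(fd)
        if len(pending) >= fsync_every:
            for fd in pending:
                os.fsync(fd)
                os.close(fd)
            pending = []
    for fd in pending:
        os.fsync(fd)
        os.close(fd)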
@sourcejedi A while ago I found the same things you did: fread causes iowait (obviously) and fsync also causes iowait. In a single-threaded application that time is basically lost; it just sits there waiting. That is why I have been working on multithreaded code for a while (it's not finished yet, though). That way, while still making sure stuff really gets written to disk asap, you can still use the time in another thread.
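Stripped down to the bare idea (a made-up sketch, not the real code): the main thread keeps chunking/compressing and just hands finished segment files to a helper thread, which does the blocking fsync() while the GIL is released.

import os
import queue
import threading

def syncer(q):
    # All the blocking fsync() calls happen here; the GIL is released
    # while we wait, so the main thread keeps doing CPU work meanwhile.
    while True:
        fd = q.get()
        if fd is None:
            break
        os.fsync(fd)
        os.close(fd)

sync_queue = queue.Queue()
t = threading.Thread(target=syncer, args=(sync_queue,))
t.start()

# In the main loop you would write a finished segment and hand it off:
# fd = os.open("segment.0001", os.O_WRONLY | os.O_CREAT, 0o600)
# os.write(fd, segment_data)
# sync_queue.put(fd)

sync_queue.put(None)  # shut the helper down when done
t.join()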