ayufan / pve-patches

Repository with public Proxmox patches

Performance #1

Open · pdirksen opened this issue 6 years ago

pdirksen commented 6 years ago

Can we do something about the performance? Currently we use a CIFS/SMB volume over a 1 Gbit/s interface. A full backup of a medium-sized VM takes about 2 minutes, whereas it takes about 30 minutes to finish using xdelta3.

INFO: status: 70% (15040643072/21474836480), sparse 25% (5543481344), duration 1478, read/write 6/5 MB/s

Since xdelta3 can only use a single thread, combined with medium compression, this is probably the bottleneck.

jadsolucions commented 6 years ago

Agree

sienar1 commented 6 years ago

Is xdelta3 not threaded at all? That's where I see the bottleneck as well. I have VMs that are 200+ GB and can do full backups very quickly, but a differential takes nearly all day because it's stuck in xdelta3, maxing out a single CPU core and not spreading the work out.

Kalimeiro commented 6 years ago

It's not a multithreading problem but a network problem; the real problem is vzdump. When your backup starts, it uses your CIFS/SMB share to read the old backup while doing a double write at the same time: the VM snapshot goes to a dat/tmp file (to compare with the old backup), and the delta vcdiff file is written as well...

sienar1 commented 6 years ago

I have to disagree. I can run the same differential backups of large VMs on local storage (storage that can handle over 1 GB/s of throughput) and the differential backup runs at about 4 MB/s; checking CPU usage, you can see xdelta3 running on only one core, for hours (18 hours for a 200 GB VM specifically). It appears that xdelta3 is very poorly threaded. I've also found multiple other support threads where xdelta3 was used in commercial products and they suffer from the same issue. If you have a server with many slower cores (such as my Xeon E5-2650L based server), these differential backups are near useless for any large source VMs, which is exactly where you need them to perform well.

Kalimeiro commented 5 years ago

I have to disagree. I can run the same differential backups of large VMs on local storage.

You say local storage, but the problem is when using CIFS/SMB storage: it reads the old backup from CIFS/SMB, writes the current snapshot to a dat/tmp file on the CIFS/SMB share, and then compares the old backup against the current snapshot to write the vcdiff file (a double write to CIFS/SMB plus a read from CIFS/SMB)... it's a design problem that comes from vzdump in addition to the ayufan patch (which is not responsive at all).

With local storage I agree, no problem...

gilbertoferreira commented 4 years ago

Hi, I have issues with this too... Before applying this patch, a backup took 7 hours... After the patch, it takes more than 18 hours! And the servers all have 16 cores... or more... I am using NFS as storage... Something is definitely very wrong.

jhusarek commented 4 years ago

Hi, I have issues with this too... Before applying this patch, a backup took 7 hours... After the patch, it takes more than 18 hours! And the servers all have 16 cores... or more... I am using NFS as storage... Something is definitely very wrong.

Hi, I have the same issue.

gilbertoferreira commented 4 years ago

Hi there! I have the same issue here too! The file is indeed smaller, but it slows down the whole vzdump process... What can we do to improve this?

alebeta90 commented 4 years ago

Hi all,

I'm having the same problem when a differential backup is running.

I'm using the patch for 5.4-5.

Normal backups run at normal speed, while differential backup speed drops drastically.

All backups go to the same NFS storage.

marcin-github commented 4 years ago

I did a few tests on 5 GB .vma files. If you add the -3 -B 2147483648 options to xdelta3 you should get noticeably smaller diff copies. The disadvantage is higher memory consumption. About speed: xdelta3 is single-threaded, and it uses gzip, which is also single-threaded. If you find a way to replace gzip with pigz, your backup should finish a little faster.
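
For anyone who wants to try this by hand, a minimal sketch with the options above (the file names are placeholders, not what the patch generates):

# Create a delta against the previous full backup.
#   -e              encode (produce a delta)
#   -s OLD          source file the delta is computed against
#   -3              compression effort level 3
#   -B 2147483648   2 GiB source window: more RAM, noticeably smaller diffs
xdelta3 -e -3 -B 2147483648 -s vzdump-old.vma vzdump-new.vma delta.vcdiff

# One blunt way to swap gzip for pigz is shadowing gzip on PATH; whether
# vzdump actually picks this up depends on how it invokes gzip, so treat
# this as an experiment, not a supported configuration:
ln -s "$(command -v pigz)" /usr/local/bin/gzip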

gilbertoferreira commented 4 years ago

Perhaps we should compile a newer xdelta3 version... The one on GitHub is a bit older, 3.0.6 I think... There's a new version here: https://github.com/jmacd/xdelta which I think is 3.1 or something.

And here too:

http://xdelta.org/
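
If someone wants to test a newer build, a rough sketch of compiling it from that repository (the autotools steps are an assumption on my part; check the repo's own build instructions):

git clone https://github.com/jmacd/xdelta.git
cd xdelta/xdelta3
autoreconf -i && ./configure && make
sudo make install
xdelta3 -h 2>&1 | head -n 1   # the usage banner should include the version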



KlugFR commented 4 years ago

xdelta3 3.0.11 exists but is not downloaded by the patch installer. See: #34

ScIT-Raphael commented 4 years ago

I have the same performance issue here; even a really small machine takes way too long for a differential backup. I'm already running xdelta3 3.0.11, and during the backup routine it only uses one CPU core. The backup target is CIFS storage (a Storage Box from hetzner.com); full backups take just 1 or 2 minutes and work properly.

I like the differential solution, but with performance this bad I can't let it run on my other Proxmox servers with larger VMs. Is there any advice on how to optimize it?

Backup log:

INFO: starting new backup job: vzdump 101 --all 0 --compress lzo --mailnotification failure --maxfiles 30 --mode snapshot --quiet 1 --mailto support@xxx.xx --fullbackup 30 --node pm103 --storage backup
INFO: doing differential backup against '/mnt/pve/backup/dump/vzdump-qemu-101-2019_12_20-08_45_02.vma.lzo'
INFO: Starting Backup of VM 101 (qemu)
INFO: Backup started at 2019-12-20 08:47:17
INFO: status = running
INFO: update VM 101: -lock backup
INFO: VM Name: fw03.xx.xx.xx
INFO: include disk 'scsi0' 'data:vm-101-disk-0' 60G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/backup/dump/vzdump-qemu-101-2019_12_20-08_45_02.vma.lzo--differential-2019_12_20-08_47_17.vcdiff'
INFO: started backup task '3033d581-542d-4f46-9ce6-d939874c7524'
INFO: status: 10% (6830620672/64424509440), sparse 10% (6820663296), duration 4, read/write 1707/2 MB/s
INFO: status: 11% (7492009984/64424509440), sparse 11% (7237709824), duration 38, read/write 19/7 MB/s
INFO: status: 14% (9460973568/64424509440), sparse 14% (9090572288), duration 57, read/write 103/6 MB/s
INFO: status: 20% (13451329536/64424509440), sparse 20% (13062885376), duration 61, read/write 997/4 MB/s
INFO: status: 23% (15379267584/64424509440), sparse 23% (14967386112), duration 65, read/write 481/5 MB/s
INFO: status: 24% (15470493696/64424509440), sparse 23% (14968123392), duration 77, read/write 7/7 MB/s
INFO: status: 25% (16109076480/64424509440), sparse 23% (15011840000), duration 168, read/write 7/6 MB/s
INFO: status: 26% (16755261440/64424509440), sparse 23% (15303970816), duration 208, read/write 16/8 MB/s
INFO: status: 27% (17412849664/64424509440), sparse 24% (15700697088), duration 236, read/write 23/9 MB/s
INFO: status: 28% (18059034624/64424509440), sparse 24% (16054276096), duration 264, read/write 23/10 MB/s
INFO: status: 29% (18686214144/64424509440), sparse 25% (16454565888), duration 282, read/write 34/12 MB/s
INFO: status: 30% (19336200192/64424509440), sparse 26% (16785715200), duration 307, read/write 25/12 MB/s
INFO: status: 31% (19982385152/64424509440), sparse 26% (17112711168), duration 385, read/write 8/4 MB/s
INFO: status: 32% (20620967936/64424509440), sparse 26% (17158856704), duration 566, read/write 3/3 MB/s
INFO: status: 33% (21270953984/64424509440), sparse 27% (17622482944), duration 612, read/write 14/4 MB/s
INFO: status: 37% (23984472064/64424509440), sparse 31% (20185636864), duration 650, read/write 71/3 MB/s
INFO: status: 51% (33088536576/64424509440), sparse 45% (29283241984), duration 653, read/write 3034/2 MB/s
INFO: status: 53% (34527707136/64424509440), sparse 47% (30698168320), duration 656, read/write 479/8 MB/s
INFO: status: 66% (43034148864/64424509440), sparse 60% (39193022464), duration 659, read/write 2835/3 MB/s
INFO: status: 75% (48851648512/64424509440), sparse 69% (44998873088), duration 662, read/write 1939/3 MB/s
INFO: status: 84% (54398222336/64424509440), sparse 78% (50537148416), duration 665, read/write 1848/2 MB/s
INFO: status: 94% (61080600576/64424509440), sparse 88% (57219432448), duration 668, read/write 2227/0 MB/s
INFO: status: 100% (64424509440/64424509440), sparse 94% (60563333120), duration 670, read/write 1671/0 MB/s
INFO: transferred 64424 MB in 670 seconds (96 MB/s)
INFO: archive file size: 1.64GB
INFO: Finished Backup of VM 101 (00:11:11)
INFO: Backup finished at 2019-12-20 08:58:28
INFO: Backup job finished successfully
TASK OK

KlugFR commented 4 years ago

My guess is that if you want multi-core, you need to use LZMA and a multi-core compiled version of LZMA.
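
For what it's worth, stock xz (an LZMA implementation) already supports multiple threads out of the box; this only demonstrates the tool itself, not an integration with vzdump or this patch (the file name is a placeholder):

# -T0 spawns one compression thread per available core
xz -T0 -6 -c vzdump-qemu-101.vma > vzdump-qemu-101.vma.xz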

ScIT-Raphael commented 4 years ago

Thanks for the answer @KlugFR, are there any docs on how this can be done?

vdeville commented 4 years ago

Hello, confirmed on a VM of 100 GB: 10 minutes for a full backup, a few hours for a differential. Thanks

ScIT-Raphael commented 4 years ago

Is there anything new on how to resolve the issue?

ayufan commented 4 years ago

@MyTheValentinus @ScIT-Raphael Maybe try it with the recently released zstd?

vdeville commented 4 years ago

Hello, I use zstd on this server. Maybe it's a problem with zstd? I recently dist-upgraded this Proxmox (15 days ago). Thanks

vdeville commented 4 years ago

Standard backup (zstd, snapshot mode): 110-130 MB/s. Differential backup to the same target: 6-10 MB/s.

JoeApo108 commented 4 years ago

Completely the same situation. In its current state, it's unusable. I'm using the latest patch with the latest pve-xdelta3 3.0.11.

ScIT-Raphael commented 4 years ago

Same here, latest version, still slow as hell :(.

ayufan commented 4 years ago

Hmm. Is xdelta3 single-threaded? Can you show the CPU usage of the individual processes?

vdeville commented 4 years ago

When I look at the CPU usage, no core is at 100%, so I'm not sure this is linked to xdelta being single-threaded. Before, the old version worked fine on a single core.

JoeApo108 commented 4 years ago

Testing with an edited /etc/vzdump.conf: added zstd: 0 (which utilizes half of all available cores).

After this, I can see 28 cores in use out of 56. Before it was just 1. The backup is still in progress at the moment. Will keep you informed.
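
For reference, the relevant /etc/vzdump.conf lines look roughly like this (the compress line reflects my setup; per the Proxmox docs, zstd: 0 means "use half of the available cores"):

# /etc/vzdump.conf
compress: zstd
# zstd thread count; 0 = half of the available cores
zstd: 0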

ScIT-Raphael commented 4 years ago

@JoeApo108 Do you have any feedback yet? Did the backup go through properly and quickly?

JoeApo108 commented 4 years ago

@JoeApo108 Do you have any feedback yet? Did the backup go through properly and quickly?

No, it didn't help at all... still slow. The full backup of 320 GB took 1h40m. The diff has been running for 6 hours and is still processing.

OK, it might have something to do with the source window size (the -B switch). When I tested it with a tiny LXC, the full backup ran at 29 MiB/s (1.4 GiB, archive size 410 MB) and the diff backup at 23 MiB/s (1.4 GiB, archive size 700 kB).

The huge LXC full backup ran at 52 MiB/s (320 GiB, archive size 160 GB), and the diff (???? still in progress).

JoeApo108 commented 4 years ago

Hmm. Is xdelta3 single-threaded? Can you show the CPU usage of the individual processes?

https://imgur.com/a/bgB88ix

vdeville commented 4 years ago

Hello, any news? Thanks

umm0n commented 4 years ago

We're having the same problem, using Proxmox v6.2-4 and the latest xdelta3.

A zstd full backup is quick; a differential against it takes hours and hours until it basically stalls. Any solutions?

ScIT-Raphael commented 4 years ago

I'd also love to have a solution here; for now we just had to remove it again and switch back to normal backups. It's not useful when a diff backup takes much longer than a full one...

ayufan commented 4 years ago

I will do some regression testing to see why :)

ayufan commented 4 years ago

I ran some tests of xdelta3, and, well, performance is kind of miserable. It really depends on the amount of changes. I'm trying different settings to check the impact on size and performance, to maybe find a balance.

vdeville commented 4 years ago

@ayufan OK, nice to hear that you see the same issue. We're waiting for the fix, thanks.

marcin-github commented 4 years ago

Proxmox Backup Server (https://pbs.proxmox.com/wiki/index.php/Main_Page) does dedup and compression. I think this solution from ayufan (thank you Kamil) will become deprecated.

Genzo4 commented 4 years ago

Proxmox Backup Server (https://pbs.proxmox.com/wiki/index.php/Main_Page) does dedup and compression. I think this solution from ayufan (thank you Kamil) will become deprecated.

+1

From the PBS "1.3 Main Features" section: "Incremental backups: Changes between backups are typically low. Reading and sending only the delta reduces the storage and network impact of backups."

ogghi commented 3 years ago

Would it make ayufan's solution deprecated though? Would I need another server running just for backups? I like the integrated solution. (I found this thread because I was looking into the performance issues.)

ayufan commented 3 years ago

Yes to both.

sienar1 commented 3 years ago

Would it make ayufan's solution deprecated though? Would I need another server running just for backups? I like the integrated solution. (I found this thread because I was looking into the performance issues.)

Ayufan's solution is not integrated though, it's a modification that had a serious performance problem. PBS can be run in a container (I have it running in a Debian container) OR in a VM OR on bare metal. Any of those can be run anywhere you want. PBS is more flexible and more performant.

ayufan commented 3 years ago

Yes, and additionally it is built differently, so it doesn't have the same performance deficiency. Xdelta does not work well and is not very performant. The best approach here is to work on blocks, comparing blocks of a fixed size; this gives very predictable performance. Or just use copy-on-write.
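
To illustrate the fixed-size-block idea, a toy sketch only (file names are placeholders, and this is not how PBS is implemented): hash both images in equal-sized blocks and keep only the blocks whose hashes differ, which costs one read plus one hash per block no matter how the data changed.

# Compare two disk images in fixed 4 MiB blocks and report changed blocks.
BS=$((4 * 1024 * 1024))
old=old.vma; new=new.vma
size=$(stat -c%s "$new")
blocks=$(( (size + BS - 1) / BS ))
i=0
while [ "$i" -lt "$blocks" ]; do
  # dd's skip= counts in units of bs, so this reads exactly block i
  h_old=$(dd if="$old" bs="$BS" skip="$i" count=1 2>/dev/null | sha256sum)
  h_new=$(dd if="$new" bs="$BS" skip="$i" count=1 2>/dev/null | sha256sum)
  [ "$h_old" != "$h_new" ] && echo "block $i changed"
  i=$((i + 1))
done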

ogghi commented 3 years ago

All right, will try PBS on OpenMediaVault then :+1: