vatesfr / xen-orchestra

The global orchestration solution to manage and backup XCP-ng and XenServer.
https://xen-orchestra.com

XenServer 7.1 upgraded to XCP-NG 7.5; XO 5.24.2 - BackupNG tasks fail with VDI_IO_ERROR(Device I/O errors) #3314

Closed nyash closed 6 years ago

nyash commented 6 years ago

Context

Expected behavior

Hello.

I recently upgraded two servers from XenServer 7.1 directly to XCP-ng 7.5 via the ISO image. Both servers simply use RAID10 local storage, and each is its own pool master. They are connected via 1 Gbit/s links with <3 ms latency.

Since I don't have much configuration to lose, I also installed Xen Orchestra from scratch and removed any continuous replication backup snapshots (XO_DELTA_EXPORT.*) created by the Legacy Backup system, to start from a clean slate.

After setting up a Continuous Replication task to back up VMs from one server's local storage to the other server's local storage, I noticed that all the tasks fail with VDI_IO_ERROR(Device I/O errors).

Here's an example of a simple backup task entry that fails:

https://i.imgur.com/wodcY7W.png

Here's the log of the backup:

https://i.imgur.com/DiG4aoX.png https://pastebin.com/azcUfzk2

TL;DR Legacy Backup Continuous replication worked, BackupNG doesn't.

Current behavior

Continuous Replication backups fail with VDI_IO_ERROR(Device I/O errors) regardless of VM count, VM size, concurrency level, etc.

olivierlambert commented 6 years ago

Can you tell us whether Delta backups complete successfully (i.e. VDI exports behind the scenes)? If so, can you confirm that the issue occurs during the restore operation (i.e. VDI import)?
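
To make the export/import distinction concrete: XAPI exposes HTTP handlers for raw VDI transfer, `/export_raw_vdi` and `/import_raw_vdi`. The following is a minimal sketch that only builds the two URLs, assuming those handlers accept the `session_id`, `vdi`, `format` and `base` query parameters. The host names and references are placeholders, and this is not meant to describe how XO itself is implemented.

```python
from urllib.parse import urlencode


def export_url(host, session_id, vdi_ref, base_ref=None):
    """Build the URL a client would GET to export a VDI in VHD format.

    Passing base_ref requests only the differences relative to an
    earlier snapshot (a delta export).
    """
    params = {"session_id": session_id, "vdi": vdi_ref, "format": "vhd"}
    if base_ref is not None:
        params["base"] = base_ref
    return f"https://{host}/export_raw_vdi?{urlencode(params)}"


def import_url(host, session_id, vdi_ref):
    """Build the URL a client would PUT VHD data to, to import into a VDI."""
    params = {"session_id": session_id, "vdi": vdi_ref, "format": "vhd"}
    return f"https://{host}/import_raw_vdi?{urlencode(params)}"
```

A Delta backup to a remote only exercises the export side. Continuous Replication and restore also exercise the import side on the target host, which is why testing the two separately narrows down where the error comes from.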

nyash commented 6 years ago

I have performed the following.

Created a new remote target in https://xo/#settings/remotes, where:

- Type: local
- Name: localtest
- Path: /backuptest

Then added a new BackupNG task, where:

- Name: backup
- Type: Delta backup
- Target remote: the remote previously created
- Schedule: some schedule to satisfy the wizard

Then ran the BackupNG task manually.

It completed successfully. (Of course, this created a backup file locally inside the Xen Orchestra VM. I'm not sure that's what I was supposed to do, as I don't have an NFS or SMB share, the other possible targets for delta backup, to test with. I can create one if necessary.)

As for the continuous replication, it seems that the export completes (reaches 100%), but the import fails halfway on the other machine. (Made a gif below):

https://i.imgur.com/7ALhEI6.gif

rizaemet commented 6 years ago

I have the same issue. The restore operation fails for many delta backups with VDI_IO_ERROR(Device I/O errors). I thought this issue was solved in XCP-ng 7.5.

olivierlambert commented 6 years ago

This is probably different. The issue we fixed in 7.5 was on VDI export. In @nyash's case it seems to be triggered on VDI import.

We'll continue to investigate.

rizaemet commented 6 years ago

I upgraded to xo-server 5.25.1 and xo-web 5.25.0, and installed XCP-ng 7.5 with its patches. I still cannot import (restore) delta backups that were exported successfully, so I cannot run "Continuous Replication" backups either, because they include a VDI import step. This may be helpful: both the delta backup import error and the Continuous Replication error appear at the end of the process. If you need a log or any other information, I can share it.

olivierlambert commented 6 years ago

We identified a possible issue in XCP-ng 7.5 import, @nraynaud is currently working on it :+1:

olivierlambert commented 6 years ago

This has been fixed since the last XCP-ng update (last week). Please run yum update or use XO to update all your hosts. No reboot or toolstack restart is required.
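
For those updating several hosts by hand, here is a minimal sketch that runs `yum update -y` on each host over SSH. The host names, the `root` user and the helper names are assumptions for illustration. It defaults to a dry run that only prints the commands.

```python
import subprocess


def update_command(host, user="root"):
    """Return the ssh command that runs `yum update -y` on one host."""
    return ["ssh", f"{user}@{host}", "yum", "update", "-y"]


def update_hosts(hosts, dry_run=True):
    """Run (or, by default, only print) the update command for each host."""
    commands = [update_command(h) for h in hosts]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the update fails
            subprocess.run(cmd, check=True)
    return commands


# Hypothetical host names; replace with your own.
update_hosts(["xcp1.example", "xcp2.example"])
```

Call `update_hosts(..., dry_run=False)` to actually execute the updates once the printed commands look right.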

nyash commented 6 years ago

Hello,

Thank you for the heads up.

I have upgraded my Xen servers with "yum update && yum upgrade && reboot" (the reboot just to be sure).

Then I installed the latest XO from sources (5.25.2 as of writing).

CR backups fail immediately with this error: https://i.imgur.com/khwQrbz.png log: https://pastebin.com/P3JGbA6m

olivierlambert commented 6 years ago

Please re-read the error message carefully, then go into the Settings/Servers view and double-check that "Read only" mode is not enabled for the pool in question.
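
If you have many servers registered, the same check can be scripted. This is a minimal sketch, assuming each registered server is available as a record with `host` and `readOnly` fields. Those field names and the sample data are assumptions for illustration, not a confirmed XO export format.

```python
def read_only_servers(servers):
    """Return the hosts of all servers registered in read-only mode."""
    return [s["host"] for s in servers if s.get("readOnly", False)]


# Hypothetical records; in practice these would come from your XO instance.
servers = [
    {"host": "xcp1.example", "readOnly": True},
    {"host": "xcp2.example", "readOnly": False},
]

for host in read_only_servers(servers):
    print(f"{host} is read-only: backups targeting it will fail")
```

Any host this reports needs "Read only" toggled off before it can be used as a backup or replication target.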

nyash commented 6 years ago

@olivierlambert Thank you. Indeed, the read-only mode had to be toggled off in the settings for each Xen server. Easy to miss. I think the error is a bit vague, though: from the error and its associated stack trace alone, it didn't occur to me what the cause could be or how to correct it.

Anyway thank you, everything seems to be fine now ;-)

olivierlambert commented 6 years ago

Remember NOT to add each host, only the pool master (if you have pools with more than one host, obviously). Enjoy XO!