lamw / ghettoVCB

ghettoVCB
MIT License
1.28k stars 364 forks source link

ESXi6.7 - More than one disk causes backup to fail #264

Open thraxarious opened 2 years ago

thraxarious commented 2 years ago

This is an odd one. I noticed it on ESXi 6.7, and fully patched it

Platform: 6.7.0 Update 2 (Build 13006603) VM: CentOS7 Datastore VMFS version: 6 VM HW version: 14 (fails when upgraded to 4 or downgraded to 13) VMDK#1 - 16GB Thick (but has been thin provisioned before and still fails) VMDK#2 - 16GB Thin (again tried with different provisioning all failing) target: NFS storage on network but also fails when done to local datastore GhettoVCB version - 2021_10_20_1

2022-06-01 20:52:44 -- info: Creating Snapshot "ghettoVCB-snapshot-2022-06-01" for Test1 2022-06-01 20:52:45 -- debug: Waiting for snapshot "ghettoVCB-snapshot-2022-06-01" to be created 2022-06-01 20:52:45 -- debug: Snapshot timeout set to: 900 seconds 2022-06-01 20:52:45 -- debug: findVMDK() - Searching for VMDK: "Test1_1.vmdk" to backup 2022-06-01 20:52:45 -- debug: /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1_1.vmdk" -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_20-52-42/Test1_1.vmdk" Option --adaptertype is deprecated and hence will be ignored Destination disk format: VMFS thin-provisioned Cloning disk '/vmfs/volumes/datastore1/Test1/Test1_1.vmdk'... Clone: 91% done. 2022-06-01 20:52:55 -- debug: findVMDK() - Searching for VMDK: "Test1.vmdk" to backup 2022-06-01 20:52:55 -- debug: /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1.vmdk" -a -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_20-52-42/Test1.vmdk" 2022-06-01 20:52:55 -- info: ERROR: error in backing up of "/vmfs/volumes/datastore1/Test1/Test1.vmdk" for Test1 2022-06-01 20:52:56 -- info: Removing snapshot from Test1 ... 2022-06-01 20:52:56 -- info: Backup Duration: 12 Seconds 2022-06-01 20:52:56 -- info: ERROR: Unable to backup Test1 due to error in VMDK backup! 2022-06-01 20:52:56 -- debug: VM_OK:0, VM_FAILED:0, VMDK_FAILED:1 2022-06-01 20:52:56 -- info: ###### Final status: ERROR: All VMs experienced at least a partial failure! ######

The strange thing is the backup proceeds fine with a new VM when just one disk, but add another and suddenly the volume that did backup before now fails. I've tested with 2 and 3 disks, they fail.

2022-06-01 21:02:33 -- debug: getVMDKs() - Test1_2.vmdk###16:Test1_1.vmdk###16:Test1.vmdk###16: 2022-06-01 21:02:33 -- info: Initiate backup for Test1 2022-06-01 21:02:34 -- info: Creating Snapshot "ghettoVCB-snapshot-2022-06-01" for Test1 2022-06-01 21:02:34 -- debug: Waiting for snapshot "ghettoVCB-snapshot-2022-06-01" to be created 2022-06-01 21:02:34 -- debug: Snapshot timeout set to: 900 seconds 2022-06-01 21:02:34 -- debug: findVMDK() - Searching for VMDK: "Test1_2.vmdk" to backup 2022-06-01 21:02:34 -- debug: /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1_2.vmdk" -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_21-02-32/Test1_2.vmdk" 2022-06-01 21:02:35 -- debug: findVMDK() - Searching for VMDK: "Test1_1.vmdk" to backup 2022-06-01 21:02:35 -- debug: /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1_1.vmdk" -a -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_21-02-32/Test1_1.vmdk" 2022-06-01 21:02:35 -- info: ERROR: error in backing up of "/vmfs/volumes/datastore1/Test1/Test1_1.vmdk" for Test1 2022-06-01 21:02:35 -- debug: findVMDK() - Searching for VMDK: "Test1.vmdk" to backup 2022-06-01 21:02:35 -- debug: /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1.vmdk" -a -a -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_21-02-32/Test1.vmdk" Option --adaptertype is deprecated and hence will be ignored Option --adaptertype is deprecated and hence will be ignored Destination disk format: VMFS thin-provisioned Cloning disk '/vmfs/volumes/datastore1/Test1/Test1.vmdk'... Clone: 84% done. 2022-06-01 21:02:58 -- info: Removing snapshot from Test1 ... 2022-06-01 21:02:58 -- debug: Removing /vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_21-00-06 2022-06-01 21:02:58 -- info: Backup Duration: 24 Seconds 2022-06-01 21:02:58 -- info: ERROR: Unable to backup Test1 due to error in VMDK backup! 2022-06-01 21:02:58 -- debug: VM_OK:0, VM_FAILED:0, VMDK_FAILED:1 2022-06-01 21:02:58 -- info: ###### Final status: ERROR: All VMs experienced at least a partial failure! ######

Was having some problems before, but when I did a manual vmkfstools backup of the individual volumes, things seemed to work just fine. This is very odd.

[root@localhost:/vmfs/volumes/a5d6ef71-975f3d86/friday/Test1] /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1.vmdk" -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1.vmdk" -v 5 Option --adaptertype is deprecated and hence will be ignored DISKLIB-VMFS : "/vmfs/volumes/61bd73fa-5036a36c-d583-001b21a2bb8e/Test1/Test1-flat.vmdk" : open successful (269) size = 17179869184, hd = 0. Type 3 DISKLIB-VMFS : "/vmfs/volumes/61bd73fa-5036a36c-d583-001b21a2bb8e/Test1/Test1-flat.vmdk" : closed. DISKLIB-VMFS : "/vmfs/volumes/61bd73fa-5036a36c-d583-001b21a2bb8e/Test1/Test1-flat.vmdk" : open successful (14) size = 17179869184, hd = 20972519. Type 3 DISKLIB-DSCPTR: Opened [0]: "Test1-flat.vmdk" (0xe) DISKLIB-LINK : Opened '/vmfs/volumes/61bd73fa-5036a36c-d583-001b21a2bb8e/Test1/Test1.vmdk' (0xe): vmfs, 33554432 sectors / 16 GB. DISKLIB-LIB : Opened "/vmfs/volumes/datastore1/Test1/Test1.vmdk" (flags 0xe, type vmfs). Destination disk format: VMFS thin-provisioned Cloning disk '/vmfs/volumes/datastore1/Test1/Test1.vmdk'... DISKLIB-LIB_CREATE : CREATE: "/vmfs/volumes/backups/friday//Test1/Test1.vmdk" -- vmfs capacity=0 (0 bytes) adapter=lsilogic info=cowGran=0 allocType=1 objType=file policy='' DISKLIB-LIB_CLONE : Failed to clone disk using Object Cloning DISKLIB-LIB_CREATE : CREATE: "/vmfs/volumes/backups/friday//Test1/Test1.vmdk" -- vmfs capacity=33554432 (16 GB) adapter=lsilogic info=cowGran=1 allocType=1 objType=file policy='' DISKLIB-LIB_CREATE : CREATE: Creating disk backed by 'file' Clone: 9% done.DISKLIB-DSCPTR: "/vmfs/volumes/backups/friday//Test1/Test1.vmdk" : creation successful. DISKLIB-VMFS : "/vmfs/volumes/a5d6ef71-975f3d86/friday/Test1/Test1-flat.vmdk" : open successful (1) size = 17179869184, hd = 0. Type 3 DISKLIB-VMFS : "/vmfs/volumes/a5d6ef71-975f3d86/friday/Test1/Test1-flat.vmdk" : closed. DISKLIB-VMFS : "/vmfs/volumes/a5d6ef71-975f3d86/friday/Test1/Test1-flat.vmdk" : open successful (520) size = 17179869184, hd = 30409707. Type 3 DISKLIB-DSCPTR: Opened [0]: "Test1-flat.vmdk" (0x208) DISKLIB-LINK : Opened '/vmfs/volumes/a5d6ef71-975f3d86/friday/Test1/Test1.vmdk' (0x208): vmfs, 33554432 sectors / 16 GB. DISKLIB-LIB : Opened "/vmfs/volumes/backups/friday//Test1/Test1.vmdk" (flags 0x208, type vmfs). DDB: "longContentID" = "58047802a63cc2fb1969a8848a59d893" (was "c56256fd12ea840f751a9248fffffffe") DISKLIB-LINK : DiskLinkNativeVmfsCloneExisting: Failed to create native clone. (24). DISKLIB-LIB_CLONE : Failed to create native clone on destination handle : The specified feature is not supported by this version (24). OBJLIB-FILEBE : FileBEIoctl: ioctl operation IOCTLCMD_VMFS_XCOPY_TRANSFER_SIZE(3077) failed on '/vmfs/volumes/a5d6ef71-975f3d86/friday/Test1/Test1-flat.vmdk' : Inappropriate ioctl for device (1638402) Clone: 100% done.DISKLIB-VMFS : "/vmfs/volumes/a5d6ef71-975f3d86/friday/Test1/Test1-flat.vmdk" : closed. DISKLIB-VMFS : "/vmfs/volumes/61bd73fa-5036a36c-d583-001b21a2bb8e/Test1/Test1-flat.vmdk" : closed. AIOMGR-S : stat o=7 r=12 w=0 i=522 br=196608 bw=0 OBJLIB-LIB: ObjLib cleanup done. WORKER: asyncOps=0 maxActiveOps=0 maxPending=0 maxCompleted=0

But, if you delete the other two disks, leaving only one:

2022-06-01 21:19:04 -- info: Initiate backup for Test1 2022-06-01 21:19:04 -- info: Creating Snapshot "ghettoVCB-snapshot-2022-06-01" for Test1 2022-06-01 21:19:05 -- debug: Waiting for snapshot "ghettoVCB-snapshot-2022-06-01" to be created 2022-06-01 21:19:05 -- debug: Snapshot timeout set to: 900 seconds 2022-06-01 21:19:05 -- debug: findVMDK() - Searching for VMDK: "Test1.vmdk" to backup 2022-06-01 21:19:05 -- debug: /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1.vmdk" -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_21-19-03/Test1.vmdk" Option --adaptertype is deprecated and hence will be ignored Destination disk format: VMFS thin-provisioned Cloning disk '/vmfs/volumes/datastore1/Test1/Test1.vmdk'... Clone: 84% done. 2022-06-01 21:19:28 -- info: Removing snapshot from Test1 ... 2022-06-01 21:19:28 -- debug: Removing /vmfs/volumes/backups/friday//Test1/Test1-2022-06-01_21-02-32 2022-06-01 21:19:28 -- info: Backup Duration: 24 Seconds 2022-06-01 21:19:28 -- info: Successfully completed backup for Test1! 2022-06-01 21:19:29 -- debug: Storage Information after backup: 2022-06-01 21:19:29 -- debug: SRC_DATASTORE: datastore1 2022-06-01 21:19:29 -- debug: SRC_DATASTORE_CAPACITY: 63488.0 1855.0 GB 2022-06-01 21:19:29 -- debug: SRC_DATASTORE_FREE: 459.3 GB 2022-06-01 21:19:29 -- debug: SRC_DATASTORE_BLOCKSIZE: 1 2022-06-01 21:19:29 -- debug: SRC_DATASTORE_MAX_FILE_SIZE: 256 GB 2022-06-01 21:19:29 -- debug: 2022-06-01 21:19:29 -- debug: DST_DATASTORE: backups 2022-06-01 21:19:29 -- debug: DST_DATASTORE_CAPACITY: 8110.9 16254.9 GB 2022-06-01 21:19:29 -- debug: DST_DATASTORE_FREE: 5975.4 GB 2022-06-01 21:19:29 -- debug: DST_DATASTORE_BLOCKSIZE: NA 2022-06-01 21:19:29 -- debug: DST_DATASTORE_MAX_FILE_SIZE: NA 2022-06-01 21:19:29 -- debug: 2022-06-01 21:19:29 -- debug: VM_OK:1, VM_FAILED:0, VMDK_FAILED:0 2022-06-01 21:19:29 -- info: ###### Final status: All VMs backed up OK! ######

Again, issue occurs when backup target is NFS or local datastore.

thraxarious commented 2 years ago

Verified same setup with VM works fine on 6.5U3, Something in 6.7 makes GhettoVCB fail miserably.

dpalic commented 2 years ago

same issue here I am on 6.7.0 update3

dpalic commented 2 years ago

trying to run with -v 10 see #203

thraxarious commented 2 years ago

Found the problem. in two disk systems on Esxi 6.7, the command passed is malformed.

/sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1.vmdk" -a -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-08-08_20-24-50/Test1.vmdk" /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1.vmdk" -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-08-08_20-16-39/Test1.vmdk"

an extra -a is added.

It gets even worse when you add a THIRD disk. /sbin/vmkfstools -i "/vmfs/volumes/datastore1/Test1/Test1.vmdk" -a -a -a buslogic -d thin "/vmfs/volumes/backups/friday//Test1/Test1-2022-08-08_20-28-40/Test1.vmdk"

The adapter type function is deprecated. why is this still being passed to esxi?

robina80 commented 1 year ago

is this still the case as i have 6.7u3 and some VMs have more than 3 VMDKs?

thraxarious commented 1 year ago

As outlined in Issue #266 the script functions once the ADAPTER_TYPE entry is removed from the config. Something in the script is not sensing the deprecated call and passing the -a while keeping the second part blank. This erroneously adds up each disk.

For the mean time, remove the ADAPTER_TYPE entry in your configuration and the issue goes away.