xcp-ng / xcp

Entry point for issues and wiki. Also contains some scripts and sources.
https://xcp-ng.org

ZFS label on install Disk interferes with XCP functions #591

Open GogoFC opened 1 year ago

GogoFC commented 1 year ago

In case someone else has similar problems.

TL;DR: run "zpool labelclear /dev/sda" before installing XCP so you don't have problems.
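For anyone who wants to script this step, here is a minimal Python sketch. It only wraps the zpool labelclear command mentioned above. The helper names (labelclear_commands, clear_labels) and the dry-run default are hypothetical, made up for illustration, and are not part of XCP-ng or ZFS. It assumes the zpool tool is available wherever you run it (for example a live system with ZFS installed) and that the old label may sit on a partition rather than on the whole disk, which is why it takes a list of devices.

```python
import subprocess


def labelclear_commands(devices, force=False):
    """Build one `zpool labelclear` argument list per device or partition."""
    base = ["zpool", "labelclear"]
    if force:
        # -f tells zpool to treat exported or foreign devices as inactive.
        base.append("-f")
    return [base + [dev] for dev in devices]


def clear_labels(devices, force=False, dry_run=True):
    """Print the commands (dry run) or execute them; returns the commands."""
    commands = labelclear_commands(devices, force)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # Destructive: removes the ZFS label from the given device.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    # Dry run only: shows what would be executed for the disk from this post.
    clear_labels(["/dev/sda"])
```

Run as-is, it only prints the command. Passing dry_run=False actually clears the labels, so double-check the device names first.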

I had a few issues caused by a ZFS label that was not cleared on a single-disk XCP-ng install. Before XCP, the disk had Ubuntu with root on ZFS installed.

How do I know that is what caused the issue? Because after I cleared the label the problems went away. The 3 reinstalls I did before clearing the label did not function properly.

I had found an issue here on GitHub that said to run wipefs and reinstall XCP, after which the local storage would be available for use. That report made me think the label was causing the other problems as well, and that is how I reached this conclusion.

Issues:

After the first 2 reinstalls there was no local storage to install VMs onto. CLI commands gave an error when I tried to create a storage using partition 3 on the disk. XOA could not do it either, although XOA could create an ISO storage.

When I ran wipefs on the disk I saw ZFS messages as the GPT was being destroyed. Then I reinstalled and the local storage was there. An Ubuntu VM installed just fine, but pfSense failed halfway through a BIOS install (this might not be related; I later switched over to UEFI, I don't remember exactly).

Other issues:

VMs were working but there were other problems. The yum repo wouldn't always work. I thought it was a network error, then it started working. When I rebuilt the cache, at first it said there was nothing to build it from, and then it worked.

Upon reboot XCP wouldn't reconnect. When I hit the disable and enable button, the error message said "pool is already connected". After I ran wipefs I was able to reconnect to the server, but I needed to disable and enable it. It was one of those things that works sometimes and sometimes doesn't. I even cleared everything in redis and started over; then I could connect to the server, but only until the next reboot.

So after I cleared the ZFS labels all of the issues went away. Now everything works fine. What's weird is this: who in the world would ever think that a repo doesn't work because of a ZFS label, or that the pool won't reconnect for the same reason?

After I had reinstalled it properly I went into the other XOA instance. The error message from the last install was still there, but after disabling and enabling it the pool connected fine, since the IP and password are the same as before.

So wipefs was enough to get my local storage to work, but it wasn't enough to fix the repo/network and pool reconnect issues.

stormi commented 1 year ago

Thanks. Shouldn't this be a comment on #390? Or is this a different issue?

GogoFC commented 1 year ago

Yes, this is also a comment on issue #390, but there are also two different issues here apart from local storage not working.

Whoever opened #390 didn't have those problems or didn't have enough time to notice them. This is a big issue, so it deserves its own "issue" and should not be buried in a sea of comments.

It would be nice if the live installer disk came with ZFS so that it could check whether the boot disk was part of a ZFS pool and clear the label, or just clear the label anyway prior to installation, because deleting the GPT is not enough. I don't know how hard that is or whether it's too much to ask. I'm not a systems developer, so I just left this comment as a separate issue so people can find it more easily. This isn't something you think of unless you already know it causes a problem. I didn't even think about what I had on that disk prior to installing XCP; I just noticed weirdness and didn't know why.
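A check like the one suggested here could look roughly like the following Python sketch. It assumes the usual ZFS on-disk layout: four 256 KiB labels per device or partition, two at the start and two at the end, each with an uberblock array in its second half whose entries begin with the magic number 0x00bab10c. The function names are hypothetical, and this is only a heuristic, not how the XCP-ng installer or the ZFS tools work. Since the labels live inside whatever device or partition ZFS was using, that layout may be part of why removing the partition table alone does not get rid of them.

```python
import os
import struct

LABEL_SIZE = 256 * 1024             # each ZFS vdev label is 256 KiB
UBERBLOCK_RING_OFFSET = 128 * 1024  # uberblock array starts halfway into a label
UBERBLOCK_SLOT = 1024               # smallest uberblock slot size
UBERBLOCK_MAGIC = 0x00BAB10C
# The magic is stored in the writer's native byte order, so accept both.
MAGICS = (struct.pack("<Q", UBERBLOCK_MAGIC), struct.pack(">Q", UBERBLOCK_MAGIC))


def label_offsets(size):
    """Return byte offsets of labels 0..3 for a device of the given size."""
    aligned = size - (size % LABEL_SIZE)
    if aligned < 4 * LABEL_SIZE:
        return []  # too small to hold four labels
    return [0, LABEL_SIZE, aligned - 2 * LABEL_SIZE, aligned - LABEL_SIZE]


def find_zfs_labels(path):
    """Return indexes of labels whose uberblock array contains the ZFS magic."""
    found = []
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)
        for index, offset in enumerate(label_offsets(size)):
            dev.seek(offset + UBERBLOCK_RING_OFFSET)
            ring = dev.read(LABEL_SIZE - UBERBLOCK_RING_OFFSET)
            # Check the first 8 bytes of every possible uberblock slot.
            if any(ring[i:i + 8] in MAGICS
                   for i in range(0, len(ring), UBERBLOCK_SLOT)):
                found.append(index)
    return found
```

Calling find_zfs_labels on a disk or partition (for example "/dev/sda3", as root) returns the indexes, 0 to 3, of the labels that still look populated. An empty list means the heuristic found none.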