
No local storage after fresh install #390

Status: Open · baz-snow-zz opened this issue 4 years ago

baz-snow-zz commented 4 years ago

I have reinstalled 4 or 5 times, thinking I might have done something wrong, and I even tried different drives, including two at once. Every time I log in with XCP-ng Center on Windows, there is no local storage listed. I have searched the internet and others have had similar issues, but nowhere have I found a fix. When I run fdisk -l I can see that the installer used the drive: there are 6 partitions on the Samsung 240 GB SSD, including a 191 GB Linux LVM partition. I tried both the LVM and the ext install, with the same result.

I am trying to install XCP-ng 8.1. Everything seems to install fine, but there is no local storage. I would appreciate any help you can give me.

nagilum99 commented 4 years ago

You could try to fix it manually; look into xe sr-introduce or xe sr-create. But the root cause of the storage not appearing would probably still be interesting.
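
For reference, a minimal sketch of what such a manual SR creation could look like once the host is installed (the device path, SR name, and host UUID are placeholders, and this assumes the target partition is otherwise unused, since sr-create formats it):

    # find the host UUID of this server
    xe host-list
    # create a local ext (thin-provisioned) SR on an existing partition;
    # the device path is an example and will be formatted by this command
    xe sr-create host-uuid=<host-uuid> type=ext content-type=user shared=false \
        name-label="Local storage" device-config:device=/dev/sda3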

By the way, there aren't any "drives" listed; they're called "storage", but you probably just mixed up the terms. Still, be as precise as possible with the terminology, as it can quickly become confusing what exactly is missing.

Since you mention LVM: IMHO it doesn't make sense to use LVM for local storage on SSDs. You could/should install with ext, i.e. thin provisioning.

Ext isn't suitable for shared storage, nor a good idea if you have several growing VMs on spinning disks, as it can end up heavily fragmented, but on SSDs it's only a plus: you save a lot of space when making snapshots etc.

Someone might still be interested in the log files, so if it's not urgent, you could wait a bit to see whether anyone wants them to track down the problem.

baz-snow-zz commented 4 years ago

I don't want to use LVM; as I said, I tried both install types and it makes no difference. I want thin provisioning. I was just stating what fdisk -l reported; I didn't mix anything up, there is no local storage.

Trying again, the only thing I noticed was that during boot it stops for a while and the screen shows: [2.093407] efi: efi_memmap is not enabled.

I even took the drives out, formatted them clean, and tried to install again: one drive, a different drive, two drives. I also watched about 5 different videos on the installation to see if I was doing something wrong.

Here is a screenshot of XCP-ng Center (image attached).

stormi commented 4 years ago

As @nagilum99 said, there is no "local drive" concept, so can you describe precisely how this "no local drive" issue of yours manifests itself?

baz-snow-zz commented 4 years ago

It manifests itself right after installation: once the install is complete and I load XCP-ng Center, as you can see in the pictures, there is no local storage.

I first watched this video on how to install it: https://www.youtube.com/watch?v=bG5enpij0e8

At this timestamp, https://youtu.be/bG5enpij0e8?t=685, when he connects to the server, his server list shows three items: DVD drives, Local storage, and Removable storage.

I'm not getting the same result: no matter how many times or in how many ways I install it, there is no Local storage.

I'm not sure how else to explain it. If you want logs, tell me which ones and I will post them; if you want me to run commands, sure, tell me which ones.

baz-snow-zz commented 4 years ago

Then I checked other videos, thinking maybe something had changed between versions.

stormi commented 4 years ago

Ok, so your issue is indeed local storage missing after installation. There may be interesting errors in /var/log/installer/install-log.
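
For example, a quick way to scan that log for storage-related problems (only a sketch; the exact messages vary between installs):

    # on the installed host, look for errors around partitioning and SR preparation
    grep -iE 'error|fail|storage' /var/log/installer/install-log | less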

baz-snow-zz commented 4 years ago

OK, maybe "local drive" was the wrong word or term; I mean Local Storage. I got frustrated trying, so I removed the install. I will have to reinstall and get that log.

HeMaN-NL commented 4 years ago

What options did you choose in the "Virtual Machine Storage" part of the setup (6:55 in the video)?

baz-snow-zz commented 4 years ago

I tried all the settings, but EXT was the one I was going for. Computers always seem to aggravate me and amaze me at the same time... I had ordered a 4-port 1 Gb network card for this server for pfSense, so I moved the video card to the second PCI slot and put the network card in the first slot. I plugged in the USB stick, booted up, installed, and to my surprise I now have a Local Storage! I have no idea why, maybe some conflict, but it's the only thing I changed since all the other attempts. I am curious to put things back and check the install logs to see what was causing this issue. If you want the data for the issue I can post it; otherwise I guess this issue can be considered closed.

nagilum99 commented 4 years ago

If you can reproduce it by changing the card order, it should be interesting for the devs to see what happens in the 'bad' case. Ideally, deliver the requested file in both cases, working and not working, so one can see the difference.

For good developers, an "I don't know why, but it's working now, this can be closed" is unsatisfying, as things that are solved for unknown reasons might recur for those same reasons.

kedare commented 3 years ago

Hello, I am facing the same issue on a fresh install of XCP-ng 8.2.0 where I selected Ext local storage in the installer. After restarting, I don't see any storage available.

I do see entries in the installation logs showing that the storage was supposed to be created, with no errors:

INFO     [2021-09-10 07:53:24] DISPATCH: Updated state: target-boot-mode -> uefi; backup-partnum -> 2; swap-partnum -> 6; boot-partnum -> 4; storage-partnum -> 3; primary-partnum -> 1; logs-partnum -> 5
INFO     [2021-09-10 08:11:36] TASK: Evaluating <function prepareStorageRepositories at 0x7f4a71771b90>[{'boot': '/tmp/root/boot', 'root': '/tmp/root', 'logs': '/tmp/root/var/log', 'esp': '/tmp/root/boot/efi'}, '/dev/sda', 3, ['/dev/sda'], 'ext']
INFO     [2021-09-10 08:11:36] Arranging for storage repositories to be created at first boot...
INFO     [2021-09-10 08:11:36] ran ['/sbin/udevadm', 'info', '-q', 'symlink', '-n', '/dev/sda3']; rc 0

But nothing after that:

[10:41 frnte1-xcp1 ~]# xe sr-list

uuid ( RO)                : cc4b80e0-0295-325d-37a5-32633818037f
          name-label ( RW): XCP-ng Tools
    name-description ( RW): XCP-ng Tools ISOs
                host ( RO): frnte1-xcp1
                type ( RO): iso
        content-type ( RO): iso

This is the partition table of the local disk. If I understand the logs correctly, the local storage should be on /dev/sda3 (the big LVM partition), so can I just do an sr-create with /dev/sda3 as the target to activate it?

#         Start          End    Size  Type            Name
 1     46139392     83888127     18G  Microsoft basic
 2      8390656     46139391     18G  Microsoft basic
 3     87033856    488397134  191.4G  Linux LVM
 4     83888128     84936703    512M  EFI System
 5         2048      8390655      4G  Microsoft basic
 6     84936704     87033855      1G  Linux swap

kedare commented 3 years ago

I suspect it was related to the disk being previously used for ZFS.

bitosaur commented 3 years ago

I have the same issue, and the drive was used for ZFS right before.

kedare commented 3 years ago

A fix I found was to run the wipefs command on all the disks (this deletes the filesystem metadata "magic strings", so the disks are seen as blank) and then reinstall XCP-ng; apparently XCP-ng is not clearing the disks properly.
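
A minimal sketch of that approach, run from a rescue shell or similar before reinstalling (the device name is an example; double-check it, as this is destructive):

    # preview which filesystem/RAID signatures wipefs would remove, without writing anything
    wipefs --no-act /dev/sda
    # remove all detected signatures from the disk (destructive)
    wipefs --all /dev/sda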

stormi commented 3 years ago

This is very likely a bug that is also present in Citrix Hypervisor, so it would be nice if someone could reproduce it there and then report it on https://bugs.xenserver.org

If you also want to help debug/fix it:

There's also an option that would allow you to modify the installer code "live" before starting the installation process, but it's not documented at the moment (basically, you start the installer in "shell" mode, modify the files, then start the installer).

mcarter960 commented 3 years ago

kedare's fix worked for me; then I reinstalled. I have included all the output below.

[12:28 xcp-ng-oufiaktx ~]# fdisk -l

Disk /dev/sda: 240.1 GB, 240057409536 bytes, 468862128 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sdb: 240.1 GB, 240065183744 bytes, 468877312 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sdc: 60.0 GB, 60022480896 bytes, 117231408 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt
Disk identifier: DE033F5F-A08F-4AFE-AFAD-2800CB1CB8C6

#         Start          End    Size  Type            Name
 1     46139392     83888127     18G  Microsoft basic
 2      8390656     46139391     18G  Microsoft basic
 3     83888128     84936703    512M  EFI System
 5         2048      8390655      4G  Microsoft basic
 6     84936704     87033855      1G  Linux swap

wipefs /dev/sda
offset               type
0x37e483f000         zfs_member   [filesystem]

wipefs /dev/sdb
offset               type
0x218                LVM2_member  [raid]   UUID: 4XPmNd-nRUF-6iV0-s80W-V4cd-rtXT-DF2xh9

wipefs --all --force /dev/sda
/dev/sda: 8 bytes were erased at offset 0x37e483f000 (zfs_member): 0c b1 ba 00 00 00 00 00
/dev/sda: 8 bytes were erased at offset 0x37e483e000 (zfs_member): 0c b1 ba 00 00 00 00 00
[... 27 more zfs_member signatures erased from /dev/sda at nearby offsets ...]

wipefs --all --force /dev/sdb
/dev/sdb: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31

GogoFC commented 1 year ago

Thanks, I had the same thing with a disk that used to be part of a ZFS pool. There was no local storage and it couldn't be added manually; wipefs fixed it.

Update:

After running wipefs on the device and reinstalling XCP-ng, the storage was there and worked fine, but other very weird things happened: the host/pool wouldn't reconnect, and XOA kept saying the pool was already connected. After I cleared Redis in XOA it connected, but it wouldn't auto-reconnect on reboot; I had to manually disable and re-enable the connection for it to reconnect, but at least this time it worked.

I have now run zpool labelclear /dev/device and found that wipefs didn't actually clear the zpool label. I can't say for sure that the leftover label interferes with anything, only that I saw some weird behaviour.

I'm now reinstalling again after a gdisk wipe and zpool labelclear.

It turns out labelclear was also necessary.
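
For reference, a minimal sketch of checking for and clearing leftover ZFS labels, assuming the ZFS userland tools are available in the environment you run this from (the device path is an example):

    # print any ZFS labels still present on the device
    zdb -l /dev/sda3
    # force-clear the ZFS label information from that device (destructive)
    zpool labelclear -f /dev/sda3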

ydirson commented 1 year ago

Something that could turn out to be useful for understanding what happens on such installs that previously used ZFS would be:

  1. detail what was the specific ZFS setup (how many ZFS pools, using which disks/partitions?)
  2. provide the output of blkid from dom0, so we can see if and how the system really still considers old ZFS pools to be available

ydirson commented 1 year ago

I just did a quick test: installed Ubuntu 22.04 with ZFS, then XCP-ng 8.2.1 over it. Clearly there are ZFS leftovers that should not be there:

[17:41 xcp-ng-eesiapsl ~]# blkid
/dev/nvme0n1: LABEL="rpool" UUID="16159335333546585524" UUID_SUB="11615244974555223893" TYPE="zfs_member" PTTYPE="gpt" 
/dev/nvme0n1p1: LABEL="root-bfqghe" UUID="8dfcea78-bd8b-48c7-a3c5-d5870db9d2e5" TYPE="ext3" PARTUUID="0ea844f1-653c-48c6-bef0-cf2abefe4b79" 
/dev/nvme0n1p3: LABEL="rpool" UUID="16159335333546585524" UUID_SUB="11615244974555223893" TYPE="zfs_member" PARTUUID="ab5417b2-0676-4baa-97ec-18e73a47103e" 
/dev/nvme0n1p4: SEC_TYPE="msdos" LABEL="BOOT-BFQGHE" UUID="AF02-C79B" TYPE="vfat" PARTUUID="842869da-7360-4230-bc4b-5fd73bfc6b01" 
/dev/nvme0n1p5: LABEL="logs-bfqghe" UUID="3257578c-a893-42d3-8e67-4d234dd81a74" TYPE="ext3" PARTUUID="a2875852-f4c1-442b-9753-cf36c6b5425d" 
/dev/nvme0n1p6: LABEL="swap-bfqghe" UUID="16ea0d7d-d95f-4a4c-a6e9-b7162d070def" TYPE="swap" PARTUUID="d1863ed7-0360-4db3-b2d2-e8ef87965c1d" 
/dev/nvme0n1p2: PARTUUID="2c5bd6c0-6f6c-40ab-99c4-3b2b909c6111" 

Then we can see that the first-boot setup of local storage did not proceed:

[17:42 xcp-ng-eesiapsl ~]# ls /var/lib/misc/
ran-control-domain-params-init  ran-create-guest-templates  ran-generate-iscsi-iqn  ran-network-init

... where the lack of ran-storage-init is telling.
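
On an affected host, a hedged sketch of how one might confirm this and retry the first-boot storage setup (assuming the flag files and storage-init.service behave as shown above, and after the stale signatures have been cleared):

    # the flag file is absent if the first-boot storage setup never completed
    ls /var/lib/misc/ran-storage-init
    # inspect why storage-init failed
    journalctl -u storage-init.service
    # after wiping stale signatures from the storage partition, retrying the
    # service may create the SR (assumption: the service is safe to re-run)
    systemctl start storage-init.service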

Some more details can be seen in daemon.log (here for an LVM SR):

Feb  9 17:33:55 xcp-ng-eesiapsl storage-init[2295]: Error code: SR_BACKEND_FAILURE_77
Feb  9 17:33:55 xcp-ng-eesiapsl storage-init[2295]: Error parameters: , Logical Volume group creation failed,
Feb  9 17:33:55 xcp-ng-eesiapsl storage-init[2295]: SR creation failed.
Feb  9 17:33:55 xcp-ng-eesiapsl systemd[1]: storage-init.service: main process exited, code=exited, status=1/FAILURE
Feb  9 17:33:55 xcp-ng-eesiapsl systemd[1]: Failed to start Initialize host storage during first boot.
Feb  9 17:33:55 xcp-ng-eesiapsl systemd[1]: Unit storage-init.service entered failed state.
Feb  9 17:33:55 xcp-ng-eesiapsl systemd[1]: storage-init.service failed.

All we have here is "Logical Volume group creation failed"; for more details we must dig into SMlog, where we find:

Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563] FAILED in util.pread: (rc 5) stdout: '', stderr: '  Volume group "VG_XenStorage-4aec7463-c105-fccd-0276-5679dfd9db60" not found

That is, the first error logged is about the lack of a VG, whereas pvs does not even report a PV (and how this VG name is even crafted under these conditions was a bit obscure to me at first). Possibly the LVM SR driver could do better at logging and error handling? Since the EXT backend also relies on LVM volumes, it will have hit a similar problem. Digging further...

Edit: the explanation is indeed a few lines below:

Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563] ['/sbin/vgcreate', '--metadatasize', '10M', 'VG_XenStorage-4aec7463-c105-fccd-0276-5679dfd9db60', '/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_nvme0-part3']
Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563] FAILED in util.pread: (rc 5) stdout: '', stderr: 'WARNING: zfs_member signature detected on /dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_nvme0-part3 at offset 62812319744. Wipe it? [y/n]: [n]
Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563]   Aborted wiping of zfs_member.
Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563]   1 existing signature left on the device.
Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563] '
Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563] lock: released /var/lock/sm/.nil/lvm
Feb  9 17:33:55 xcp-ng-eesiapsl SM: [2563] Raising exception [77, Logical Volume group creation failed]

The LVM tools indeed try to protect against unwanted clobbering of ZFS members, and abort. This confirms the diagnosis made by @baz-snow-zz.

It is likely that a wipefs --all on all old partitions before repartitioning would help a lot here. Looking into that.
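
A rough sketch of what such a pre-wipe could look like (this is not the installer's actual code; the disk path is an example, and NVMe devices use a "p" separator before the partition number):

    # hypothetical pre-wipe before repartitioning: clear stale signatures from
    # every existing partition of the target disk, then from the disk itself
    disk=/dev/nvme0n1
    for part in "${disk}"p[0-9]*; do
        [ -b "$part" ] && wipefs --all --force "$part"
    done
    wipefs --all --force "$disk"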

GogoFC commented 1 year ago

I was about to reinstall Ubuntu with root on ZFS and then XCP-ng again, but then I saw you had already done that. wipefs --all did help a lot and got me a local storage, but other problems persisted until I cleared the ZFS label (the problems included XCP-ng not reconnecting automatically after reboot, or at all; it kept saying the pool was already connected).