ethanmye-rs opened 1 year ago
At the moment, MicroCloud won't pick up any partitioned disks. That will definitely change in the near-ish future.
We're close to having support for partitions on local (zfs) storage, but it seems ceph might take a bit longer: https://github.com/canonical/microceph/issues/251
For ZFS, we'll be able to add partition support once https://github.com/canonical/lxd/pull/12537 is merged in LXD.
@masnax WRT https://github.com/canonical/lxd/pull/12537, why do we need to ascertain whether the partition is mounted? Isn't MicroCloud only showing empty partitions anyway?
Because I couldn't add partitions as local storage during the microcloud init command, I chose "no" when asked about adding local storage. MicroCloud completed initialization and it all looks great.
Is there a command I can execute to manually create the local storage pool and add the partitions from the cluster nodes, at least until this new feature is ready?
Sure, to create a local zfs storage pool like MicroCloud would, you can do the following:
Once on each system:
lxc storage create local zfs source=${disk_path} --target ${cluster_member_name}
And finally, from any system:
lxc storage create local zfs
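For example, on a hypothetical three-member cluster (the member names and the disk path below are placeholders, not values from this thread), the full sequence would look something like:
lxc storage create local zfs source=/dev/disk/by-id/example-disk-part3 --target member1
lxc storage create local zfs source=/dev/disk/by-id/example-disk-part3 --target member2
lxc storage create local zfs source=/dev/disk/by-id/example-disk-part3 --target member3
lxc storage create local zfs
The three targeted commands each register a pending pool entry for one cluster member; the final untargeted command then creates the pool across the whole cluster.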
Thanks for that, extremely helpful!
I noticed per the doco that there are default volumes (backups, images) tied to the target systems.
Are those required, or should I just skip them?
@masnax WRT canonical/lxd#12537, why do we need to ascertain whether the partition is mounted? Isn't MicroCloud only showing empty partitions anyway?
There's no way MicroCloud can know if the partitions are empty without LXD's super-privileges. So no, it will list every single partition on the system. The list is ripped straight from lxd info --resources.
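If you want to see what that list contains on a given machine, the same data is also exposed through the LXD API; a quick sketch (the jq filter is only illustrative and assumes the usual /1.0/resources layout):
lxc query /1.0/resources | jq '.storage.disks[] | {id, partitions}'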
@masnax I commented over at https://github.com/canonical/lxd/pull/12537#pullrequestreview-1752518459
MicroCeph support for partitions is being tracked here https://github.com/canonical/microceph/issues/251
Sure, to create a local zfs storage pool like MicroCloud would, you can do the following:
Once on each system:
lxc storage create local zfs source=${disk_path} --target ${cluster_member_name}
And finally, from any system:
lxc storage create local zfs
When I follow these instructions to the letter, or even when I add sudo, I always get the same error:
Error: Failed to run: zpool create -m none -O compression=on local /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3: exit status 1 (invalid vdev specification use '-f' to override the following errors: /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3 is part of active pool 'local')
This is on a Turing Pi 2 cluster board with 4 Turing RK1 nodes (Rockchip RK3588-based compute modules with 32GB of eMMC storage). The nodes were freshly imaged, and the 3rd partition was newly created on all of them using parted before turning on the nodes, to prevent the second partition, which is the root partition, from growing to the full size of the eMMC storage. What am I missing?
It looks like there already is a storage pool called local which is using the disk /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3.
You can run zpool list on your system to verify this.
@rmbleeker Have you skipped local storage pool setup during microcloud init?
It looks like there already is a storage pool called local which is using the disk /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3. You can run zpool list on your system to verify this.
I realize that's what it looks like, but it's not the case. zpool list came up empty (no pools available). In fact it still does, since the storage pool is still pending and hasn't been created yet.
@rmbleeker Have you skipped local storage pool setup during microcloud init?
Yes I have.
Alright, it seems to work when I pick a different approach and slightly alter the commands. I got the idea from the Web UI, which states that when creating a ZFS storage pool, the name of an existing ZFS pool is a valid source. So I created a pool with
sudo zpool create -f -m none -O compression=on local /dev/disk/by-id/mmc-BJTD4R_0x5edae852-part3
on each node, filling in the proper disk ID on each node. I then used
sudo lxc storage create local zfs source=local --target=${nodename}
to create the local storage, filling in the name of each node in the cluster as the target. Then finally
sudo lxc storage create local zfs
properly initialized the storage pool, giving it the CREATED state instead of PENDING or ERRORED. It cost me an extra step, which isn't a big deal, but it's still a workaround and not a solution in my view.
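To summarize the workaround as one sketch (the disk ID and node name are placeholders; the first two commands are run on every node with that node's own disk ID, the last one once from any node):
sudo zpool create -f -m none -O compression=on local /dev/disk/by-id/<disk-id>
sudo lxc storage create local zfs source=local --target=<node-name>
sudo lxc storage create local zfs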
Out of curiosity, if you have another partition you're able to test on, I'd be very interested to see whether the storage pool can be created with a name other than local.
The setup that eventually worked for you seems to just ignore the existing-pool error with the -f flag. It's not yet clear whether this is an issue with existing zpool state or some race when creating the pool in LXD.
There are no other disks or partitions available on the nodes, but since I wasn't far into my project anyway, I decided to do some testing and flash the nodes again with a fresh image. I did this twice and set up the cluster again both times. After the first time, I used the lxc storage create commands to create a ZFS storage pool with the partition as its source, giving it the name local-zfs. This got me the same errors, leaving the storage pool in the ERRORED state. The second time, I used zpool create to first create a pool named local-zfs on the partition, and then used the lxc commands to use that pool as a source for the storage pool. This worked without using the -f flag to force overriding an existing pool, except on node 2, where it claimed a pool named local already existed on the partition.
With all that said and done, these tests weren't conclusive. The fact that the issue still occurred on node 2 after applying a fresh image leads me to believe that some remnants of the contents of a partition are left behind when you re-create the partition with exactly the same parameters, if the storage device isn't properly overwritten beforehand. But apparently that's not always the case, because I could create a new pool without forcing it on 3 of the 4 nodes.
In any case, I think that perhaps a --force flag should be implemented for the lxc storage create command, which is then passed along to the underlying command that is used to create a storage pool, just so you can resolve errors like the one I ran into.
In any case I think that perhaps a --force flag should be implemented for the lxc storage create command
You can already pass source.wipe=true when creating the storage pool to wipe the source before trying to create the pool.
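For reference, a sketch of what the per-member command from earlier in this thread would look like with that option added (the disk path and member name are placeholders, and source.wipe availability depends on the LXD version in use):
lxc storage create local zfs source=/dev/disk/by-id/<disk-id> source.wipe=true --target <member-name>
followed by the usual untargeted lxc storage create local zfs from any member to finalize the pool.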
In the microcloud init screen, the wizard seems to fail to pick up non-pristine disks. It offers to wipe the disk in the next screen, so I assume this is a bug. If I wipe a non-pristine disk with
sudo wipefs -a /dev/sdb && sudo dd if=/dev/zero of=/dev/sdb bs=4096 count=100 > /dev/null
then MicroCloud picks up the disk the next time the wizard is run.
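If the leftover metadata is specifically a stale ZFS label (as in the "is part of active pool 'local'" error earlier in this thread), clearing just the label may also be enough; a sketch, assuming /dev/sdb3 is the stale partition:
sudo zpool labelclear -f /dev/sdb3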