canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Allow setting `ceph.osd.pool_size` on pool creation #14006

Open masnax opened 2 months ago

masnax commented 2 months ago

At least with MicroCeph, OSD pools default to a replication size of 3. That means that if fewer than that many OSDs are available for replication, LXD will fail to create the pool when running `lxc storage create pool ceph`. The command blocks because the OSD pool keeps trying, and failing, to replicate during initialization.
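For illustration, a minimal reproduction sketch (the single-OSD MicroCeph setup here is hypothetical):

```
# Hypothetical: a MicroCeph cluster with a single OSD, while
# osd_pool_default_size is still 3, so the new OSD pool can never
# finish replicating.
microceph.ceph osd ls            # shows only one OSD
lxc storage create pool ceph     # blocks during pool initialization
```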

In such cases (admittedly usually testing setups), it would be useful to be able to pass `ceph.osd.pool_size` as an initial argument when creating the pool. This would also help in the LXD test suite, where we currently need to run `microceph.ceph config set global osd_pool_default_size 1` to change the default for all OSD pools and disable replication.
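As a sketch, assuming the proposed key is accepted as a create-time option (the first command is the proposal, not something LXD supports today), next to the current test-suite workaround:

```
# Proposed: set the replication size for this pool at creation time
lxc storage create pool ceph ceph.osd.pool_size=1

# Current workaround: change the default for all new OSD pools globally
microceph.ceph config set global osd_pool_default_size 1
```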

This is somewhat related to the ongoing MicroCloud OSD replication discussions, but MicroCloud would not leverage this capability in LXD because it also needs to be able to update the value when adding new cluster members, and it seems the ceph storage pool driver has no update implementation either.

tomponline commented 2 months ago

> This is somewhat related to the ongoing MicroCloud OSD replication discussions, but MicroCloud would not leverage this capability in LXD because it also needs to be able to update the value when adding new cluster members, and it seems the ceph storage pool driver has no update implementation either.

@masnax wouldn't it be useful to add such capability rather than working around it in microcloud?

masnax commented 2 months ago

> @masnax wouldn't it be useful to add such capability

I thought there might be some reason why it's unimplemented, but if it's only that we haven't had a good reason to add it yet, then yes, I think implementing updates would be useful too.

> rather than working around it in microcloud?

The approach in MicroCloud isn't exactly 1:1 with this, because MicroCeph also sets some global Ceph configuration (`osd_pool_default_size` and `mon_allow_pool_size_one`), which is beyond LXD's scope (limited to just OSD pools).
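For reference, a sketch of those global settings using Ceph's `config set` (how MicroCeph applies them internally may differ):

```
# Global Ceph options, outside LXD's per-pool scope
microceph.ceph config set global osd_pool_default_size 1
microceph.ceph config set global mon_allow_pool_size_one true
```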

In MicroCloud, we set the replication size equal to the number of OSDs, up to 3. Beyond that point, if OSD pools have a replication factor greater than 3, we should leave them alone because it's a user configuration. But if they have a replication factor of less than 3, particularly 1 (no replication), these are dangerous configurations, so we should try to increase the value once we see (from MicroCeph) that there are enough OSDs to support it.

I initially thought we should limit the size setting in MicroCloud to just LXD's OSD storage pools, but I'm having doubts about that now. If we have support for replication, then we should try to enable it for all pools, rather than just the LXD ones.

For example, if someone sets up a single-node MicroCloud and then creates some OSD pools with Ceph, we should try to enable replication for those OSD pools when adding more OSDs with `microcloud add`.

However, we need to be able to query the current size of these pools so we can skip OSD pools that have been manually increased above MicroCloud/MicroCeph's recommendations. (Or manually set below the recommendations, which we can infer from the number of OSDs available.)
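A rough sketch of that logic with plain Ceph commands (the cap of 3 and the shell loop are illustrative; MicroCloud would go through MicroCeph rather than shelling out like this):

```
# Target size: one replica per OSD, capped at 3
osds=$(microceph.ceph osd ls | wc -l)
target=$(( osds < 3 ? osds : 3 ))

for pool in $(microceph.ceph osd pool ls); do
    size=$(microceph.ceph osd pool get "$pool" size | awk '{print $2}')
    # Skip pools that were manually raised to (or above) the recommendation
    if [ "$size" -ge "$target" ]; then
        continue
    fi
    # Below the recommendation (e.g. size 1): raise it now that enough OSDs exist
    microceph.ceph osd pool set "$pool" size "$target"
done
```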

So LXD doesn't cover all the use cases: