lxc / incus

Powerful system container and virtual machine manager
https://linuxcontainers.org/incus
Apache License 2.0

Cluster group rename makes project inconsistent #1147

Closed victoitor closed 1 week ago

victoitor commented 3 weeks ago

Issue description

When a cluster group is renamed, projects whose restricted.cluster.groups setting refers to it are not updated. They keep pointing at the old name, which leaves them in an inconsistent state.

Steps to reproduce

  1. Create a cluster group and a project.
  2. Make the project restricted and set restricted.cluster.groups to the created group.
  3. Rename the cluster group and try to create an instance on that project.

The following commands show the result after the rename, along with the manual fix:

$ incus cluster group rename compartilhado amd5700g
Cluster group compartilhado renamed to amd5700g
$ incus project show compartilhado
config:
  features.images: "true"
  features.profiles: "true"
  features.storage.buckets: "true"
  features.storage.volumes: "true"
  limits.instances: "4"
  restricted: "true"
  restricted.backups: allow
  restricted.cluster.groups: compartilhado
  restricted.cluster.target: allow
  restricted.containers.lowlevel: allow
  restricted.containers.nesting: allow
  restricted.devices.disk: allow
  restricted.devices.nic: allow
  restricted.snapshots: allow
description: Experimentos - máquinas compartilhadas
name: compartilhado
$ incus launch images:debian/12/cloud t1 --project compartilhado
Launching t1
Error: Failed instance creation: No suitable cluster member could be found
$ incus project set compartilhado restricted.cluster.groups=amd5700g
$ incus launch images:debian/12/cloud t1 --project compartilhado
Launching t1
$
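
The manual fix above can be scripted. The following is a minimal sketch, not part of Incus: `rename_in_list` is a hypothetical helper, and it assumes that restricted.cluster.groups holds a comma-separated list of group names. It rewrites only the renamed entry and leaves the others untouched.

```shell
# Hypothetical helper: replace one group name in a comma-separated list.
# Assumption: restricted.cluster.groups is a comma-separated list of names.
# The body runs in a subshell so the IFS and globbing changes do not leak.
rename_in_list() (
    list="$1"; old="$2"; new="$3"
    out=""
    set -f        # disable pathname expansion while splitting
    IFS=','       # split the list on commas
    for item in $list; do
        if [ "$item" = "$old" ]; then item="$new"; fi
        out="${out:+$out,}$item"
    done
    printf '%s\n' "$out"
)

# Possible use on a real cluster, with the names from this report:
#   current="$(incus project get compartilhado restricted.cluster.groups)"
#   incus project set compartilhado \
#       restricted.cluster.groups="$(rename_in_list "$current" compartilhado amd5700g)"
```

Running this for every project right after `incus cluster group rename` would keep the references in step with the new name.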

stgraber commented 3 weeks ago

That's somewhat expected and a common pattern within the Incus API.

We have a few places where we have hard bi-directional dependencies, such as instances and profiles, or cluster groups and cluster members. In those cases, the API exposes the relationship as its own object property, usually a list, rather than through key/value configuration.

Key/value configuration can be validated when it is set and/or when it is used (e.g. at instance start), but changing the referenced object doesn't update whatever refers to it.

Other examples of that would be other project restrictions that refer either to network names or uplink names, but also things like volume names in instance disk devices or profiles, network interface names in those too, ACL names, ...

We did try to be stricter in the past but that came with a number of issues:

stgraber commented 3 weeks ago

Maybe the best approach here is to add a mention of that to the project documentation page. We could also have the placement logic here fail with "Couldn't find cluster group XYZ" rather than just say that it couldn't find somewhere to put an instance.
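
Until the placement logic reports this itself, a similar check can be approximated from the client side. The sketch below is an illustration only, not how Incus implements placement: `missing_groups` is a hypothetical helper that takes a project's comma-separated restricted.cluster.groups value and a newline-separated list of existing group names, and prints one message per group that no longer exists.

```shell
# Hypothetical helper: report configured groups that are not in the existing list.
#   $1: comma-separated group names (assumed format of restricted.cluster.groups)
#   $2: newline-separated names of the groups that currently exist
# The body runs in a subshell so the IFS and globbing changes do not leak.
missing_groups() (
    configured="$1"; existing="$2"
    set -f        # disable pathname expansion while splitting
    IFS=','       # split the configured list on commas
    for group in $configured; do
        # -x: match the whole line, -F: fixed string, -q: no output
        if ! printf '%s\n' "$existing" | grep -qxF -- "$group"; then
            printf "Couldn't find cluster group %s\n" "$group"
        fi
    done
)

# Possible use on a real cluster (assumes CSV output with the name in column 1):
#   existing="$(incus cluster group list --format csv | cut -d, -f1)"
#   missing_groups "$(incus project get compartilhado restricted.cluster.groups)" "$existing"
```

Empty output means every group the project refers to still exists.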

victoitor commented 3 weeks ago

> Maybe the best approach here is to add a mention of that to the project documentation page. We could also have the placement logic here fail with "Couldn't find cluster group XYZ" rather than just say that it couldn't find somewhere to put an instance.

I would consider that a very good idea.