canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0
4.38k stars 929 forks source link

Add locations to network GET #14406

Open edlerd opened 2 weeks ago

edlerd commented 2 weeks ago

Required information

Issue description

When fetching networks, we need the information on which cluster member they exist on. The current GET 1.0/networks?project=:project&recursion=1 response includes a locations field. The field is correctly populated for managed networks.

However, for non-managed networks, the locations field is always null. This seems wrong, because a physical network can exist only on some members, as per the reproducer below.

Steps to reproduce

  1. Initialize a 3 node microcluster to get a clustered LXD
  2. On one of the cluster members, create a physical network. In a shell, run: sudo ip link add eth10 type dummy
  3. Fetch LXD network list via API GET 1.0/networks?project=:project&recursion=1.
  4. The newly created eth10 appears in the list with locations: null.

Expected output would be a locations: [ "member-name" ] value for the eth10 network. Because it only exists on exactly one cluster member and not on the others.

tomponline commented 2 weeks ago

Expected output would be a locations: [ "member-name" ] value for the eth10 network. Because it only exists on exactly one cluster member and not on the others.

I suspect we will need to do something different so you can see the different IPs on each cluster member for yje same interface name.

edlerd commented 2 weeks ago

I suspect we will need to do something different so you can see the different IPs on each cluster member for yje same interface name.

I don't see IPs on the network list. See screen below

image

tomponline commented 2 weeks ago

OK thanks, looking at the implementation I can see its only returning the configured IPs from the DB, not from the interface itself.

So it may be OK to just populate the locations field for unmanaged networks and if we ever want to return IP info for those, then we'll have to also add support for per-member target param, as IPs would be per-member info.

tomponline commented 2 weeks ago

I also note that this logic from /1.0/networks/<network> is missing from /1.0/networks:

https://github.com/canonical/lxd/blob/efae303214c50fdc9ff1717753e34666e0bede8e/lxd/networks.go#L861-L866

edlerd commented 2 weeks ago

When running sudo ip link add eth10 type dummy on one of the cluster members, the new dummy eth10 interface only shows up in the GET 1.0/networks response, if the API request is sent to the member that the interface was created on. When sending the API request to another cluster member, the interface does not show up.

So we might need a GET 1.0/networks?target=:member filter to get the list of one specific cluster member. Or plan B is to have GET 1.0/networks collect the list of networks from all cluster members.

edlerd commented 2 weeks ago

After contemplating on this issue and reading a bit of lxd source, I think adding a target filter to the GET 1.0/networks endpoint is the easiest way to resolve this. See the linked PR #14419 above, implementing it.

The other proposed solution, to populate the target field for non-managed networks, is way more expensive. Because the cluster member handling the API request would have to reach out to all other members. This seems costly, complex and more error-prone (thinking of split brain or partially down cluster members). With the added target filter, we don't need to populate the location field as far as the LXD-UI is concerned and can resolve this issue with the simpler solution.

markylaing commented 2 weeks ago

After contemplating on this issue and reading a bit of lxd source, I think adding a target filter to the GET 1.0/networks endpoint is the easiest way to resolve this. See the linked PR #14419 above, implementing it.

The other proposed solution, to populate the target field for non-managed networks, is way more expensive. Because the cluster member handling the API request would have to reach out to all other members. This seems costly, complex and more error-prone (thinking of split brain or partially down cluster members). With the added target filter, we don't need to populate the location field as far as the LXD-UI is concerned and can resolve this issue with the simpler solution.

I agree that it's more complex and costly to reach out to other cluster members to populate non-managed networks but I think it's necessary in this case for consistency in API responses. For example, if I have a LXD cluster and a remote configured in lxc for it. I would expect to see all networks in the cluster on a call to GET /1.0/networks. This is the same as in storage volumes/buckets where there is no target parameter on GET /1.0/storage-pools/{pool}/volumes/{type}/{name}.

markylaing commented 2 weeks ago

Just to expand on this a bit. My main contention is that although the UI will know to call GET /1.0/networks?target={name} for all cluster members, I wouldn't expect a CLI user to call lxc network list --target {name} against all cluster members to get the full picture of all networks in the cluster.

tomponline commented 2 weeks ago

Agree

edlerd commented 1 week ago

I also agree with the more involved correction of the network api long term. I think we discussed the target filter as well before, which #14419 introduces. Can we merge that right away? It unblocks the UI work by providing a way to fetch the list of available interfaces per member.