dmacvicar / terraform-provider-libvirt

Terraform provider to provision infrastructure with Linux's KVM using libvirt
Apache License 2.0
1.59k stars 458 forks source link

Force creation of network when we can't update due to libvirt limitation #788

Closed MalloZup closed 3 years ago

MalloZup commented 4 years ago

Edited by @dmacvicar

this PR fixes issue #784

We are passing libvirt.NETWORK_UPDATE_COMMAND_MODIFY with an empty element to remove an element, when the right thing to do should be to pass ibvirt.NETWORK_UPDATE_COMMAND_DELETE and passing the actual element. This creates the error about the empty XML element.

However, it is not possible to update this attribute anyway (source), so the right fix is to ForceNew

dmacvicar commented 4 years ago

Can you explain what problem is being solved, in the PR?

MalloZup commented 4 years ago

@dmacvicar yop sure. In (https://github.com/dmacvicar/terraform-provider-libvirt/issues/784) we do have a scenario where user do following:

- means he is removing the attribute after network creation (2appy)

Resource "libvirt_network" "app-real" {
   name = "${var.cluster_name}-app-real"
   autostart = true
   mode = "none"
-  domain = "app-real"
   addresses = ["123.456.789.0/24"]
   dhcp {
     enabled = true
   }
   dns {
     enabled = false
   }
 }

Steps to Reproduce Issue

1) Create libvirt_network resource with domain set apply terraform 2) Then remove the domain attr from terraform and apply terraform.

Errors:

Error: Error when updating domain in herring-app-real: virError(Code=35, Domain=19, Message='network_update_xml:1: Document is empty

I could also reproduce locally.

There are 2 issues.

Both doesn't work

The terraform code detect that detect the change is called correctly but the libvirt API call network.update doesn't update the network

https://github.com/dmacvicar/terraform-provider-libvirt/blob/d1cf93cdb066b2d3d2c2c046d2fb986808d9a424/libvirt/resource_libvirt_network.go#L357

So assuming that even I missed something when doing the call, and assuming that network.Update could work on this,

we do have some gray area in the provider, where the user need

-->

https://libvirt.org/html/libvirt-libvirt-network.html#virNetworkUpdate

Even if in this API there is a section for updating the Domain, https://libvirt.org/html/libvirt-libvirt-network.html#virNetworkUpdateSection . I could after some coding not update the value of an existing Network . ( I tried with different flags also https://libvirt.org/html/libvirt-libvirt-network.html#virNetworkUpdateCommand) etc without sucess.

I did also the test with native libvirt with virsh and it could not update the network.

I checked the upstream doc and found out that they don't document the domain as possible section that can be updated. https://wiki.libvirt.org/page/Networking#virsh_net-update

They state also about "Arbitrary changes to the network" ( see page) and also

The config items in a network that can be changed with virsh net-update are: ip-dhcp-host ip-dhcp-range (add/delete only, no modify) forward-interface (add/delete only) portgroup dns-host dns-txt dns-srv I assumed that net-update is calling same API as network.Update

So that is mostly why I do think that the network.update has something faulty on this, especially the domain update sections

I will research more on the evidence in the source code.

MalloZup commented 4 years ago

I found also out: https://www.redhat.com/archives/libvirt-users/2010-June/msg00044.html

Libvirt doesn't support changing the configuration of a running network,
so any changes will require a network restart with net-destroy;
net-start. This should be documented better

https://serverfault.com/a/573255

This might be also the reason why I could not change the domain via network.Update call because I didn't restart the network.

Imho recreate resources via ForceNew is the cleanest solution.

I do think handling a network udpdate can be tricky, since we might need to hook perform all this virsh destroy etc machinery via API operation after an network.update, and all this take time, and can fail.

But open to other ideas and feedback! :flower_playing_cards:

Imho, we should not handle the update via update but just recreating new resources.


going further.. I do think that this approach of Updating a resource should be extended to all resource we manage and where we see issues with this. (https://github.com/dmacvicar/terraform-provider-libvirt/issues/787)

From the UX perspective an user when updating a Resource via terraform, expect that the change is made in place, not having to dealing with the underneath hypervisor API for fixing the original limitation of the cloud/virt provider

MalloZup commented 3 years ago

@dmacvicar thx for your comments and findings. I agree about the testcase if we can find one.

I do actually have tested locally, but I was not precise on documenting my findings with the source code and justifying them.

I will follow up with some PRs because we do need to force resource for network and create a readme.md update on the design.

If you want to take over on this let me know otherwise I will just do it once I do have free time

dmacvicar commented 3 years ago

I have added a note in the source code already.

I don't think there is anything left to do here.