dmacvicar / terraform-provider-libvirt

Terraform provider to provision infrastructure with Linux's KVM using libvirt
Apache License 2.0

Not possible to leave network card without ip address #502

Closed dmacvicar closed 5 years ago

dmacvicar commented 5 years ago

Full example

provider "libvirt" {
    uri = "qemu:///system"
}

resource "libvirt_volume" "my2_qcow2" {
  name = "my2_qcow2-${count.index}"
  count = 2
  pool = "default"
  source = "/home/some-image.qcow2"
  format = "qcow2"
}

resource "libvirt_network" "vm_network" {
   name = "inet"
   addresses = ["10.0.1.0/24"]
   dhcp {
        enabled = false
    }
}

resource "libvirt_domain" "domain-sle" {
  name = "sle-terraform-${count.index}"
  memory = "512"
  vcpu = 1
  count = 2 

  network_interface {
    network_id = "${libvirt_network.vm_network.id}"
  }

  network_interface {
    network_id = "${libvirt_network.vm_network.id}"
  }

  console {
    type        = "pty"
    target_port = "0"
    target_type = "serial"
  }

  console {
      type        = "pty"
      target_type = "virtio"
      target_port = "1"
  }

  disk {
   volume_id = "${libvirt_volume.my2_qcow2.*.id[count.index]}"
  }

  graphics {
    type = "spice"
    listen_type = "address"
    autoport = "true"
  }
}

Details

Given network:

resource "libvirt_network" "vm_network" {
   name = "inet"
   addresses = ["10.0.1.0/24"]
   dhcp {
        enabled = false
    }
}

And network interface:

  network_interface {
    network_id = "${libvirt_network.vm_network.id}"
  }

With this configuration it is not possible to leave the interface without an IP address, which is a common need for QA teams. Applying it fails with:

libvirt_domain.domain-leap15: Cannot map 'leap15-terraform': we are not waiting for DHCP lease and no IP has been provided

This check may be a leftover from the provider's early days, when it tried to emulate a cloud-like environment and always guaranteed a way to reach the domain. It may no longer be necessary.

asmorodskyi commented 5 years ago

This is the problematic code from libvirt/domain.go:

// no IPs provided: if the hostname has been provided, wait until we get an IP
wait := false
for _, iface := range *waitForLeases {
        if iface == &netIface {
                wait = true
                break
        }
}
if !wait {
        return fmt.Errorf("Cannot map '%s': we are not waiting for DHCP lease and no IP has been provided", hostname)
}

Whether an interface is added to waitForLeases is controlled by:

if waitForLease, ok := d.GetOk(prefix + ".wait_for_lease"); ok {
        if waitForLease.(bool) {
                *waitForLeases = append(*waitForLeases, &netIface)
        }
}

From what I have learned about Go over the last few days, an interface declared like this:

  network_interface {
    network_id = "${libvirt_network.vm_network.id}"
  }

should not end up in the waitForLeases collection, yet it seems to. What have I missed?
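
Reading only the two snippets quoted above, the error appears to fire precisely when an interface has no address and is absent from waitForLeases, rather than when it is present. The following is a minimal, self-contained sketch of that logic, not the provider's actual code: the ifaceSpec type and the checkMapping and buildWaitList helpers are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// ifaceSpec is a hypothetical stand-in for the provider's interface type.
type ifaceSpec struct {
	addresses    []string // static IPs from the HCL, if any
	waitForLease bool     // value of wait_for_lease
}

// buildWaitList mimics the wait_for_lease snippet: only flagged
// interfaces are appended to the wait list.
func buildWaitList(ifaces []*ifaceSpec) []*ifaceSpec {
	var waitForLeases []*ifaceSpec
	for _, i := range ifaces {
		if i.waitForLease {
			waitForLeases = append(waitForLeases, i)
		}
	}
	return waitForLeases
}

// checkMapping reproduces the quoted check for a single interface.
func checkMapping(hostname string, iface *ifaceSpec, waitForLeases []*ifaceSpec) error {
	if len(iface.addresses) > 0 {
		return nil // an IP was provided, nothing to wait for
	}
	wait := false
	for _, w := range waitForLeases {
		if w == iface { // pointer comparison, as in the quoted code
			wait = true
			break
		}
	}
	if !wait {
		return fmt.Errorf("Cannot map '%s': we are not waiting for DHCP lease and no IP has been provided", hostname)
	}
	return nil
}

func main() {
	bare := &ifaceSpec{}                      // no addresses, no wait_for_lease
	waiting := &ifaceSpec{waitForLease: true} // wait_for_lease = true
	list := buildWaitList([]*ifaceSpec{bare, waiting})

	fmt.Println(checkMapping("sle-terraform-0", bare, list))    // prints the "Cannot map" error
	fmt.Println(checkMapping("sle-terraform-0", waiting, list)) // prints <nil>
}
```

Under these assumptions, the bare interface from the example is rejected because it was never added to the wait list, which matches the error reported in the issue.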

asmorodskyi commented 5 years ago

@MalloZup I would like to contribute by fixing this issue, but unfortunately I am stuck on the point I mentioned above. I would very much appreciate your help in finishing it, even if it would be easier for you to just silently fix it :)

MalloZup commented 5 years ago

@asmorodskyi I still don't understand your issue, and I'm trying to reproduce it.

Remember that this is an open-source community project, and in my day-to-day work I am busy with other things. I always try to help people, and silently fixing things without an explanation is not my style.

As I said, I don't know how to reproduce it yet and I'm not sure what your problem is, so I'm taking the time to find where the problem is and see whether I can reproduce it :sunflower:

Once I have the picture, we can work on fixing the problem. If it is a small or medium fix, I will be glad to help you fix it. If it is a huge piece of tech debt that needs really advanced knowledge, I think you will understand that we will fix it ourselves. Either way, the whole process here is open source, so you will see the fix and learn from it. So let's see. :sun_with_face: Thanks for your help and motivation.

asmorodskyi commented 5 years ago

@MalloZup I didn't mean to push you. Take your time and get to the issue whenever it fits your priorities; I have no problem with that. Meanwhile I will try to proceed on my own, and if you have any specific questions I am ready to answer them.

Some background for this issue: we are trying to use Terraform to set up an environment for testing the network stack. So we need VMs that have NICs, but with the NICs left uninitialized: no static IPs and no DHCP. It looks like such a setup is not supported by the libvirt provider. Currently we work around this by using the IP 0.0.0.0, and after a short discussion with @dmacvicar he opened this issue for us to find a proper solution.

MalloZup commented 5 years ago

@asmorodskyi no worries, it is OK! Thank you for the feedback and info. :cupid:

MalloZup commented 5 years ago

@asmorodskyi OK, I see the problem.

What I would try if I were you:

  1. Remove the if block at https://github.com/dmacvicar/terraform-provider-libvirt/blob/master/libvirt/domain.go#L663-L665
  2. Rebuild the provider with these changes in your $GOPATH: make, make install. This will generate a binary. Move the terraform-provider-libvirt binary to /usr/bin, replacing the old provider.
  3. After step 2, rerun the example with the new provider and see whether it fails.

From a quick look at the codebase, I think we could remove that if now. But I would give it a try to see whether we get other failures.

inercia commented 5 years ago

@asmorodskyi As @MalloZup mentioned, I think the key would be to replace the https://github.com/dmacvicar/terraform-provider-libvirt/blob/master/libvirt/domain.go#L663-L674 block with an if wait { partialNetIfaces[strings.ToUpper(mac)] = ...} (so we would add the interface to the list of partially resolved interfaces if and only if wait is set). That way we would not force users to provide IPs or wait for DHCP: we could also just ignore the IPs...

asmorodskyi commented 5 years ago

@MalloZup I deleted more than you suggested, https://github.com/dmacvicar/terraform-provider-libvirt/blob/master/libvirt/domain.go#L655-L665, because Go complains that the wait variable is declared but never used. After compiling and running Terraform with the modified provider, I got a different error:

Error applying plan:

2 error(s) occurred:

* libvirt_domain.domain-sle[1]: index 1 out of range for list libvirt_volume.my2_qcow2.*.id (max 1) in:

${libvirt_volume.my2_qcow2.*.id[count.index]}
* libvirt_domain.domain-sle[0]: 1 error(s) occurred:

* libvirt_domain.domain-sle.0: Did not obtain the IP address for MAC=06:4F:42:C3:FD:D6
  1. I am not sure why the error about the qcow image pops up or whether it is related. Before the changes I had no issues with the libvirt_volume resource.
  2. In any case, removing the block you mentioned just makes the problem pop up in a different place, so I would like to figure out why a NIC without waitForLease defined ends up in *waitForLeases here https://github.com/dmacvicar/terraform-provider-libvirt/blob/master/libvirt/domain.go#L604-L608 instead of firefighting what happens afterwards once it gets there.

@inercia sorry, but I am not experienced enough to understand what you meant by "...". Or does Go support exactly such a string? In any case, I am not sure how it would help if we fail before the lines you suggest changing: https://github.com/dmacvicar/terraform-provider-libvirt/blob/master/libvirt/domain.go#L664
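
On the "declared but never used" complaint mentioned in this comment: the Go compiler rejects a local variable that is only assigned and never read, so deleting just the if !wait check leaves wait unused. A minimal sketch of the rule, assuming a hypothetical helper isWaiting and plain strings in place of the provider's interface pointers:

```go
package main

import "fmt"

// isWaiting is a hypothetical helper that reports whether target is in
// the wait list. Returning the flag counts as a read, so the compiler
// accepts the variable; without the return (or any other read), the
// assignments alone would trigger a "declared and not used" error.
func isWaiting(waitList []string, target string) bool {
	wait := false
	for _, w := range waitList {
		if w == target {
			wait = true
			break
		}
	}
	return wait
}

func main() {
	fmt.Println(isWaiting([]string{"nic0"}, "nic0")) // true
	fmt.Println(isWaiting(nil, "nic1"))              // false
}
```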

inercia commented 5 years ago


Sorry, I was suggesting to do something like:

if wait {
        // the resource specifies a hostname but not an IP, so we must wait until we
        // have a valid lease and then read the IP we have been assigned, so we can
        // do the mapping
        log.Printf("[DEBUG] Do not have an IP for '%s' yet: will wait until DHCP provides one...", hostname)
        partialNetIfaces[strings.ToUpper(mac)] = &pendingMapping{
                mac:      strings.ToUpper(mac),
                hostname: hostname,
                network:  network,
        }
}
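
To make the effect of this suggestion concrete, here is a self-contained sketch under stated assumptions: ifaceSpec and collectPending are hypothetical names, pendingMapping only mirrors the three fields used in the snippet, and network is simplified to a string. It is an illustration of the idea, not the provider's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// pendingMapping mirrors the fields used in the snippet above;
// network is simplified to a plain string for this sketch.
type pendingMapping struct {
	mac      string
	hostname string
	network  string
}

// ifaceSpec is a hypothetical description of one network_interface block.
type ifaceSpec struct {
	mac          string
	hostname     string
	network      string
	addresses    []string
	waitForLease bool
}

// collectPending returns the interfaces that still need a DHCP lease,
// keyed by upper-cased MAC. An interface with neither an address nor
// wait_for_lease is skipped instead of being rejected with an error.
func collectPending(ifaces []ifaceSpec) map[string]*pendingMapping {
	partialNetIfaces := make(map[string]*pendingMapping)
	for _, i := range ifaces {
		if len(i.addresses) > 0 {
			continue // IP already known, nothing pending
		}
		if i.waitForLease {
			mac := strings.ToUpper(i.mac)
			partialNetIfaces[mac] = &pendingMapping{
				mac:      mac,
				hostname: i.hostname,
				network:  i.network,
			}
		}
	}
	return partialNetIfaces
}

func main() {
	pending := collectPending([]ifaceSpec{
		{mac: "52:54:00:aa:bb:01", hostname: "vm-0", network: "inet"},
		{mac: "52:54:00:aa:bb:02", hostname: "vm-1", network: "inet", waitForLease: true},
	})
	for mac, p := range pending {
		fmt.Printf("waiting for lease: %s (%s)\n", mac, p.hostname)
	}
}
```

In this sketch only vm-1 ends up in the pending map; vm-0, which has no IP and no wait_for_lease, is simply left alone.
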
asmorodskyi commented 5 years ago

To investigate the qcow error I did the following:

  1. Deleted everything network-related from the config, so it looks like this:

    provider "libvirt" {
      uri = "qemu:///system"
    }

    resource "libvirt_volume" "my2_qcow2" {
      name   = "my2_qcow2-${count.index}"
      count  = 2
      pool   = "default"
      source = "/home/some-image.qcow2"
      format = "qcow2"
    }

    resource "libvirt_domain" "domain-sle" {
      name   = "sle-terraform-${count.index}"
      memory = "512"
      vcpu   = 1
      count  = 2

      console {
        type        = "pty"
        target_port = "0"
        target_type = "serial"
      }

      console {
        type        = "pty"
        target_type = "virtio"
        target_port = "1"
      }

      disk {
        volume_id = "${libvirt_volume.my2_qcow2.*.id[count.index]}"
      }

      graphics {
        type        = "spice"
        listen_type = "address"
        autoport    = "true"
      }
    }


  2. Ran this config with the master HEAD version of the provider => no errors
  3. Ran this config with the version modified as @MalloZup suggested => no errors

So to me this looks like something that also needs investigation but is not related to the current issue. I mean, problems in one resource shouldn't cause errors in another, right?

MalloZup commented 5 years ago

@asmorodskyi it is expected that your Terraform snippet without the network interface runs without errors: if you remove the network interface from the HCL, you will not get any error.

The comment from @inercia makes sense to me; see https://github.com/dmacvicar/terraform-provider-libvirt/issues/502#issuecomment-449385214

If I were you, I would build the provider from that modified codebase and then retry this tf: https://github.com/dmacvicar/terraform-provider-libvirt/issues/502#issue-391734919

Note that the network interface is present in that HCL.

:santa: :mrs_claus: :three: :gift: :santa:

asmorodskyi commented 5 years ago

@MalloZup I just tried the code suggested by @inercia with the initial tf:

2 error(s) occurred:

* libvirt_domain.domain-sle[1]: 1 error(s) occurred:

* libvirt_domain.domain-sle.1: Did not obtain the IP address for MAC=06:E9:9D:19:CF:9A
* libvirt_domain.domain-sle[0]: 1 error(s) occurred:

* libvirt_domain.domain-sle.0: Did not obtain the IP address for MAC=76:8F:72:5C:EA:3B
asmorodskyi commented 5 years ago

After spending more time reading the code, I found that we need to clarify the issue description:

Not possible to leave network card without ip address when network mode equals nat

When I add mode = "none", everything works as expected: no errors, and the domain is created successfully. So the issue shows up only for the nat and route modes (and also when nothing is defined, which is treated as the default, nat). More code investigation is needed into how NAT is actually set up. Maybe it really needs an IP address on the guest to set up the link between host and guest?

MalloZup commented 5 years ago

Fixed; it works on latest master. Thanks.

asmorodskyi commented 5 years ago

@MalloZup just to clarify for myself: was my assumption wrong, and you don't need any IP to create a network with mode=nat? Also, can you please tell me what release model the libvirt provider follows? I mean, when will I see a released version that contains this fix? I want to test it.

MalloZup commented 5 years ago

You can build the master branch from source. The fix is in PR #556. Before releasing a new version we need to do some other things first.