dmacvicar / terraform-provider-libvirt

Terraform provider to provision infrastructure with Linux's KVM using libvirt
Apache License 2.0

Terraform + libvirt Bridge Network configuration NOT WORKING #1028

Open alexastro opened 10 months ago

alexastro commented 10 months ago

I ran Terraform + libvirt to deploy 3 KVM guests. I got it working with NAT, and now I am trying to get connectivity between the KVM guests and my home network. From what I saw, I had to make some configuration changes on the host machine:

  1. Create a bridge interface
  2. Bind it to my physical network interface
  3. Assign an available IP address to that bridge interface

Now it's just the Terraform + libvirt configuration. However, no matter what I try, I can't get this working, and I'm really lost.

Host machine is Ubuntu Desktop 22.04 LTS.

On this host I only installed libvirt and enabled KVM.

Network Configuration on the host:

sudo systemctl stop NetworkManager
sudo nano /etc/systemd/network/br.netdev

[NetDev]
Name=br0
Kind=bridge

sudo nano /etc/systemd/network/1-br0-bind-network

[Match]
Name=enp2s0

[Network]
Bridge=br0

sudo nano /etc/systemd/network/2-br0-dehcp-network

[Match]
Name=br0

[Network]
DHCP=ipv4

sudo systemctl enable systemd-networkd
sudo systemctl restart systemd-networkd

Terraform + libvirt Configuration:

libvirt.tf file

terraform {
  required_version = ">= 0.13"
  required_providers {
    libvirt = {
      source = "dmacvicar/libvirt"
    }
  }
}

provider "libvirt" { uri = "qemu:///system" }

create pool

resource "libvirt_pool" "ubuntu-vm" { name = "vm-pool" type = "dir" path = "/libvirt_images/ubuntu-poollll/" }

create image master

resource "libvirt_volume" "image_iso_volume_master" { name = "vm-master-volume.${var.libvirt_volume_format}" pool = libvirt_pool.ubuntu-vm.name source ="${path.module}/iso/jammy-server-cloudimg-amd64.img" format = var.libvirt_volume_format }

resource "libvirt_volume" "disk_volume_master" { name = "vm-master-volume-${count.index}-disk" pool = libvirt_pool.ubuntu-vm.name size = 32949672960 base_volume_id = libvirt_volume.image_iso_volume_master.id count = var.libvirt_number_master }

create image worker

resource "libvirt_volume" "image_iso_volume_worker" { name = "vm-worker-volume.${var.libvirt_volume_format}" pool = libvirt_pool.ubuntu-vm.name source ="${path.module}/iso/jammy-server-cloudimg-amd64.img" format = var.libvirt_volume_format }

resource "libvirt_volume" "disk_volume_worker" { name = "vm-worker-volume-${count.index}-disk" pool = libvirt_pool.ubuntu-vm.name size = 32949672960 base_volume_id = libvirt_volume.image_iso_volume_worker.id count = var.libvirt_number_worker }

Create a network interface

resource "libvirt_network" "kube_network" { autostart = true name = "vm_net" mode = "bridge" addresses = ["192.168.1.9/24"] dhcp { enabled = true } bridge = "br0

}
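One thing I am not sure about: from my reading of the provider docs, in bridge mode libvirt just attaches the guests to the existing host bridge and does not run its own DHCP, so the addresses and dhcp blocks may not apply here. A minimal bridged network, assuming br0 already exists on the host, might be just:

resource "libvirt_network" "kube_network" {
  name = "vm_net"
  mode = "bridge"

  # hand networking off to the pre-existing host bridge;
  # DHCP would then come from the home network, not from libvirt
  bridge    = "br0"
  autostart = true
}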

read the master configuration

data "template_file" "master_data" { template = file("${path.module}/config/master_init.cfg") vars = { IPS = jsonencode(var.ips) } }

read the worker configuration

data "template_file" "worker_data" { template = file("${path.module}/config/worker_init.cfg") }

add cloudinit master disk to pool

resource "libvirt_cloudinit_disk" "commoninit_master" { name = var.libvirt_cloudinit_name_master pool = libvirt_pool.ubuntu-vm.name user_data = data.template_file.master_data.rendered }

add cloudinit worker disk to pool

resource "libvirt_cloudinit_disk" "commoninit_worker" { name = var.libvirt_cloudinit_name_worker pool = libvirt_pool.ubuntu-vm.name user_data = data.template_file.worker_data.rendered }

Creating master vms

resource "libvirt_domain" "vm_master" {

name = "vm-k8s-master-${count.index}" memory = var.libvirt_domain_RAM_memory_master vcpu = var.libvirt_domain_cpu_master

cloudinit = libvirt_cloudinit_disk.commoninit_master.id # set to default libvirt network

network_interface { network_name = libvirt_network.kube_network.name wait_for_lease= true }

console { type = "pty" target_type = "serial" target_port = "0" }

disk { volume_id = libvirt_volume.disk_volume_master[count.index].id }

graphics { type = "spice" listen_type = "address" autoport = true } count = var.libvirt_number_master }

Creating worker vms

resource "libvirt_domain" "vm_worker" {

name = "vm-k8s-worker-${count.index}" memory = var.libvirt_domain_RAM_memory_worker vcpu = var.libvirt_domain_cpu_worker

cloudinit = libvirt_cloudinit_disk.commoninit_worker.id # set to default libvirt network

network_interface { network_name = libvirt_network.kube_network.name wait_for_lease= true }

console { type = "pty" target_type = "serial" target_port = "0" }

disk { volume_id = libvirt_volume.disk_volume_worker[count.index].id }

graphics { type = "spice" listen_type = "address" autoport = true } count = var.libvirt_number_worker }

Outputs

output "ips_master" { value = "${libvirt_domain.vm_master.*.network_interface.0.addresses.0}" }

output "ips_worker" { value = "${libvirt_domain.vm_worker.*.network_interface.0.addresses.0}" }

The cloud-init files are standard: a simple user with a default SSH key.

Terraform times out every time I run it, saying it couldn't retrieve any IP for the output! If you could tell me what's wrong, or another way to get connectivity to the KVM guests, I would appreciate it. Thank you.

declanwd commented 9 months ago

I've been looking at using bridge network interfaces recently with the libvirt provider, and I believe you may be missing the QEMU guest agent; it needs to be installed and enabled on your VMs. Also, your libvirt_domain resources should have qemu_agent = true.

https://wiki.libvirt.org/Qemu_guest_agent.html
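For illustration, here is a rough sketch of what that could look like, assuming an Ubuntu cloud image where cloud-init can install packages (the resource names are just placeholders, not taken from your config):

resource "libvirt_cloudinit_disk" "agent_init" {
  name = "agent-init.iso"

  # install and start the guest agent inside the VM so libvirt
  # can query it for interface addresses
  user_data = <<-EOT
    #cloud-config
    packages:
      - qemu-guest-agent
    runcmd:
      - [systemctl, enable, --now, qemu-guest-agent]
  EOT
}

resource "libvirt_domain" "example" {
  name   = "example"
  memory = 2048
  vcpu   = 2

  # tell the provider to ask the guest agent for IPs
  qemu_agent = true
  cloudinit  = libvirt_cloudinit_disk.agent_init.id
}

With a bridged network the provider can't read a lease from libvirt's own DHCP, so as far as I can tell the guest agent is what lets wait_for_lease learn the addresses.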

mattsn0w commented 9 months ago

Here is a working reference I have used successfully in my homelab. This Ansible playbook will do the hypervisor OS config with an Ubuntu 20.04 or 22.04 base. This tf example will run directly on the hypervisor host.

alexastro commented 9 months ago

I've been looking at using bridge network interfaces recently with the libvirt provider, and I believe you may be missing the QEMU guest agent; it needs to be installed and enabled on your VMs. Also, your libvirt_domain resources should have qemu_agent = true.

https://wiki.libvirt.org/Qemu_guest_agent.html

Hey, I already ran Terraform with qemu_agent = true set and it didn't solve it.

alexastro commented 9 months ago

Here is a working reference I have used successfully in my homelab. This Ansible playbook will do the hypervisor OS config with an Ubuntu 20.04 or 22.04 base. This tf example will run directly on the hypervisor host.

Thanks, I will try that.

alexastro commented 9 months ago

Here is a working reference I have used successfully in my homelab. This Ansible playbook will do the hypervisor OS config with an Ubuntu 20.04 or 22.04 base. This tf example will run directly on the hypervisor host.

Hey! I already tried your tf file and nothing. I used the QEMU agent and nothing! I used that xsl and network_config.cfg and nothing! I also tried adding boot_device { dev = ["hd", "network"] } to each domain; nothing worked.

I'm starting to think it is something really simple that I am missing.

alexastro commented 9 months ago

I figured out that it is not the IP addresses that are failing to be assigned. For some reason it is the qemu_agent that doesn't work properly and isn't able to pull the IPs from the KVM guests, but the machines do have their assigned IPs.

b1kjsh commented 3 months ago

I figured out that it is not the IP addresses that are failing to be assigned. For some reason it is the qemu_agent that doesn't work properly and isn't able to pull the IPs from the KVM guests, but the machines do have their assigned IPs.

Did you solve this @alexastro? If so, I'm curious what your solution was.

Pierrotws commented 2 months ago

Hi,

I ran into this issue, although the network part is not handled through cloud-init. Interestingly, the Terraform problem with qemu_agent appears only if cloudinit is set up; otherwise it works fine.

cloudinit + qemu_agent=false + network_interface.wait_for_lease=false : OK
qemu_agent=true + network_interface.wait_for_lease=true : OK
cloudinit + qemu_agent=true + network_interface.wait_for_lease=true : KO

I'm also using bridge configuration.

This is the definition I use:

resource "libvirt_domain" "vm_master" {
  count  = var.master_count
  name   = format("${var.cluster_name}-master%02d", count.index + 1)
  memory = var.master_ram
  cpu {
    mode = var.cpu_mode
  }
  vcpu       = var.master_vcpu
  autostart  = var.autostart
  qemu_agent = true

  cloudinit = element(libvirt_cloudinit_disk.master_init[*].id, count.index)

  network_interface {
    network_id     = libvirt_network.cluster_network.id
    wait_for_lease = true
    hostname       = format("master%02d", count.index + 1)
    addresses      = [cidrhost(var.network_address, count.index + var.master_ip_index + 1)]
  }
....

Pierrotws commented 2 months ago

It looks like the issue is the same encountered here:

https://github.com/dmacvicar/terraform-provider-libvirt/issues/1050

Their solution was to disable wait_for_lease, which I cannot afford, since I need network_interface.addresses later (in an output, to get the list of IPs).
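One workaround I am considering (an untested sketch, reusing the variables from my definition above): since the addresses are assigned statically with cidrhost anyway, the output can recompute them instead of reading them back from a lease, which would let me set wait_for_lease = false:

output "master_ips" {
  # recompute the statically assigned IPs rather than
  # waiting for the provider to report a lease
  value = [
    for i in range(var.master_count) :
    cidrhost(var.network_address, i + var.master_ip_index + 1)
  ]
}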