dmacvicar / terraform-provider-libvirt

Terraform provider to provision infrastructure with Linux's KVM using libvirt
Apache License 2.0

Failed to remove storage pool because of remnant actual volume #1083

Open jordibalcellss opened 7 months ago

jordibalcellss commented 7 months ago

System Information

Linux distribution

AlmaLinux 9.3 (Shamrock Pampas Cat)

Terraform version

Terraform v1.7.5
on linux_amd64

Provider and libvirt versions

v0.7.6


Description of Issue/Question

Setup

terraform {
  required_providers {
    libvirt = {
      source = "dmacvicar/libvirt"
    }
  }
}

provider "libvirt" {
  uri = "qemu:///system"
}

resource "libvirt_pool" "disks" {
  name = "disks"
  type = "dir"
  path = "/mnt/kvm/disks"
}

resource "libvirt_volume" "base_volume" {
  name = "base.qcow2"
  pool = "disks"
  source = "http://mirror.netzwerge.de/almalinux/9/cloud/x86_64/images/AlmaLinux-9-GenericCloud-9.3-20231113.x86_64.qcow2"
  format = "qcow2"
}

Steps to Reproduce Issue

There appears to be a bug in libvirt_volume: destroying the resource does not remove the underlying qcow2 file.

This causes a failure during removal of the storage pool (which comes next during execution of terraform destroy). Removal of libvirt_volume.base_volume times out and the next action fails:

...
libvirt_volume.base_volume: Still destroying... [id=/mnt/kvm/disks/base.qcow2, 4m40s elapsed]
libvirt_volume.base_volume: Still destroying... [id=/mnt/kvm/disks/base.qcow2, 4m50s elapsed]
libvirt_volume.base_volume: Still destroying... [id=/mnt/kvm/disks/base.qcow2, 5m0s elapsed]
╷
│ Error: error deleting storage pool: failed to remove pool '/mnt/kvm/disks': Directory not empty
│ 
│ 
╵
╷
│ Error: error refreshing pool for volume: Requested operation is not valid: storage pool 'disks' is not active

If /mnt/kvm/disks/base.qcow2 is removed manually and terraform destroy is rerun, the destroy succeeds. Performing the same steps directly with virsh also works:

virsh vol-delete base.qcow2 --pool disks
virsh pool-destroy disks
virsh pool-undefine disks

Additional information:

Do you have SELinux or Apparmor/Firewall enabled? Some special configuration?
SELinux enabled.

Have you tried to reproduce the issue without them enabled?
Yes, I did disable SELinux, getting the same result.

scabala commented 2 months ago

Hello, what happens if you specify depends_on on the volume resource?

I have a feeling this is not a bug but rather a side effect of how Terraform orders operations. Because the configuration declares no dependency between the two resources, Terraform treats them as independent and tries to delete them at the same time. That causes problems, because the volume actually lives in the pool and has to be deleted first.

However, that's just a theory that needs to be verified.
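
For reference, a minimal sketch of what that change could look like, based on the configuration in the original report. It has not been tested against this issue, so treat it as a way to check the theory rather than a confirmed fix. Either the explicit depends_on or the reference to the pool resource should be enough for Terraform to destroy the volume before the pool:

```hcl
resource "libvirt_volume" "base_volume" {
  name = "base.qcow2"

  # Option 1: reference the pool resource instead of the literal "disks".
  # The reference creates an implicit dependency, so on destroy Terraform
  # removes the volume first and the pool afterwards.
  pool = libvirt_pool.disks.name

  source = "http://mirror.netzwerge.de/almalinux/9/cloud/x86_64/images/AlmaLinux-9-GenericCloud-9.3-20231113.x86_64.qcow2"
  format = "qcow2"

  # Option 2: keep pool = "disks" and declare the ordering explicitly.
  # depends_on = [libvirt_pool.disks]
}
```

If terraform destroy still times out on the volume with the dependency in place, the ordering theory is ruled out and the problem is more likely in the provider itself.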