mal opened this issue 4 years ago
> azurerm_virtual_machine_data_disk_attachment, which only attaches data disks to an existing VM post-boot
Oh, that is really unfortunate... I wish I could try this but I'm not even able to create a managed disk due to https://github.com/terraform-providers/terraform-provider-azurerm/issues/6029
If I'm following this thread correctly (as we are still using the legacy disk system and were looking to move over), can you not deploy VMs with disks already attached? Is it truly rebooting VMs for each disk (thread in #6314 above)? This feels like a HUGE step backwards, especially if the legacy mode we are using is being deprecated.
Also, how do you deploy and configure a data disk that is in the source reference image if the data disk block is no longer valid?
@lightdrive, I've worked around it by using ansible at https://github.com/rgl/terraform-ansible-azure-vagrant
This is something I just ran across as well; I'd like to be able to use cloud-init to configure the disks. Any news on a resolution?
This item is next on my list, no ETA yet though, sorry. I'll link it to a milestone when I've had a chance to size and scope it.
It seems that the work done by @jackofallops has been closed with a note that it needs to be implemented in a different way.
Does anyone have a possible work-around for this?
My use cases are like the ones others have pointed out:
Writing my own scripts for this instead of using cloud-init seems like a waste. Using the workaround mentioned in https://github.com/terraform-providers/terraform-provider-azurerm/issues/6074#issuecomment-626523919 might be possible, but seems too hacky indeed, and requires some large changes to how resources are created.
Alas, I was really looking forward to an official fix for this.
In lieu of that however, here's what I came up with about six months ago, having had no option but to make this work at minimum for newly booted VMs (note: this has not been tested with changes to, or replacements of, the disks - literally just booting new VMs). I'm also not really a Go person, and as a result this is definitely a hack and nothing even approaching a "good" solution, much less sane contents for a PR. Given that, be warned that whatever state is generated is almost certainly destined to be incompatible with whatever shape the official implementation yields should it ever land. But on the off chance it does prove useful in some capacity, or simply provides the embers to spark someone else's imagination, here's the horrible change I made to allow booting VMs with disks attached such that cloud-init could run correctly: https://github.com/terraform-providers/terraform-provider-azurerm/commit/6e19897658bb5b79418231ca1c004fde83698b40.
@mal FWIW this is being worked on, however the edge-cases make this more complicated than it appears - in particular we're trying to avoid several limitations from the older VM resources, which is why this isn't being lifted over 1:1 and is taking longer here.
Thanks for the insight @tombuildsstuff, great to know it's still being actively worked on. I put that commit out there in response to the request for possible workarounds, in case it was useful to someone who finds themselves in the position I was in previously, where waiting for something to cover all the cases wasn't an option. Please don't take that as any kind of slight or indictment of the ongoing efforts; I definitely support any official solution covering all the cases. In my case it just wasn't possible to wait for it, but I'll be first in line to move definitions over to it when it does land.
In case this helps anyone else... the main part to note is the top line, which waits for 3 disks before trying to format them, etc.
```yaml
write_files:
  - content: |
      #!/bin/bash
      # Wait for x disks to be available
      while [ `ls -l /dev/disk/azure/scsi1 | grep lun | wc -l` -lt 3 ]; do echo waiting on disks...; sleep 5; done
      DISK=$1
      DISK_PARTITION=$DISK"-part1"
      VG=$2
      VOL=$3
      MOUNTPOINT=$4
      # Partition disk
      sed -e 's/\s*\([\+0-9a-zA-Z]*\).*/\1/' << EOF | fdisk $DISK
      n # new partition
      p # primary partition
      1 # partition number 1
      # default - start at beginning of disk
      # default - end of the disk
      w # write the partition table
      q # and we're done
      EOF
      # Create physical volume
      pvcreate $DISK_PARTITION
      # Create volume group
      if [[ -z `vgs | grep $VG` ]]; then
        vgcreate $VG $DISK_PARTITION
      else
        vgextend $VG $DISK_PARTITION
      fi
      # Create logical volume (defaults to all free space if SIZE is unset)
      if [[ -z $SIZE ]]; then
        SIZE="100%FREE"
      fi
      lvcreate -l $SIZE -n $VOL $VG
      # Create filesystem
      mkfs.ext3 -m 0 /dev/$VG/$VOL
      # Add to fstab
      echo "/dev/$VG/$VOL $MOUNTPOINT ext3 defaults 0 2" >> /etc/fstab
      # Create mount point
      mkdir -p $MOUNTPOINT
      # Mount
      mount $MOUNTPOINT
    path: /run/create_fs.sh
    permissions: '0700'
runcmd:
  - /run/create_fs.sh /dev/disk/azure/scsi1/lun1 vg00 vol1 /oracle
  - /run/create_fs.sh /dev/disk/azure/scsi1/lun2 vg00 vol2 /oracle/diag
```
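For reference, the Terraform side of this pairs the script with the separate attachment resource discussed in this thread. This is only a minimal sketch: the resource names, disk sizes and the azurerm_linux_virtual_machine.example / azurerm_resource_group.example references are placeholders, and the LUN numbering assumes three empty disks at LUNs 1-3 to match the wait loop above.

```hcl
resource "azurerm_managed_disk" "oracle" {
  count                = 3
  name                 = "oracle-data-${count.index + 1}"
  location             = azurerm_resource_group.example.location
  resource_group_name  = azurerm_resource_group.example.name
  storage_account_type = "Premium_LRS"
  create_option        = "Empty"
  disk_size_gb         = 128
}

# Attaches each disk only after the VM exists; the cloud-init wait loop above
# blocks until all three LUNs show up under /dev/disk/azure/scsi1.
resource "azurerm_virtual_machine_data_disk_attachment" "oracle" {
  count              = 3
  managed_disk_id    = azurerm_managed_disk.oracle[count.index].id
  virtual_machine_id = azurerm_linux_virtual_machine.example.id
  lun                = count.index + 1
  caching            = "ReadWrite"
}
```

The trade-off, as noted throughout this thread, is that the disks only arrive after the VM has booted - which is exactly why the wait loop is needed.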
Simple use case that needs to work: an Azure image has an OS disk and a data disk, and a VM (Linux or Windows) now needs to be provisioned from that image. @tombuildsstuff, let's not mark this as off-topic again. Data disk properties need to be configurable at creation. This is blocking too many use cases from being implemented.
@ruandersMSFT that's what this issue is tracking - you can find the latest update here
As per the community note above: Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request - which is why comments are marked as off-topic - we ask instead that users add a 👍 to the issue.
In this case, should we instead use the deprecated azurerm_virtual_machine resource?
Is there an ETA for this? The previous SSH AAD login for Linux extension has been deprecated and the new one requires assigning a system managed identity, which requires use of the identity block that azurerm_virtual_machine doesn't support... (And since azurerm_linux_virtual_machine doesn't support adding/attaching data/storage disks, we are stuck.)
Is there another way via Terraform to add a system managed identity that doesn't involve using a local-exec provisioner to call az cli with azurerm_virtual_machine?
As a workaround to use azurerm_linux_virtual_machine, the following cloud-init snippet waits at the bootcmd stage until the data disk is available:
```yaml
bootcmd:
  - until [ -e /dev/disk/azure/scsi1/lun0 ]; do sleep 1; done

disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: True
    overwrite: False

fs_setup:
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4
    overwrite: False

mounts:
  - [/dev/disk/azure/scsi1/lun0-part1, /data]

growpart:
  mode: auto
  devices:
    - /
    - /dev/disk/azure/scsi1/lun0-part1
  ignore_growroot_disabled: true

write_files:
  - content: |
      #!/bin/sh
      resize2fs -f /dev/disk/azure/scsi1/lun0-part1
    path: /var/lib/cloud/scripts/per-boot/resize2fs.sh
    permissions: 0755
```
The growpart and write_files sections are optional; they resize the partition and the filesystem on every boot.
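For completeness, here's one way such a snippet is typically wired into azurerm_linux_virtual_machine. This is purely a sketch: the resource names, the Ubuntu image reference and the cloud-init.yaml file name are assumptions, and the data disk at LUN 0 still has to be attached separately (e.g. via azurerm_virtual_machine_data_disk_attachment), which is what the bootcmd loop waits for.

```hcl
resource "azurerm_linux_virtual_machine" "example" {
  name                  = "example-vm"
  resource_group_name   = azurerm_resource_group.example.name
  location              = azurerm_resource_group.example.location
  size                  = "Standard_D2s_v3"
  admin_username        = "azureuser"
  network_interface_ids = [azurerm_network_interface.example.id]

  admin_ssh_key {
    username   = "azureuser"
    public_key = file("~/.ssh/id_rsa.pub")
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts-gen2"
    version   = "latest"
  }

  # cloud-init user data has to be handed over base64-encoded; it runs on first boot.
  custom_data = base64encode(file("${path.module}/cloud-init.yaml"))
}
```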
NOTE: the azurerm_windows_virtual_machine_scale_set resource supports deploying from a gallery image version which includes a data disk. I have this working successfully; however, deploying a standard VM with the azurerm_windows_virtual_machine resource using the same gallery image version as the source does not support the data disks, weirdly...
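For contrast, a rough sketch of the scale-set side is below: a repeatable data_disk block (including create_option = "FromImage") is accepted there at creation time. All names, sizes and the gallery image version variable are placeholders, and the exact arguments should be checked against the provider docs for the version in use.

```hcl
resource "azurerm_windows_virtual_machine_scale_set" "example" {
  name                 = "example-vmss"
  computer_name_prefix = "examplevm"
  resource_group_name  = azurerm_resource_group.example.name
  location             = azurerm_resource_group.example.location
  sku                  = "Standard_D2s_v3"
  instances            = 1
  admin_username       = "azureuser"
  admin_password       = var.admin_password

  # Gallery image version that carries a data disk alongside the OS disk (placeholder variable).
  source_image_id = var.gallery_image_version_id

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  # The scale-set resource accepts data_disk blocks, so the image's data
  # disk can be declared here at creation time.
  data_disk {
    lun                  = 0
    caching              = "ReadWrite"
    disk_size_gb         = 128
    storage_account_type = "Standard_LRS"
    create_option        = "FromImage"
  }

  network_interface {
    name    = "example-nic"
    primary = true

    ip_configuration {
      name      = "internal"
      primary   = true
      subnet_id = azurerm_subnet.example.id
    }
  }
}
```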
Any update on this feature?
@redeux You indicated this is blocked by https://github.com/hashicorp/terraform-plugin-sdk/issues/220, but per the latest comments on that issue, it is being closed with a decision not to implement the proposed changes.
Given that, can you provide any additional information on the future of this issue? Being blocked by something that isn't going to happen seems like a dead end.
Is there an update on this? It is blocking the use of a number of custom images that depend on one or more data disks.
Any progress?
I'm looking to deploy a Sophos XG VM in Azure through Terraform, and it creates a data disk from the image.
Any traction on this? Do we need to add development resources? This has been asked for, for years now.
Better yet, can we just backport the features to azurerm_virtual_machine and not deprecate it? I'd like to use the user_data field with the old model, which should be a trivial feature addition. Because that field isn't supported in the old model, I have to use this new model, which doesn't allow for drive attachment before VM creation/boot.
Booting a VM and hot swapping drives via drive attach is a major regression.
I think a hacky workaround is that Azure deployment templates are able to deploy a VM and attach a disk at creation. So you can:
(1) Make an Azure deployment template for the VM you need (it's easy to do this in the Azure console by manually configuring the VM and clicking the "Download template for automation" button). (2) Deploy that template using "azurerm_resource_group_template_deployment" (or outside of Terraform).
I'd much rather have the terraform resource support this, but I think something like this might be a stopgap. I'm trying to get this integrated into our process now and it's working so far.
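To make that stopgap concrete, step (2) might look roughly like the following. The template file name and parameter names here are assumptions; the JSON would be whatever the "Download template for automation" button produced for the VM (data disks included).

```hcl
resource "azurerm_resource_group_template_deployment" "vm" {
  name                = "vm-with-data-disks"
  resource_group_name = azurerm_resource_group.example.name
  deployment_mode     = "Incremental"

  # ARM template exported from the portal for the manually configured VM.
  template_content = file("${path.module}/vm-template.json")

  # Parameter values the exported template expects (placeholder names).
  parameters_content = jsonencode({
    vmName        = { value = "example-vm" }
    adminPassword = { value = var.admin_password }
  })
}
```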
I expect this will be marked off-topic, but after nearly 3 years since it was opened, this issue needs more attention.
The AzureRM provider has put its users in a bad place here. There are critical features of Azure that are now inaccessible as mentioned by others in this thread. Because my shared OS image has data disks, I cannot use dedicated hosts, cloud-init, or proper identity support for my virtual machine, and this list will only continue to grow because the cloud never stops moving.
How can we as a community help here? There is clearly a lot of development effort going into this provider, judging by the changelog and rate of pull requests; can we raise the priority of this issue?
There is certainly an opportunity for more transparency on why this hasn't moved and other items are getting development attention.
If there is a clean way to migrate from virtual_machine to deployment template, I can live with that, but current terraform will try to do unexpected things due to how they've implemented deployment templates as well.
@jackofallops it's hard to tell in this thread, but it looks like you may have added this to the "blocked" milestone. It's no longer clear in the thread what is blocking this issue. Can you clarify? We are seeing a lot of activity in this thread and it's the third-most 👍'd issue.
What is the state of this issue? Is it blocked?
It's currently 3 years old and we still can't build a VM from a template which has data disks?
The azurerm_linux_virtual_machine docs include "storage_data_disk" as a valid block, but terraform plan errors out claiming it is unsupported. I tried a dynamic block and a standard block - with a precreated disk to "attach" and "empty" with no disk created - and all failed.
When I've seen this error before it was either a syntax error or a no longer supported block type.
Is this a documentation bug?
Versions:
```text
KV C:\Users\ksvietme\repos\Terraform\azure\VMs\linuxvm_2> terraform version
Terraform v1.4.6
on windows_amd64
+ provider registry.terraform.io/hashicorp/azurerm v3.57.0
+ provider registry.terraform.io/hashicorp/random v3.5.1
+ provider registry.terraform.io/hashicorp/template v2.2.0
```
Error:
```text
╷
│ Error: Unsupported block type
│
│   on linuxvm_2.main.tf line 162, in resource "azurerm_linux_virtual_machine" "linuxvm01":
│  162:   dynamic "storage_data_disk" {
│
│ Blocks of type "storage_data_disk" are not expected here.
╵
```
Disk creation (works):
```hcl
resource "azurerm_managed_disk" "lun1" {
  name                 = "lun17865"
  location             = azurerm_resource_group.linuxvm_rg.location
  resource_group_name  = azurerm_resource_group.linuxvm_rg.name
  storage_account_type = "Standard_LRS"
  create_option        = "Empty"
  disk_size_gb         = "100"

  tags = {
    environment = "staging"
  }
}
```
Call to storage_data_disk:
resource "azurerm_linux_virtual_machine" "linuxvm01" {
location = azurerm_resource_group.linuxvm_rg.location
resource_group_name = azurerm_resource_group.linuxvm_rg.name
size = var.vm_size
# Make sure hostname matches public IP DNS name
name = var.vm_name
computer_name = var.vm_name
# Attach NICs (created in linuxvm_2.network)
network_interface_ids = [
azurerm_network_interface.primary.id,
]
# Reference the cloud-init file rendered earlier
# for post bringup configuration
custom_data = data.template_cloudinit_config.config.rendered
###--- Admin user
admin_username = var.username
admin_password = var.password
disable_password_authentication = false
admin_ssh_key {
username = var.username
public_key = file(var.ssh_key)
}
###--- End Admin User
dynamic "storage_data_disk" {
content {
name = azurerm_managed_disk.lun1.name
managed_disk_id = azurerm_managed_disk.lun1.id
disk_size_gb = azurerm_managed_disk.lun1.disk_size_gb
caching = "ReadWrite"
create_option = "Attach"
lun = 1
}
}
### Image and OS configuration
source_image_reference {
publisher = var.publisher
offer = var.offer
sku = var.sku
version = var.ver
}
os_disk {
name = var.vm_name
caching = var.caching
storage_account_type = var.sa_type
}
# For serial console and monitoring
boot_diagnostics {
storage_account_uri = azurerm_storage_account.diagstorageaccount.primary_blob_endpoint
}
tags = {
# Enable/Disable hyperthreading (requires support ticket to enable feature)
"platformsettings.host_environment.disablehyperthreading" = "false"
}
}
###--- End VM Creation
Thanks. I'm sure I'm missing something here.
So is this not possible? And if not now, will this be possible in the future, as azurerm_virtual_machine becomes deprecated?
`resource "azurerm_linux_virtual_machine" "example_name" { name = "${var.lin_machine_name}"
source_image_id = "/subscriptions/XXXXXXXXX/resourceGroups/example_RG/Microsoft.Compute/galleries/example_gallary/images/example_image/versions/0.0.x"
os_disk { name = "lin_name" caching = "ReadWrite" storage_account_type = "StandardSSD_LRS" }
depends_on = [
] }`
Essentially my 'source_image_id' has a snapshot of an image with 2 data disks attached. However, when doing a 'terraform apply' I get the following error: "Original Error: Code="InvalidParameter" Message="StorageProfile.dataDisks.lun does not have required value(s) for image specified in storage profile." Target="storageProfile""
I have tried using the "data_disk" option, but this is not supported as stated above.
```hcl
data_disks {
  lun                  = 0
  create_option        = "FromImage"
  disk_size_gb         = 1024
  caching              = "None"
  storage_account_type = "Premium_LRS"
}

data_disks {
  lun                  = 1
  create_option        = "FromImage"
  disk_size_gb         = 512
  caching              = "None"
  storage_account_type = "Premium_LRS"
}
```
Are there any other suggestions, or will this be included in terraform in the near future?
I feel I must be missing something here as my scenario seems like it would be so common that this issue would need to have been addressed much sooner.
I am trying to use Packer to build CIS/STIG compliant VMs for golden images. Part of the spec has several folders that need to go onto non-root partitions. To achieve this I added a drive, added the partitions, and moved data around. We also use LVM in order to meet availability requirements if a partition gets full. I used az cli to boot the VM and was also able to add an additional data drive using the --data-disk-sizes-gb option, so I know the control plane will handle it.
When I try to use the VM with Terraform I get the StorageProfile error mentioned above. Is there really no viable workaround for building golden images with multiple disks and using TF to create the VMs?
@shaneholder for now, the generally accepted workaround (which I have used successfully) is to use a secondary azurerm_virtual_machine_data_disk_attachment resource to attach the disk, and the cloud-init script recommended by @agehrig in this comment.
It would be great to hear from the developers as to exactly why this is still blocked, since it's unclear to everyone here especially given the popularity of the request.
@djryanj thanks for the reply. I'm trying to understand it in the context of my problem though. The image in the gallery already has 2 disks, 1 OS and 1 data, and right now I'm not trying to add another disk (but that would be the next logical step). The issue I'm having is that I can't even get to the point where the VM has been created.
I ran TF with a trace and found the PUT command that creates the VM, and what I believe is happening is that TF is incorrectly adding a "dataDisks": [] element to the JSON sent in the PUT request. If I take the JSON data for the PUT, remove that element, and then run the PUT command manually, the VM is created with 2 disks as expected.
@shaneholder ah, I understand. If the gallery image has 2 disks and is not deployable via Terraform using the azurerm_linux_virtual_machine resource because of that, I don't think it's solvable using the workaround I suggested, and I'm afraid I don't know what to suggest other than moving back to an azurerm_virtual_machine resource, or getting a working ARM template for the deployment and using something like an azurerm_resource_group_template_deployment resource to deploy that from the working template - which is awful, but would work.
@tombuildsstuff - I'm sure you can see the activity here. Any input?
A little more information. I just ran the same TF but used a VM image that does not have a data disk built in. That PUT request also has the "dataDisks": [] element in the JSON, but instead of failing it succeeds and builds the VM. So it seems that if a VM image has an existing data disk and the dataDisks element is passed in the JSON then the VM build will fail; however, if the VM image does not have a data disk then the dataDisks element can be sent and the VM will build.
Another piece to the puzzle. I set the logging option for az cli and noticed that it adds the following dataDisks element when I specify additional disks. The lun:0 object is the disk that is built into the image. If I run similar code in TF, the dataDisks property is an empty array rather than an array that includes the dataDiskImages from the VM image version combined with the additional disks I asked to be attached.
"dataDisks": [
{
"lun": 0,
"managedDisk": {
"storageAccountType": null
},
"createOption": "fromImage"
},
{
"lun": 1,
"managedDisk": {
"storageAccountType": null
},
"createOption": "empty",
"diskSizeGB": 30
},
{
"lun": 2,
"managedDisk": {
"storageAccountType": null
},
"createOption": "empty",
"diskSizeGB": 35
}
]
Alright, so I cloned the repo and fiddled around a bit. I hacked the linux_virtual_machine_resource.go file around line 512. I changed:
```go
DataDisks: &[]compute.DataDisk{},
```
to:
```go
DataDisks: &[]compute.DataDisk{
    {
        Lun:          utils.Int32(0),
        CreateOption: compute.DiskCreateOptionTypesFromImage,
        ManagedDisk:  &compute.ManagedDiskParameters{},
    },
},
```
And I was able to build my VM with the two drives that are declared in the image in our gallery. Additionally I was also able to add a third disk using the azurerm_managed_disk/azurerm_virtual_machine_data_disk_attachment.
I was trying to determine how to find the dataDiskImages from the image in the gallery but I've not been able to suss that out yet. It seems that what needs to be done is the code should pull the dataDiskImages property and do a similar conversion as it does with the osDisk.
Hoping that @tombuildsstuff can help me out then maybe I can PR a change?
Ok, so on a hunch I completely commented out the DataDisks property and ran it again, and it worked: I created a VM with both the included image data drive AND an attached drive.
👋 hey folks
To give an update on this one, unfortunately this issue is still blocked due to a combination of the behaviour of the Azure API (specifically the CreateOption field) and limitations of the Terraform Plugin SDK.
We've spent a considerable amount of time trying to solve this; however given the number of use-cases for disks, every technical solution possible using the Terraform Plugin SDK has hit a wall for some subset of users which means that Terraform Plugin Framework is required to solve this. Unfortunately this requires bumping the version of the Terraform Protocol being used - which is going to bump the minimum required version of Terraform.
Although bumping the minimum version of Terraform is something that we've had scheduled for 4.0 for a long time - unfortunately that migration in a codebase this size is non-trivial, due to the design of Terraform Plugin Framework being substantially different to the Terraform Plugin SDK, which (amongst other things) requires breaking configuration changes.
Whilst porting over the existing data_disks implementation seems a reasonable solution, unfortunately the existing implementation is problematic enough that we'd need to introduce further breaking changes to fix this properly once we go to Terraform Plugin Framework. In the interim, the way to attach data disks to a Virtual Machine is by using the azurerm_virtual_machine_data_disk_attachment resource.
Moving forward we plan to open a Meta Issue tracking Terraform Plugin Framework in the not-too-distant future, however there's a number of items that we need to resolve before doing so.
We understand that's disheartening to hear; we're trying to unblock this and several of the other larger issues - but equally we don't want to give folks false hope that this is a quick win when doing so would cause larger issues.
Given the amount of activity on this thread - I'm going to temporarily lock this issue for the moment to avoid setting incorrect expectations - but we'll post an update as soon as we can.
To reiterate/TL;DR: adding support for Terraform Plugin Framework is a high priority for us and will unblock work on this feature request. We plan to open a Meta Issue for that in the not-too-distant future - which we'll post an update about here when that becomes available.
Thank you all for your input, please bear with us - and we'll post an update as soon as we can.
Community Note
Description
Azure allows VMs to be booted with managed data disks pre-attached/attached-on-boot. This enables use cases where cloud-init and/or other "on-launch" configuration management tooling is able to prepare them for use as part of the initialisation process.

This provider currently only supports this case for individual VMs with the older, deprecated azurerm_virtual_machine resource. The new azurerm_linux_virtual_machine and azurerm_windows_virtual_machine resources instead opt to push users towards the separate azurerm_virtual_machine_data_disk_attachment resource, which only attaches data disks to an existing VM post-boot and so fails to service the use case laid out above. This is in contrast to the respective *_scale_set resources, which (albeit out of necessity) support this behaviour.

Please could a repeatable data_disk block be added to the new VM resources (analogous to the same block in their scale_set counterparts) in order to allow VMs to be started with managed data disks pre-attached.

Thanks!
New or Affected Resource(s)
- azurerm_linux_virtual_machine
- azurerm_windows_virtual_machine
Potential Terraform Configuration
References