canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

Azure: cloud-init skips formatting the resource disk (ephemeral0) when user-data contains configuration for additional data disks #3687

Open ubuntu-server-builder opened 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #1879552

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2020-05-19T19:14:22.237422+00:00
date_fix_committed = None
date_fix_released = None
id = 1879552
importance = undecided
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1879552
milestone = None
owner = vtqanh
owner_name = Anh Vo (MSFT)
private = False
status = confirmed
submitter = vtqanh
submitter_name = Anh Vo (MSFT)
tags = ['azure']
duplicates = []

Launchpad user Anh Vo (MSFT)(vtqanh) wrote on 2020-05-19T19:14:22.237422+00:00

Deploying a bionic VM on Azure (Canonical:UbuntuServer:18.04-LTS:latest) with VM size Standard_DS1_V2 and an additional data disk, using the following config:

#cloud-config

disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: True
    overwrite: True

fs_setup:

mounts:

Expected Result:

Actual Result:

I used this command to create a VM with a data disk and pass in custom data:

az vm create -g -n vmname \
  --image Canonical:UbuntuServer:18.04-LTS:latest \
  --admin-username adminuser \
  --ssh-key-value @/home/user/.ssh/key.pub \
  --boot-diagnostics-storage storage_account \
  --size Standard_DS1_V2 \
  --custom-data ./customdata.yml \
  --data-disk-sizes-gb 32

I have attached the cloud-init log and the custom data

ubuntu-server-builder commented 1 year ago

Launchpad user Anh Vo (MSFT)(vtqanh) wrote on 2020-05-19T19:14:22.237422+00:00

Launchpad attachments: cloud-init log

ubuntu-server-builder commented 1 year ago

Launchpad user Ryan Harper(raharper) wrote on 2020-05-19T20:08:26.336831+00:00

Hi Anh,

The issue is with built-in config merging with user-data. The Azure datasource uses this built-in config:

#cloud-config

disk_setup:
  ephemeral0:
    table_type: gpt
    layout: [100]
    overwrite: True

fs_setup:

mounts:

And in your example the user provides this config

#cloud-config

disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: True
    overwrite: True

fs_setup:

mounts:

The Azure datasource will merge these configs together like so:

util.mergemanydict([userdata, builtin])

and the combined config looks like this:

disk_setup:
  /dev/disk/azure/scsi1/lun0:
    layout: true
    overwrite: true
    table_type: gpt
  ephemeral0:
    layout:

As you can see, fs_setup and mounts are lists, and the default merge behavior for lists is replacement, so the user-data's fs_setup and mounts override the built-in config. disk_setup is a dictionary, and dictionaries by default have their missing keys merged in.
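The effect described above can be reproduced with a small sketch. This is a simplified stand-in for the default merge, not cloud-init's actual merger code; the function names are hypothetical, and the fs_setup and mounts entries are illustrative values taken from other examples in this thread.

```python
import copy


def _fill_missing(target, source):
    """Copy keys from source that target lacks; recurse into nested dicts."""
    for key, value in source.items():
        if key not in target:
            target[key] = copy.deepcopy(value)
        elif isinstance(target[key], dict) and isinstance(value, dict):
            _fill_missing(target[key], value)
        # Lists and scalars are never combined: the earlier source wins.


def merge_first_wins(sources):
    """Merge dicts in priority order (earlier sources take precedence)."""
    merged = {}
    for source in sources:
        _fill_missing(merged, source)
    return merged


# Illustrative built-in config (assumed values, modelled on this thread).
builtin = {
    "disk_setup": {
        "ephemeral0": {"table_type": "gpt", "layout": [100], "overwrite": True},
    },
    "fs_setup": [{"device": "ephemeral0.1", "filesystem": "DEFAULT_FS"}],
    "mounts": [["/dev/ephemeral0", "/mnt", "auto", "defaults,noexec"]],
}

# Illustrative user-data that only describes the extra data disk.
userdata = {
    "disk_setup": {
        "/dev/disk/azure/scsi1/lun0": {
            "table_type": "gpt", "layout": True, "overwrite": True,
        },
    },
    "fs_setup": [
        {"device": "/dev/disk/azure/scsi1/lun0", "partition": 1, "filesystem": "ext4"},
    ],
    "mounts": [
        ["/dev/disk/azure/scsi1/lun0", "/datadisk1", "ext4",
         "defaults,nofail,discard", "0", "0"],
    ],
}

merged = merge_first_wins([userdata, builtin])
print(sorted(merged["disk_setup"]))  # both disks survive the dict merge
print(len(merged["fs_setup"]))       # only the user's single fs_setup entry remains
```

In this sketch disk_setup ends up with both the data disk and ephemeral0, while the built-in fs_setup and mounts entries for the ephemeral disk are dropped, which matches the symptom reported in this bug.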

This is expected behavior, reserving full user control over the built-in config. At this time, the only remedy is for users to replicate the built-in config in their user-data if they want the ephemeral disk configured the same way it would be without any user-supplied disk configuration:

#cloud-config

disk_setup:
  ephemeral0:
    table_type: gpt
    layout: [100]
    overwrite: True
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: True
    overwrite: True

fs_setup:

mounts:

ubuntu-server-builder commented 1 year ago

Launchpad user Ryan Harper(raharper) wrote on 2020-05-19T20:09:50.906865+00:00

I'd like to explore a couple of options here:

One possible solution is to have fs_setup and mounts work with a dictionary format (as well as supporting lists)

% yprint built-in-new.cfg
disk_setup:
    ephemeral0:
        layout:

% yprint user-data-new.cfg
disk_setup:
    /dev/disk/azure/scsi1/lun0:
        layout: true
        overwrite: true
        table_type: gpt
fs_setup:
    lun0:
        device: /dev/disk/azure/scsi1/lun0
        filesystem: ext4
        partition: 1
mounts:
    datadisk1:

print(yaml.dump(merged, default_flow_style=False, indent=4))
disk_setup:
    /dev/disk/azure/scsi1/lun0:
        layout: true
        overwrite: true
        table_type: gpt
    ephemeral0:
        layout:
        - 100
        overwrite: true
        table_type: gpt
fs_setup:
    ephemeral0.1:
        device: ephemeral0.1
        filesystem: DEFAULT_FS
    lun0:
        device: /dev/disk/azure/scsi1/lun0
        filesystem: ext4
        partition: 1
mounts:
    datadisk1:
    - /dev/disk/azure/scsi1/lun0
    - /datadisk1
    - ext4
    - defaults,nofail,discard
    - '0'
    - '0'
    mnt:
    - /dev/ephemeral0
    - /mnt
    - auto
    - defaults,noexec
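
To illustrate why the dictionary format would help, here is a minimal sketch that assumes fs_setup and mounts are keyed dictionaries, as in the proposal above. The fill_missing helper is a hypothetical simplification of a "merge missing keys" dict merge and is not cloud-init code; the keyed format itself is only a proposal in this thread.

```python
import copy


def fill_missing(target, source):
    """Recursively copy keys from source that target lacks (target wins)."""
    for key, value in source.items():
        if key not in target:
            target[key] = copy.deepcopy(value)
        elif isinstance(target[key], dict) and isinstance(value, dict):
            fill_missing(target[key], value)
    return target


# Built-in config in the proposed keyed form (values from the example above).
builtin_new = {
    "fs_setup": {
        "ephemeral0.1": {"device": "ephemeral0.1", "filesystem": "DEFAULT_FS"},
    },
    "mounts": {"mnt": ["/dev/ephemeral0", "/mnt", "auto", "defaults,noexec"]},
}

# User-data in the proposed keyed form.
user_new = {
    "fs_setup": {
        "lun0": {
            "device": "/dev/disk/azure/scsi1/lun0",
            "filesystem": "ext4",
            "partition": 1,
        },
    },
    "mounts": {
        "datadisk1": ["/dev/disk/azure/scsi1/lun0", "/datadisk1", "ext4",
                      "defaults,nofail,discard", "0", "0"],
    },
}

# User-data has priority; built-in entries are added only where keys are missing.
merged_new = fill_missing(copy.deepcopy(user_new), builtin_new)
print(sorted(merged_new["fs_setup"]))  # ['ephemeral0.1', 'lun0']
print(sorted(merged_new["mounts"]))    # ['datadisk1', 'mnt']
```

Because every entry has its own key, the user's lun0 and datadisk1 entries no longer displace the built-in ephemeral disk entries.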
ubuntu-server-builder commented 1 year ago

Launchpad user Ryan Harper(raharper) wrote on 2020-05-19T20:14:54.845917+00:00

Another option would be to make it easier for users to indicate they want the defaults in addition to their changes:

disk_setup:
  builtin: true
  /dev/disk/azure/scsi1/lun0: {...}

fs_setup:

mounts:
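
A rough sketch of how such a marker could be expanded, assuming a literal `builtin` entry in the fs_setup and mounts lists and a `builtin: true` key in disk_setup. Both helper functions and the marker handling are hypothetical; nothing like this exists in cloud-init as far as this thread indicates, and the built-in values are illustrative ones from elsewhere in this thread.

```python
def expand_list_marker(user_entries, builtin_entries, marker="builtin"):
    """Replace each marker entry in a list with the built-in entries."""
    expanded = []
    for entry in user_entries:
        if entry == marker:
            expanded.extend(builtin_entries)
        else:
            expanded.append(entry)
    return expanded


def expand_dict_marker(user_map, builtin_map, marker="builtin"):
    """Drop the marker key; if it was true, add built-in keys the user lacks."""
    result = {k: v for k, v in user_map.items() if k != marker}
    if user_map.get(marker) is True:
        for key, value in builtin_map.items():
            result.setdefault(key, value)
    return result


# Illustrative built-in defaults (assumed values).
builtin_disk = {"ephemeral0": {"table_type": "gpt", "layout": [100], "overwrite": True}}
builtin_fs = [{"device": "ephemeral0.1", "filesystem": "DEFAULT_FS"}]

# Hypothetical user-data that opts in to the defaults.
user_disk = {
    "builtin": True,
    "/dev/disk/azure/scsi1/lun0": {"table_type": "gpt", "layout": True, "overwrite": True},
}
user_fs = [
    "builtin",
    {"device": "/dev/disk/azure/scsi1/lun0", "partition": 1, "filesystem": "ext4"},
]

disk_setup = expand_dict_marker(user_disk, builtin_disk)
fs_setup = expand_list_marker(user_fs, builtin_fs)
print(sorted(disk_setup))
print([entry["device"] for entry in fs_setup])
```

With this approach the user-data stays short, and the defaults are resolved at boot time rather than copied into user-data, so they would track whatever the installed cloud-init version ships.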

ubuntu-server-builder commented 1 year ago

Launchpad user Anh Vo (MSFT)(vtqanh) wrote on 2020-05-19T20:38:58.489964+00:00

Would changing fs_setup and mounts to take a dictionary affect existing user-data out there? I do like the option of using "builtin" to let users keep whatever default setup the cloud provider has. That default might change between cloud-init versions, and users who keep the same user-data for a long time might not realize things have changed and so miss out on some optimization from the platform.

ubuntu-server-builder commented 1 year ago

Launchpad user Dan Watkins(oddbloke) wrote on 2020-05-19T21:09:23+00:00

On Tue, May 19, 2020 at 08:14:54PM -0000, Ryan Harper wrote:

Another option would be to make it easier for users to indicate they want the defaults in addition to their changes:

disk_setup:
  builtin: true
  /dev/disk/azure/scsi1/lun0: {...}

fs_setup:
  - builtin
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
  - builtin
  - [ /dev/disk/azure/scsi1/lun0, /datadisk1, "ext4", "defaults,nofail,discard", "0", "0" ]

An aside: we use "default" for a similar concept for users[0] and I think that word works in this case too ("include the default disk setup and this additional setup" is a perfectly understandable thing to say, for example); I would suggest using it for consistency across the interface we provide to users.

[0] https://cloudinit.readthedocs.io/en/latest/topics/modules.html#users-and-groups