GoogleCloudPlatform / compute-archlinux-image-builder

A tool to build an Arch Linux image for GCE
Apache License 2.0

Request: enable UEFI_COMPATIBLE on public image #44

Closed: shuLhan closed this issue 2 years ago

shuLhan commented 2 years ago

Currently the public image for project arch-linux-gce does not have the UEFI_COMPATIBLE flag, only VIRTIO_SCSI_MULTIQUEUE:

$ gcloud compute images describe-from-family arch --project=arch-linux-gce
archiveSizeBytes: '810095680'
creationTimestamp: '2022-03-15T12:36:46.018-07:00'
description: Arch Linux built on 20220315.
diskSizeGb: '10'
family: arch
guestOsFeatures:
- type: VIRTIO_SCSI_MULTIQUEUE
id: '6771268489385432098'
kind: compute#image
labelFingerprint: 42WmSpB8rSM=
name: arch-v20220315
rawDisk:
  containerType: TAR
  source: ''
selfLink: https://www.googleapis.com/compute/v1/projects/arch-linux-gce/global/images/arch-v20220315
sourceType: RAW
status: READY
storageLocations:
- us
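A quick way to check for the flag without reading the whole description is to filter the output. This is only a minimal sketch: `has_guest_os_feature` is a hypothetical helper name (not part of gcloud), and it assumes the YAML layout shown above, where each feature sits on its own `- type: NAME` line.

```shell
#!/bin/sh
# has_guest_os_feature FEATURE  (hypothetical helper, not a gcloud command)
# Reads "gcloud compute images describe..." YAML on stdin and succeeds
# if a "- type: FEATURE" line is present (leading indentation allowed).
has_guest_os_feature() {
    grep -Eq "^[[:space:]]*- type: $1[[:space:]]*\$"
}

# Example (requires gcloud and network access, so it is left commented out):
#   gcloud compute images describe-from-family arch --project=arch-linux-gce \
#       | has_guest_os_feature UEFI_COMPATIBLE && echo "UEFI_COMPATIBLE is set"
```

Because the helper only reads stdin, it should work the same way on `gcloud compute instances describe` output, where the feature list is indented under each disk.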

According to the documentation [1], a UEFI-compatible image enables several features: Secure Boot, Virtual Trusted Platform Module (vTPM), and integrity monitoring.

Another advantage is that it would let users swap the boot disk of an instance created from a non-Arch image with UEFI_COMPATIBLE for a disk created from the Arch Linux image. For example, here is the error when replacing a centos-8 boot disk with arch:

>>> local:  35: gcloud [compute disks create demo-root --zone=asia-southeast1-b --image-project=arch-linux-gce --image-family=arch --type=pd-ssd --size=10GB]
Created [https://www.googleapis.com/compute/v1/projects/<redacted>/zones/asia-southeast1-b/disks/demo-root].
NAME                 ZONE               SIZE_GB  TYPE    STATUS
demo-root  asia-southeast1-b  10       pd-ssd  READY

>>> local:  43: gcloud [compute instances attach-disk demo --zone=asia-southeast1-b --disk=demo-root --boot]
ERROR: (gcloud.compute.instances.attach-disk) Could not fetch resource:
 - UEFI setting must be the same for the instance and the boot disk.

~Question: how to create an image with UEFI_COMPATIBLE enable?~

Request: enable UEFI_COMPATIBLE on public image.

[1] https://cloud.google.com/compute/docs/images/create-delete-deprecate-private-images#guest-os-features

shuLhan commented 2 years ago

For the question "how to create an image with UEFI_COMPATIBLE": I can create a new image from the existing public image using the following command:

$ gcloud compute images create arch-v20220315-1 \
  --source-image=arch-v20220315 \
  --source-image-project=arch-linux-gce \
  --guest-os-features=VIRTIO_SCSI_MULTIQUEUE,UEFI_COMPATIBLE,GVNIC,SEV_CAPABLE \
  --storage-location=asia
Created [https://www.googleapis.com/compute/v1/projects/<redacted>/global/images/arch-v20220315-1].
NAME              PROJECT            FAMILY  DEPRECATED  STATUS
arch-v20220315-1  <redacted>                      READY

This will store the new image in your project.

So far it's bootable.

shuLhan commented 2 years ago

Thanks @toastwaffle !

shuLhan commented 2 years ago

@toastwaffle

Something is not right with the latest image. Here is the log:

CSM BBS Table full.

UEFI: Failed to load image.
Description: UEFI Google PersistentDisk
FilePath: PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0)
OptionNumber: 1.
Status: Not Found.

BdsDxe: failed to load Boot0001 "UEFI Google PersistentDisk " from PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0): Not Found
SeaBIOS (version 1.8.2-google)
Machine UUID b7ea187a-2668-9586-8f8d-d6d0a2e12b18
found virtio-scsi at 0:3
virtio-scsi vendor='Google' product='PersistentDisk' rev='1' type=0 removable=0
virtio-scsi blksize=512 sectors=20971520 = 10240 MiB
virtio-scsi vendor='Google' product='PersistentDisk' rev='1' type=0 removable=0
virtio-scsi blksize=512 sectors=41943040 = 20480 MiB
drive 0x000f3990: PCHS=0/0/0 translation=lba LCHS=1024/255/63 s=20971520
drive 0x000f3950: PCHS=0/0/0 translation=lba LCHS=1024/255/63 s=41943040
Sending Seabios boot VM event.
Booting from Hard Disk 0...
:: performing fsck on '/dev/sda2'
root: recovering journal
root: clean, 39304/655360 files, 389936/2620672 blocks
:: mounting '/dev/sda2' on real root

Welcome to Arch Linux!

[  OK  ] Created slice Slice /system/dhclient.
[  OK  ] Created slice Slice /system/getty.
[  OK  ] Created slice Slice /system/growpartfs.
[  OK  ] Created slice Slice /system/modprobe.
[  OK  ] Created slice Slice /system/serial-getty.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Started Dispatch Password …ts to Console Directory Watch.
[  OK  ] Started Forward Password R…uests to Wall Directory Watch.
[  OK  ] Set up automount Arbitrary…s File System Automount Point.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Reached target Local Integrity Protected Volumes.
[  OK  ] Reached target Path Units.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Reached target Slice Units.
[  OK  ] Reached target Swaps.
[  OK  ] Reached target Local Verity Protected Volumes.
[  OK  ] Listening on Device-mapper event daemon FIFOs.
[  OK  ] Listening on Process Core Dump Socket.
[  OK  ] Listening on Journal Audit Socket.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
[  OK  ] Listening on udev Control Socket.
[  OK  ] Listening on udev Kernel Socket.
         Mounting Huge Pages File System...
         Mounting POSIX Message Queue File System...
         Mounting Kernel Debug File System...
         Mounting Kernel Trace File System...
         Mounting Temporary Directory /tmp...
         Starting Create List of Static Device Nodes...
         Starting Load Kernel Module configfs...
         Starting Load Kernel Module drm...
         Starting Load Kernel Module fuse...
         Starting Journal Service...
         Starting Load Kernel Modules...
         Starting Remount Root and Kernel File Systems...
         Starting Coldplug All udev Devices...
[  OK  ] Mounted Huge Pages File System.
[  OK  ] Mounted POSIX Message Queue File System.
[  OK  ] Mounted Kernel Debug File System.
[  OK  ] Mounted Kernel Trace File System.
[  OK  ] Mounted Temporary Directory /tmp.
[  OK  ] Finished Create List of Static Device Nodes.
[  OK  ] Finished Load Kernel Module configfs.
[  OK  ] Started Journal Service.
[  OK  ] Finished Load Kernel Module drm.
[  OK  ] Finished Load Kernel Module fuse.
[  OK  ] Finished Load Kernel Modules.
[  OK  ] Finished Remount Root and Kernel File Systems.
[    1.794057] systemd-modules-load[153]: Inserted module 'sg'
[    1.882947] systemd[1]: modprobe@drm.service: Deactivated successfully.
[    1.895815] systemd[1]: Finished Load Kernel Module drm.
[    1.909241] systemd[1]: modprobe@fuse.service: Deactivated successfully.
[    1.922562] systemd[1]: Finished Load Kernel Module fuse.
[    1.935877] systemd[1]: Finished Load Kernel Modules.
         Mounting FUSE Control File System...
[    1.950756] systemd[1]: Finished Remount Root and Kernel File Systems.
[    1.965650] systemd[1]: Mounting FUSE Control File System...
[    1.979244] systemd[1]: Mounting Kernel Configuration File System...
         Mounting Kernel Configuration File System...
[    2.005777] systemd[1]: First Boot Wizard was skipped because of a failed condition check (ConditionFirstBoot=yes).
         Starting Flush Journal to Persistent Storage...
         Starting Load/Save Random Seed...
         Starting Apply Kernel Variables...
         Starting Create Static Device Nodes in /dev...
[  OK  ] Finished Coldplug All udev Devices.
[  OK  ] Mounted FUSE Control File System.
[  OK  ] Mounted Kernel Configuration File System.
[  OK  ] Finished Load/Save Random Seed.
[  OK  ] Finished Apply Kernel Variables.
[  OK  ] Finished Flush Journal to Persistent Storage.
[    2.091427] systemd[1]: Rebuild Hardware Database was skipped because of a failed condition check (ConditionNeedsUpdate=/etc).
[    2.118335] systemd[1]: Starting Flush Journal to Persistent Storage...
[    2.132453] systemd[1]: Starting Load/Save Random Seed...
[    2.145866] systemd[1]: Starting Apply Kernel Variables...
[    2.159309] systemd[1]: Create System Users was skipped because of a failed condition check (ConditionNeedsUpdate=/etc).
[    2.172611] systemd[1]: Starting Create Static Device Nodes in /dev...
[  OK  ] Finished Create Static Device Nodes in /dev.
[    2.187152] systemd[1]: Finished Coldplug All udev Devices.
[    2.212585] systemd[1]: Mounted FUSE Control File System.
[    2.225769] systemd[1]: Mounted Kernel Configuration File System.
[    2.239133] systemd[1]: Finished Load/Save Random Seed.
[    2.252436] systemd[1]: Finished Apply Kernel Variables.
[  OK  ] Reached target Preparation for Local File Systems.
[  OK  ] Reached target Local File Systems.
[    2.267217] systemd[1]: Finished Flush Journal to Persistent Storage.
[    2.305776] systemd[1]: First Boot Complete was skipped because of a failed condition check (ConditionFirstBoot=yes).
[    2.319146] systemd[1]: Finished Create Static Device Nodes in /dev.
[    2.332529] systemd[1]: Reached target Preparation for Local File Systems.
[    2.345766] systemd[1]: Virtual Machine and Container Storage (Compatibility) was skipped because of a failed condition check (ConditionPathExists=/var/lib/machines.raw).
         Starting Create Volatile Files and Directories...
[    2.360487] systemd[1]: Reached target Local File Systems.
[    2.385879] systemd[1]: Rebuild Dynamic Linker Cache was skipped because all trigger condition checks failed.
[    2.399179] systemd[1]: Set Up Additional Binary Formats was skipped because all trigger condition checks failed.
[    2.412456] systemd[1]: Store a System Token in an EFI Variable was skipped because of a failed condition check (ConditionPathExists=/sys/firmware/efi/efivars/LoaderFeatures-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f).
[    2.429106] systemd[1]: Commit a transient machine-id on disk was skipped because of a failed condition check (ConditionPathIsMountPoint=/etc/machine-id).
[    2.442421] systemd[1]: Starting Create Volatile Files and Directories...
         Starting Rule-based Manage…for Device Events and Files...
[    2.457296] systemd[1]: Starting Rule-based Manager for Device Events and Files...
[    2.471673] systemd-udevd[167]: Network interface NamePolicy= disabled on kernel command line, ignoring.
[  OK  ] Started Rule-based Manager for Device Events and Files.
[    2.519474] systemd[1]: Started Rule-based Manager for Device Events and Files.
[    2.535769] systemd[1]: Finished Create Volatile Files and Directories.
[  OK  ] Finished Create Volatile Files and Directories.
         Starting Network Time Synchronization...
[    2.564460] systemd[1]: Rebuild Journal Catalog was skipped because of a failed condition check (ConditionNeedsUpdate=/var).
[    2.569967] systemd[1]: Starting Network Time Synchronization...
[    2.575509] systemd[1]: Update is Completed was skipped because all trigger condition checks failed.
[    2.589107] systemd[1]: Starting Record System Boot/Shutdown in UTMP...
         Starting Record System Boot/Shutdown in UTMP...
[    2.628796] systemd[1]: Finished Record System Boot/Shutdown in UTMP.
[  OK  ] Finished Record System Boot/Shutdown in UTMP.
[  OK  ] Started Network Time Synchronization.
[  OK  ] Reached target System Initialization.
[    2.713623] piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
[  OK  ] Started NSS cache refresh timer.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target System Time Set.
[    2.690344] systemd[1]: Started Network Time Synchronization.
[    2.769233] systemd[1]: Reached target System Initialization.
[    2.782548] systemd[1]: Started NSS cache refresh timer.
[    2.795700] systemd[1]: Started Daily Cleanup of Temporary Directories.
[    2.809052] systemd[1]: Reached target System Time Set.
[    2.822374] systemd[1]: Started Daily verification of password and group files.
[  OK  ] Started Daily verification of password and group files.
[  OK  ] Started Daily verification of password and group files.
[  OK  ] Reached target Timer Units.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Socket Units.
[  OK  ] Reached target Basic System.
[    2.849297] systemd[1]: Reached target Timer Units.
[    2.904429] systemd[1]: Listening on D-Bus System Message Bus Socket.
[    2.919207] systemd[1]: Reached target Socket Units.
[    2.932365] systemd[1]: Reached target Basic System.
[    2.945737] systemd[1]: Starting D-Bus System Message Bus...
         Starting D-Bus System Message Bus...
[  OK  ] Started dhclient on eth0.
[    2.973415] systemd[1]: Started dhclient on eth0.
[    2.992440] systemd[1]: Reached target Network.
[  OK  ] Reached target Network.
[  OK  ] Reached target Network is Online.
         Starting Google Compute Engine Guest Agent...
[    3.008398] dhclient[200]: Internet Systems Consortium DHCP Client 4.4.3
[    3.037251] dhclient[200]: Copyright 2004-2022 Internet Systems Consortium.
[    3.038673] dhclient[200]: Internet Systems Consortium DHCP Client 4.4.3
[    3.052303] dhclient[200]: Copyright 2004-2022 Internet Systems Consortium.
[    3.072323] dhclient[200]: All rights reserved.
[    3.085651] dhclient[200]: For info, please visit https://www.isc.org/software/dhcp/
[    3.099092] systemd[1]: Reached target Network is Online.
[    3.112372] dhclient[200]: All rights reserved.
[    3.128706] dhclient[202]: Cannot find device "eth0"
[    3.142400] systemd[1]: Starting Google Compute Engine Guest Agent...
[    3.144274] dhclient[200]: For info, please visit https://www.isc.org/software/dhcp/
[    3.149179] dhclient[200]: Failed to get interface index: No such device
[    3.162322] dhclient[200]: If you think you have received this message due to a bug rather
[    3.175637] dhclient[200]: than a configuration issue please read the section on submitting
[    3.188997] dhclient[200]: bugs on either our web page at www.isc.org or in the README file
[    3.202326] dhclient[200]: before submitting a bug.  These pages explain the proper
[    3.215642] dhclient[200]: process and the information we find helpful for debugging.
[    3.229026] dhclient[200]: exiting.
[    3.242462] google_guest_agent[204]: ERROR logger.go:75 Continuing without cloud logging due to error in initialization: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
[    3.259063] google_guest_agent[204]: GCE Agent Started (version )
[    3.272357] systemd[1]: Starting Google Compute Engine Shutdown Scripts...
[    3.285730] dhclient[200]:
[    3.299239] systemd-udevd[175]: Using default interface naming scheme 'v250'.
[    3.312442] dhclient[200]: Failed to get interface index: No such device
[    3.325691] dhclient[200]:
[    3.339044] dhclient[200]: If you think you have received this message due to a bug rather
[    3.352407] dhclient[200]: than a configuration issue please read the section on submitting
[    3.365742] dhclient[200]: bugs on either our web page at www.isc.org or in the README file
         Starting Google Compute Engine Shutdown Scripts...
[    3.380342] dhclient[200]: before submitting a bug.  These pages explain the proper
[    3.405660] dhclient[200]: process and the information we find helpful for debugging.
[    3.419062] systemd[1]: Starting Grows the filesystem and partition at /...
[    3.432379] dhclient[200]:
[    3.445840] dhclient[200]: exiting.
         Starting Grows the filesystem and partition at /...
[    3.473430] systemd[1]: Pacman keyring initialization was skipped because of a failed condition check (ConditionDirectoryNotEmpty=!/etc/pacman.d/gnupg).
[    3.489152] systemd[1]: SSH Key Generation was skipped because all trigger condition checks failed.
[    3.491411] systemd[1]: Starting User Login Management...
         Starting User Login Management...
         Starting Permit User Sessions...
[  OK  ] Started D-Bus System Message Bus.
[    3.520098] systemd[1]: Starting Permit User Sessions...
[    3.559133] systemd[1]: Started D-Bus System Message Bus.
[    3.572495] growpartfs[285]: NOCHANGE: partition 2 could only be grown by 2015 [fudge=2048]
[    3.585840] systemd-logind[301]: Watching system buttons on /dev/input/event0 (Power Button)
[    3.589034] systemd-logind[301]: Watching system buttons on /dev/input/event1 (Sleep Button)
[    3.590666] systemd-logind[301]: Watching system buttons on /dev/input/event3 (AT Translated Set 2 keyboard)
[    3.605690] systemd-logind[301]: New seat seat0.
[    3.619025] systemd[1]: dhclient@eth0.service: Main process exited, code=exited, status=1/FAILURE
[    3.622470] growpartfs[317]: resize2fs 1.46.5 (30-Dec-2021)
[    3.635765] systemd[1]: dhclient@eth0.service: Failed with result 'exit-code'.
[    3.649048] growpartfs[317]: The filesystem is already 2620672 (4k) blocks long.  Nothing to do!
[    3.662328] systemd[1]: Finished Google Compute Engine Shutdown Scripts.
[  OK  ] Finished Google Compute Engine Shutdown Scripts.
[  OK  ] Finished Grows the filesystem and partition at /.
[    3.690067] systemd[1]: growpartfs@-.service: Deactivated successfully.
[    3.715861] systemd[1]: Finished Grows the filesystem and partition at /.
[    3.729073] systemd[1]: Finished Permit User Sessions.
[  OK  ] Finished Permit User Sessions.
[    3.758052] dbus-daemon[196]: [system] Successfully activated service 'org.freedesktop.systemd1'
[    3.772527] systemd[1]: Started User Login Management.
[  OK  ] Started User Login Management.
[    3.811053] systemd[1]: Found device /dev/ttyS0.
[  OK  ] Found device /dev/ttyS0.
[    3.857518] systemd[1]: Started Getty on tty1.
[  OK  ] Started Getty on tty1.
[    3.884395] systemd[1]: Started Serial Getty on ttyS0.
[  OK  ] Started Serial Getty on ttyS0.
[    3.909931] systemd[1]: Reached target Login Prompts.
[  OK  ] Reached target Login Prompts.
[    6.188630] systemd[1]: Starting NSS cache refresh...
         Starting NSS cache refresh...
[    6.246150] google_oslogin_nss_cache[321]: oslogin_cache_refresh[321]: Refreshing passwd entry cache
[    6.259074] oslogin_cache_refresh[321]: Refreshing passwd entry cache
[    6.272372] oslogin_cache_refresh[321]: Failure getting users, quitting
[    6.285784] google_oslogin_nss_cache[321]: oslogin_cache_refresh[321]: Failure getting users, quitting
[    6.299065] google_oslogin_nss_cache[321]: oslogin_cache_refresh[321]: Failed to get users, not updating passwd cache file, removing /etc/oslogin_passwd.cache.bak.
[    6.312409] google_oslogin_nss_cache[321]: oslogin_cache_refresh[321]: Refreshing group entry cache
[    6.325724] google_oslogin_nss_cache[321]: oslogin_cache_refresh[321]: Failure getting groups, quitting
[    6.339018] google_oslogin_nss_cache[321]: oslogin_cache_refresh[321]: Failed to get groups, not updating group cache file, removing /etc/oslogin_group.cache.bak.
[  OK  ] Finished NSS cache refresh.
[    6.357123] systemd[1]: google-oslogin-cache.service: Deactivated successfully.
[    6.371113] oslogin_cache_refresh[321]: Failed to get users, not updating passwd cache file, removing /etc/oslogin_passwd.cache.bak.
[    6.385935] systemd[1]: Finished NSS cache refresh.
[    6.399105] oslogin_cache_refresh[321]: Refreshing group entry cache
[    6.412458] oslogin_cache_refresh[321]: Failure getting groups, quitting
[    6.425816] oslogin_cache_refresh[321]: Failed to get groups, not updating group cache file, removing /etc/oslogin_group.cache.bak.

Arch Linux 5.17.7-arch1-1 (ttyS0)

archlinux login: [   93.146996] systemd[1]: google-guest-agent.service: start operation timed out. Terminating.
[  108.149265] google_guest_agent[204]: CRITICAL main.go:322 error registering service: failed to shutdown within timeout 15s
[  108.152662] systemd[1]: google-guest-agent.service: Main process exited, code=exited, status=1/FAILURE
[  108.152744] systemd[1]: google-guest-agent.service: Failed with result 'timeout'.
[  108.152794] systemd[1]: Failed to start Google Compute Engine Guest Agent.
[  108.153403] systemd[1]: Starting Google Compute Engine Startup Scripts...
[  108.156038] systemd[1]: Started OpenSSH Daemon.
[  108.181912] sshd[323]: Server listening on 0.0.0.0 port 22.
[  108.182352] sshd[323]: Server listening on :: port 22.
[  108.398289] systemd[1]: google-guest-agent.service: Scheduled restart job, restart counter is at 1.
[  108.398597] systemd[1]: Stopped Google Compute Engine Guest Agent.
[  108.398663] systemd[1]: Starting Google Compute Engine Guest Agent...
[  108.409031] google_guest_agent[328]: ERROR logger.go:75 Continuing without cloud logging due to error in initialization: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
[  108.409169] google_guest_agent[328]: GCE Agent Started (version )
[  198.646939] systemd[1]: google-guest-agent.service: start operation timed out. Terminating.
[  213.649277] google_guest_agent[328]: CRITICAL main.go:322 error registering service: failed to shutdown within timeout 15s
[  213.652122] systemd[1]: google-guest-agent.service: Main process exited, code=exited, status=1/FAILURE
[  213.652400] systemd[1]: google-guest-agent.service: Failed with result 'timeout'.
[  213.652450] systemd[1]: Failed to start Google Compute Engine Guest Agent.
[  213.897376] systemd[1]: google-guest-agent.service: Scheduled restart job, restart counter is at 2.
[  213.897605] systemd[1]: Stopped Google Compute Engine Guest Agent.
[  213.899419] systemd[1]: Starting Google Compute Engine Guest Agent...
[  213.909456] google_guest_agent[378]: ERROR logger.go:75 Continuing without cloud logging due to error in initialization: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
[  213.909594] google_guest_agent[378]: GCE Agent Started (version )

First, the UEFI error: "UEFI: Failed to load image."

Second, the dhclient error: [ 3.128706] dhclient[202]: Cannot find device "eth0".

Third, the google-guest-agent error, which may be caused by the errors above.
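For anyone who wants to check their own instance for the same three symptoms, the serial log can be filtered for them. This is a rough sketch, not an official diagnostic: `scan_boot_log` and the labels it prints are made-up names, and the patterns are copied from the log lines quoted above, so other image versions may word them differently.

```shell
#!/bin/sh
# scan_boot_log  (hypothetical helper)
# Reads a serial console log on stdin and prints one label for each of
# the three failure signatures seen in the log above.
scan_boot_log() {
    awk '
        /UEFI: Failed to load image/                        { uefi = 1 }
        /dhclient\[[0-9]+\]: Cannot find device/            { dhcp = 1 }
        /Failed to start Google Compute Engine Guest Agent/ { agent = 1 }
        END {
            if (uefi)  print "uefi-boot-failed"
            if (dhcp)  print "dhclient-no-device"
            if (agent) print "guest-agent-failed"
        }
    '
}

# Example (INSTANCE and ZONE are placeholders; left commented out):
#   gcloud compute instances get-serial-port-output INSTANCE --zone=ZONE \
#       | scan_boot_log
```

An empty result only means none of these exact messages appeared; it does not prove the boot was healthy.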

toastwaffle commented 2 years ago

Could you share the configuration you used to generate that VM (preferably as a gcloud command)?

shuLhan commented 2 years ago

@toastwaffle

It seems the GVNIC feature causes this issue, because it requires a custom driver [1].

The gcloud command used to create the instance:

gcloud compute instances create {{.Val "host::name"}} \
    --zone={{.Val "gcloud::zone"}} \
    --image-project=arch-linux-gce \
    --image-family=arch \
    --custom-cpu=1 \
    --custom-memory=4GB \
    --metadata=block-project-ssh-keys=TRUE

Then I recreated the boot disk, this time enabling only specific guest OS features:

gcloud compute instances stop {{.Val "host::name"}} \
    --zone={{.Val "gcloud::zone"}}

gcloud compute instances detach-disk {{.Val "host::name"}} \
    --zone={{.Val "gcloud::zone"}} \
    --disk={{.Val "host::name"}}

gcloud compute disks delete {{.Val "host::name"}} \
    --zone={{.Val "gcloud::zone"}} \
    --quiet

gcloud compute disks create {{.Val "host::name"}} \
    --zone={{.Val "gcloud::zone"}} \
    --image-project=arch-linux-gce \
    --image-family=arch \
    --type=pd-ssd \
    --guest-os-features=VIRTIO_SCSI_MULTIQUEUE

gcloud compute instances attach-disk {{.Val "host::name"}} \
    --zone={{.Val "gcloud::zone"}} \
    --disk={{.Val "host::name"}} \
    --boot

gcloud compute instances start {{.Val "host::name"}} \
    --zone={{.Val "gcloud::zone"}}

Now the VM responds to ping.

[1] https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux#manual-configuration

shuLhan commented 2 years ago

Another thing I found: the UUID in fstab is empty.

$ cat /etc/fstab
# Static information about the filesystems.
# See fstab(5) for details.

# <file system> <dir> <type> <options> <dump> <pass>
# LABEL=root
UUID=                       /           ext4        defaults    0 1

The blkid output:

$ blkid
/dev/sda2: LABEL="root" UUID="efd52710-9bb7-4ac8-a3a8-7a6f6df6ee4b" BLOCK_SIZE="4096" TYPE="ext4" PARTLABEL="root" PARTUUID="f5596780-85e9-c44c-92ee-3c8ecda0c289"
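An empty identifier like this is easy to miss, so it may be worth checking for it automatically after a build. This is a minimal sketch under the assumption that fstab entries are whitespace-separated with the device spec in the first field; `fstab_empty_ids` is a hypothetical helper name.

```shell
#!/bin/sh
# fstab_empty_ids  (hypothetical helper)
# Reads fstab content on stdin and prints every non-comment entry whose
# first field is a bare "UUID=", "LABEL=", "PARTUUID=" or "PARTLABEL="
# with no value. Exit status is 0 if any were found, 1 otherwise.
fstab_empty_ids() {
    awk '
        $1 ~ /^#/ { next }
        $1 ~ /^(UUID|LABEL|PARTUUID|PARTLABEL)=$/ { print; found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example:
#   fstab_empty_ids < /etc/fstab && echo "fstab has an empty identifier"
```

The value to fill in would be the UUID reported by blkid for the root partition.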

tashian commented 2 years ago

+1

I'm trying to enable Secure Boot with this image (is it supported?) and am getting the same UEFI error.

I created the instance with:

gcloud compute instances create arch-test \
   --project=carl-test-xxxx \
   --zone=us-west1-c \
   --machine-type=e2-medium \
   --network-interface=network-tier=PREMIUM,subnet=default \
   --maintenance-policy=MIGRATE \
   --provisioning-model=STANDARD \
   --service-account=xxxx-compute@developer.gserviceaccount.com \
   --scopes=https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append \
   --image-project=arch-linux-gce --image-family=arch \
   --shielded-secure-boot \
   --shielded-vtpm \
   --shielded-integrity-monitoring \
   --reservation-affinity=any

The serial console output is:


UEFI: Failed to load image.
Description: UEFI Google PersistentDisk
FilePath: PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0)
OptionNumber: 1.
Status: Not Found.

BdsDxe: failed to load Boot0001 "UEFI Google PersistentDisk " from PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0): Not Found

UEFI: Failed to load image.
Description: UEFI Google PersistentDisk
FilePath: PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0)
OptionNumber: 1.
Status: Not Found.

If I use --no-shielded-secure-boot it will still print this error, then fall back on the BIOS and boot that way.

shuLhan commented 2 years ago

@tashian

AFAIK, Secure Boot requires an EFI system partition and specific Linux kernel compilation flags [1][2][3]. Neither is provided by this image or by the default Arch Linux kernel.

[1] https://cloud.google.com/compute/shielded-vm/docs/shielded-vm#secure-boot
[2] https://github.com/GoogleCloudPlatform/compute-archlinux-image-builder/blob/16732c48923294a15b845159c7249d77b984c247/build-arch-gce#L43
[3] https://bugs.archlinux.org/task/53864
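To see whether a given disk has an EFI system partition at all, the partition type can be inspected from inside the VM. This is a sketch under assumptions: it assumes a GPT disk (where the EFI system partition has the type GUID c12a7328-f81f-11d2-ba4b-00a0c93ec93b) and a util-linux `lsblk` with the PARTTYPE column; `has_esp` and `booted_via_uefi` are hypothetical helper names.

```shell
#!/bin/sh
# GPT partition type GUID of an EFI System Partition (ESP).
ESP_GUID="c12a7328-f81f-11d2-ba4b-00a0c93ec93b"

# has_esp  (hypothetical helper)
# Reads "lsblk -n -o NAME,PARTTYPE" output on stdin and succeeds if any
# partition carries the ESP type GUID (compared case-insensitively).
has_esp() {
    grep -qi "$ESP_GUID"
}

# booted_via_uefi [SYSFS_ROOT]  (hypothetical helper)
# Succeeds if the running kernel was started by UEFI firmware, which
# exposes the firmware/efi directory under sysfs (default root: /sys).
booted_via_uefi() {
    [ -d "${1:-/sys}/firmware/efi" ]
}

# Example:
#   lsblk -n -o NAME,PARTTYPE | has_esp || echo "no EFI system partition"
#   booted_via_uefi || echo "booted via legacy BIOS"
```

On the instance from the log above, the second check would presumably report a legacy BIOS boot, since the firmware fell back to SeaBIOS.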

shuLhan commented 2 years ago

@toastwaffle

I tried to reproduce this by building the image directly, and it turns out the image works as expected.

 21:30:04 ~/src/compute-archlinux-image-builder
(ins) 1 $ awwan local gcloud-test.aww 11 21
--- BaseDir: /home/ms
>>> loading "/home/ms/src/compute-archlinux-image-builder/awwan.env" ...
--- require 1: gcloud config configurations activate personal
Activated [personal].

>>> local:  11: gcloud [compute images list --no-standard-images]
NAME: arch-v20220630
PROJECT: arch-builder
FAMILY:
DEPRECATED:
STATUS: READY

>>> local:  13: gcloud [compute instances create arch-test --zone=asia-southeast1-b --image=arch-v20220630 --machine-type=f1-micro]
Created [https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/instances/arch-test].
NAME: arch-test
ZONE: asia-southeast1-b
MACHINE_TYPE: f1-micro
PREEMPTIBLE:
INTERNAL_IP: 10.148.0.6
EXTERNAL_IP: 34.143.255.178
STATUS: RUNNING

>>> local:  18: gcloud [compute instances describe arch-test --zone=asia-southeast1-b]
canIpForward: false
cpuPlatform: Intel Broadwell
creationTimestamp: '2022-06-30T07:30:23.752-07:00'
deletionProtection: false
disks:
- autoDelete: true
  boot: true
  deviceName: persistent-disk-0
  diskSizeGb: '10'
  guestOsFeatures:
  - type: VIRTIO_SCSI_MULTIQUEUE
  - type: UEFI_COMPATIBLE
  - type: GVNIC
  index: 0
  interface: SCSI
  kind: compute#attachedDisk
  mode: READ_WRITE
  source: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/disks/arch-test
  type: PERSISTENT
fingerprint: eEqeMjLabj4=
id: '6196518194913480080'
kind: compute#instance
labelFingerprint: 42WmSpB8rSM=
lastStartTimestamp: '2022-06-30T07:30:31.626-07:00'
machineType: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/machineTypes/f1-micro
metadata:
  fingerprint: ab_GBZOinXM=
  kind: compute#metadata
name: arch-test
networkInterfaces:
- accessConfigs:
  - kind: compute#accessConfig
    name: external-nat
    natIP: 34.143.255.178
    networkTier: PREMIUM
    type: ONE_TO_ONE_NAT
  fingerprint: fHvIi78m7Hc=
  kind: compute#networkInterface
  name: nic0
  network: https://www.googleapis.com/compute/v1/projects/arch-builder/global/networks/default
  networkIP: 10.148.0.6
  stackType: IPV4_ONLY
  subnetwork: https://www.googleapis.com/compute/v1/projects/arch-builder/regions/asia-southeast1/subnetworks/default
scheduling:
  automaticRestart: true
  onHostMaintenance: MIGRATE
  preemptible: false
  provisioningModel: STANDARD
selfLink: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/instances/arch-test
serviceAccounts:
- email: 580833646678-compute@developer.gserviceaccount.com
  scopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring.write
  - https://www.googleapis.com/auth/pubsub
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/trace.append
shieldedInstanceConfig:
  enableIntegrityMonitoring: true
  enableSecureBoot: false
  enableVtpm: true
shieldedInstanceIntegrityPolicy:
  updateAutoLearnPolicy: true
startRestricted: false
status: RUNNING
tags:
  fingerprint: 42WmSpB8rSM=
zone: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b

 21:30:39 ~/src/compute-archlinux-image-builder
(ins) 1 $ awwan local gcloud-test.aww 21
--- BaseDir: /home/ms
>>> loading "/home/ms/src/compute-archlinux-image-builder/awwan.env" ...
--- require 1: gcloud config configurations activate personal
Activated [personal].

>>> local:  21: gcloud [compute ssh --zone=asia-southeast1-b --command=cat /etc/fstab; timedatectl show-timesync; locale arch-test]
# LABEL=root
UUID=066f95b8-9d59-4501-9b84-bc28b2643580       /               ext4            defaults        0 1

SystemNTPServers=metadata.google.internal
FallbackNTPServers=0.arch.pool.ntp.org 1.arch.pool.ntp.org 2.arch.pool.ntp.org 3.arch.pool.ntp.org
ServerName=metadata.google.internal
ServerAddress=169.254.169.254
RootDistanceMaxUSec=5s
PollIntervalMinUSec=32s
PollIntervalMaxUSec=34min 8s
PollIntervalUSec=2min 8s
NTPMessage={ Leap=0, Version=4, Mode=4, Stratum=2, Precision=-20, RootDelay=0, RootDispersion=91us, Reference=474F4F47, OriginateTimestamp=Thu 2022-06-30 14:32:42 UTC, ReceiveTimestamp=Thu 2022-06-30 14:32:42 UTC, TransmitTimestamp=Thu 2022-06-30 14:32:42 UTC, DestinationTimestamp=Thu 2022-06-30 14:32:42 UTC, Ignored=no, PacketCount=3, Jitter=46us }
Frequency=14953
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

 21:34:06 ~/src/compute-archlinux-image-builder
(ins) 1 $

Then I tried recreating the instance from the public image in the arch-linux-gce project.

 21:38:04 ~/src/compute-archlinux-image-builder
(ins) 1 $ awwan local gcloud-public-images-test.aww 3 9
--- BaseDir: /home/ms
>>> loading "/home/ms/src/compute-archlinux-image-builder/awwan.env" ...
--- require 1: gcloud config configurations activate personal
Activated [personal].

>>> local:   3: gcloud [compute instances create arch-test --zone=asia-southeast1-b --image-project=arch-linux-gce --image-family=arch --machine-type=f1-micro]
Created [https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/instances/arch-test].
NAME: arch-test
ZONE: asia-southeast1-b
MACHINE_TYPE: f1-micro
PREEMPTIBLE:
INTERNAL_IP: 10.148.0.7
EXTERNAL_IP: 34.143.255.178
STATUS: RUNNING

>>> local:   9: gcloud [compute instances describe arch-test --zone=asia-southeast1-b]
canIpForward: false
cpuPlatform: Intel Broadwell
creationTimestamp: '2022-06-30T07:39:50.558-07:00'
deletionProtection: false
disks:
- autoDelete: true
  boot: true
  deviceName: persistent-disk-0
  diskSizeGb: '10'
  guestOsFeatures:
  - type: VIRTIO_SCSI_MULTIQUEUE
  - type: UEFI_COMPATIBLE
  - type: GVNIC
  index: 0
  interface: SCSI
  kind: compute#attachedDisk
  mode: READ_WRITE
  source: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/disks/arch-test
  type: PERSISTENT
fingerprint: -yQdgmdb7OI=
id: '4903633301086931802'
kind: compute#instance
labelFingerprint: 42WmSpB8rSM=
lastStartTimestamp: '2022-06-30T07:40:08.573-07:00'
machineType: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/machineTypes/f1-micro
metadata:
  fingerprint: ab_GBZOinXM=
  kind: compute#metadata
name: arch-test
networkInterfaces:
- accessConfigs:
  - kind: compute#accessConfig
    name: external-nat
    natIP: 34.143.255.178
    networkTier: PREMIUM
    type: ONE_TO_ONE_NAT
  fingerprint: xwkuHtTnICs=
  kind: compute#networkInterface
  name: nic0
  network: https://www.googleapis.com/compute/v1/projects/arch-builder/global/networks/default
  networkIP: 10.148.0.7
  stackType: IPV4_ONLY
  subnetwork: https://www.googleapis.com/compute/v1/projects/arch-builder/regions/asia-southeast1/subnetworks/default
scheduling:
  automaticRestart: true
  onHostMaintenance: MIGRATE
  preemptible: false
  provisioningModel: STANDARD
selfLink: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/instances/arch-test
serviceAccounts:
- email: 580833646678-compute@developer.gserviceaccount.com
  scopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring.write
  - https://www.googleapis.com/auth/pubsub
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/trace.append
shieldedInstanceConfig:
  enableIntegrityMonitoring: true
  enableSecureBoot: false
  enableVtpm: true
shieldedInstanceIntegrityPolicy:
  updateAutoLearnPolicy: true
startRestricted: false
status: RUNNING
tags:
  fingerprint: 42WmSpB8rSM=
zone: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b

 21:40:10 ~/src/compute-archlinux-image-builder
(ins) 1 $ awwan local gcloud-public-images-test.aww 12
--- BaseDir: /home/ms
>>> loading "/home/ms/src/compute-archlinux-image-builder/awwan.env" ...
--- require 1: gcloud config configurations activate personal
Activated [personal].

>>> local:  12: gcloud [compute ssh --zone=asia-southeast1-b --command=cat /etc/fstab; timedatectl show-timesync; locale arch-test]
Warning: Permanently added 'compute.4903633301086931802' (ED25519) to the list of known hosts.
# Static information about the filesystems.
# See fstab(5) for details.

# <file system> <dir> <type> <options> <dump> <pass>
# LABEL=root
UUID=                           /               ext4            defaults        0 1

SystemNTPServers=metadata.google.internal
FallbackNTPServers=0.arch.pool.ntp.org 1.arch.pool.ntp.org 2.arch.pool.ntp.org 3.arch.pool.ntp.org
ServerName=metadata.google.internal
ServerAddress=169.254.169.254
RootDistanceMaxUSec=5s
PollIntervalMinUSec=32s
PollIntervalMaxUSec=34min 8s
PollIntervalUSec=1min 4s
NTPMessage={ Leap=0, Version=4, Mode=4, Stratum=2, Precision=-20, RootDelay=0, RootDispersion=76us, Reference=474F4F47, OriginateTimestamp=Thu 2022-06-30 14:41:16 UTC, ReceiveTimestamp=Thu 2022-06-30 14:41:16 UTC, TransmitTimestamp=Thu 2022-06-30 14:41:16 UTC, DestinationTimestamp=Thu 2022-06-30 14:41:16 UTC, Ignored=no, PacketCount=2, Jitter=1.177ms }
Frequency=1595580
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

 21:42:19 ~/src/compute-archlinux-image-builder
(ins) 1 $

The VM can now boot successfully, but the UUID of the root entry in fstab is empty. Maybe the public image was not built cleanly?
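
A quick way to spot an entry like the `UUID=` line in the output above is to look for fstab rows whose first field is a bare `UUID=`. This is only a minimal sketch; the function name `empty_uuid_entries` is made up for illustration and is not part of the image builder.

```shell
# Hypothetical helper: print fstab entries whose UUID= value is empty.
# $1 is the fstab file to inspect (defaults to /etc/fstab).
empty_uuid_entries() {
    # awk splits on whitespace, so an empty UUID leaves the first
    # field as exactly "UUID=". Comment lines start with "#" and
    # therefore never match.
    awk '$1 == "UUID=" { print }' "${1:-/etc/fstab}"
}
```

Running `empty_uuid_entries` on the instance should print nothing when every UUID is filled in; the values can then be compared against `lsblk -o NAME,UUID`.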

lcastelli commented 2 years ago

I've reworked the image script to be UEFI-based and uploaded a new image. Can you check whether everything works as intended now?

shuLhan commented 2 years ago

@lcastelli

Thanks, I think it's working as expected now.

 23:33:33 ~/src/compute-archlinux-image-builder
(ins) 1 $ awwan local gcloud-test-image-official.aww 5 11
--- BaseDir: /home/ms
>>> loading "/home/ms/src/compute-archlinux-image-builder/awwan.env" ...
--- require 3: gcloud config configurations activate personal
Activated [personal].

>>> local:   5: gcloud [compute instances create arch-test --zone=asia-southeast1-b --image-project=arch-linux-gce --image-family=arch --machine-type=f1-micro]
Created [https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/instances/arch-test].
NAME: arch-test
ZONE: asia-southeast1-b
MACHINE_TYPE: f1-micro
PREEMPTIBLE:
INTERNAL_IP: 10.148.0.9
EXTERNAL_IP: 34.143.255.178
STATUS: RUNNING

>>> local:  11: gcloud [compute instances describe arch-test --zone=asia-southeast1-b]
canIpForward: false
cpuPlatform: Intel Broadwell
creationTimestamp: '2022-06-30T09:34:03.074-07:00'
deletionProtection: false
disks:
- autoDelete: true
  boot: true
  deviceName: persistent-disk-0
  diskSizeGb: '10'
  guestOsFeatures:
  - type: GVNIC
  - type: UEFI_COMPATIBLE
  - type: VIRTIO_SCSI_MULTIQUEUE
  index: 0
  interface: SCSI
  kind: compute#attachedDisk
  mode: READ_WRITE
  source: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/disks/arch-test
  type: PERSISTENT
fingerprint: WTaZqnfIbKU=
id: '841508220996447893'
kind: compute#instance
labelFingerprint: 42WmSpB8rSM=
lastStartTimestamp: '2022-06-30T09:34:18.681-07:00'
machineType: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/machineTypes/f1-micro
metadata:
  fingerprint: ab_GBZOinXM=
  kind: compute#metadata
name: arch-test
networkInterfaces:
- accessConfigs:
  - kind: compute#accessConfig
    name: external-nat
    natIP: 34.143.255.178
    networkTier: PREMIUM
    type: ONE_TO_ONE_NAT
  fingerprint: Rv_wdAsg16U=
  kind: compute#networkInterface
  name: nic0
  network: https://www.googleapis.com/compute/v1/projects/arch-builder/global/networks/default
  networkIP: 10.148.0.9
  stackType: IPV4_ONLY
  subnetwork: https://www.googleapis.com/compute/v1/projects/arch-builder/regions/asia-southeast1/subnetworks/default
scheduling:
  automaticRestart: true
  onHostMaintenance: MIGRATE
  preemptible: false
  provisioningModel: STANDARD
selfLink: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b/instances/arch-test
serviceAccounts:
- email: 580833646678-compute@developer.gserviceaccount.com
  scopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring.write
  - https://www.googleapis.com/auth/pubsub
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/trace.append
shieldedInstanceConfig:
  enableIntegrityMonitoring: true
  enableSecureBoot: false
  enableVtpm: true
shieldedInstanceIntegrityPolicy:
  updateAutoLearnPolicy: true
startRestricted: false
status: RUNNING
tags:
  fingerprint: 42WmSpB8rSM=
zone: https://www.googleapis.com/compute/v1/projects/arch-builder/zones/asia-southeast1-b

 23:38:04 ~/src/compute-archlinux-image-builder
(ins) 1 $ awwan local gcloud-test-image-official.aww 14
--- BaseDir: /home/ms
>>> loading "/home/ms/src/compute-archlinux-image-builder/awwan.env" ...
--- require 3: gcloud config configurations activate personal
Activated [personal].

>>> local:  14: gcloud [compute ssh --zone=asia-southeast1-b --command=lsblk -o NAME,UUID,MOUNTPOINTS; cat /etc/fstab; timedatectl show-timesync; localectl arch-test]
NAME   UUID                                 MOUNTPOINTS
sda
|-sda1 749A-1CB7                            /efi
`-sda2 d9c5244b-53c1-4f3c-9d7c-5b7068cb6868 /
# Static information about the filesystems.
# See fstab(5) for details.

# <file system> <dir> <type> <options> <dump> <pass>
# LABEL=root
UUID=d9c5244b-53c1-4f3c-9d7c-5b7068cb6868       /               ext4            rw,discard,errors=remount-ro    0 1

# LABEL=efi
UUID=749A-1CB7                  /efi            vfat            uid=root,gid=root,umask=022,showexec    0 0

SystemNTPServers=metadata.google.internal
FallbackNTPServers=0.arch.pool.ntp.org 1.arch.pool.ntp.org 2.arch.pool.ntp.org 3.arch.pool.ntp.org
ServerName=metadata.google.internal
ServerAddress=169.254.169.254
RootDistanceMaxUSec=5s
PollIntervalMinUSec=32s
PollIntervalMaxUSec=34min 8s
PollIntervalUSec=2min 8s
NTPMessage={ Leap=0, Version=4, Mode=4, Stratum=2, Precision=-20, RootDelay=0, RootDispersion=91us, Reference=474F4F47, OriginateTimestamp=Thu 2022-06-30 16:36:31 UTC, ReceiveTimestamp=Thu 2022-06-30 16:36:31 UTC, TransmitTimestamp=Thu 2022-06-30 16:36:31 UTC, DestinationTimestamp=Thu 2022-06-30 16:36:31 UTC, Ignored=no, PacketCount=3, Jitter=610us }
Frequency=-251587
   System Locale: LANG=en_US.UTF-8
       VC Keymap: n/a
      X11 Layout: n/a

UPDATE: serial port output:

CSM BBS Table full.
BdsDxe: loading Boot0001 "UEFI Google PersistentDisk " from PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0)
BdsDxe: starting Boot0001 "UEFI Google PersistentDisk " from PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0)

UEFI: Attempting to start image.
Description: UEFI Google PersistentDisk
FilePath: PciRoot(0x0)/Pci(0x3,0x0)/Scsi(0x1,0x0)
OptionNumber: 1.

error: no suitable video mode found.
  Booting `Linux'

Loading Linux linux ...
Loading initial ramdisk ...
error: no suitable video mode found.
Booting in blind mode
:: performing fsck on '/dev/sda2'
root: clean, 39837/636480 files, 394403/2544128 blocks
:: mounting '/dev/sda2' on real root

Welcome to Arch Linux!

BTW, does the Arch kernel support GVNIC?

lcastelli commented 2 years ago

Yes, gVNIC should be supported out of the box. Let me know if you hit any issue.
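
If you want to confirm it on a running instance, here is a rough sketch. It assumes the kernel exposes its build configuration at `/proc/config.gz` and relies on the upstream gVNIC driver being controlled by the `CONFIG_GVE` option; the function name `gve_config_state` is only illustrative.

```shell
# Hypothetical helper: classify gVNIC (gve) driver support from
# kernel config text read on stdin.
gve_config_state() {
    case "$(grep -E '^CONFIG_GVE=' | head -n 1)" in
        CONFIG_GVE=m) echo module ;;    # built as a loadable module
        CONFIG_GVE=y) echo builtin ;;   # compiled into the kernel
        *)            echo disabled ;;  # option missing or not set
    esac
}

# Usage on the instance (assumes /proc/config.gz is present):
#   zcat /proc/config.gz | gve_config_state
```

A result of `module` or `builtin` means the driver is available to that kernel; `modinfo gve` is another way to check when the driver is built as a module.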

shuLhan commented 2 years ago

OK, thanks @lcastelli. Closing this issue now. I will open a separate issue if I find any problems.

Deepansharora27 commented 1 year ago

Does enabling UEFI, or any other flag such as GVNIC, on a public image change the structure of an existing image?

toastwaffle commented 1 year ago

No, changes to the public images have no effect on any existing virtual machines or their disk images. When you create a new virtual machine from a public image, a full copy of that image is made to use as the VM's disk; the public image is not referred to again.

Deepansharora27 commented 1 year ago

@toastwaffle One further question: I am downloading a custom Ubuntu LTS image and deploying it in two places. First, I boot an instance from it in the QEMU emulator. Second, I import the same image into GCP and start an instance from it there as well. On both instances I install tpm2-tools to read the PCR values, and I have noticed that the PCR values on QEMU and GCP are entirely different, even though it is the same VM image.

Shouldn't the reported PCR values be the same on both QEMU and GCP, since we are essentially using the same VM image?
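
For reference, this is roughly how the two readings can be compared. It is only a sketch: it assumes each dump was saved on its instance with something like `tpm2_pcrread sha256:0,1,2,3,4,5,6,7 > file`, that the output has one `index : value` line per PCR, and that each file contains a single bank. The function name `pcr_diff` and the file names are made up.

```shell
# Hypothetical helper: list the PCR indices whose values differ
# between two saved dumps ($1 and $2).
pcr_diff() {
    awk -F' *: *' '
        # Only consider lines of the form "<index> : <value>".
        NF == 2 && $1 ~ /^ *[0-9]+$/ {
            gsub(/ /, "", $1)              # strip indentation from the index
            if (FNR == NR) first[$1] = $2  # first file: remember the value
            else if (first[$1] != $2) print "PCR " $1 " differs"
        }' "$1" "$2"
}

# Usage (file names are placeholders):
#   pcr_diff qemu-pcrs.txt gcp-pcrs.txt
```

Knowing which indices differ narrows down which part of the boot chain was measured differently on the two platforms.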

Deepansharora27 commented 1 year ago

My reasoning was that when I import the image into GCP, I enable the UEFI option through the feature flag, and maybe that is causing the difference in PCR values.

But I guess that is not the case, so I am curious: why am I getting different PCR values?