OSInside / kiwi

KIWI - Appliance Builder Next Generation
https://osinside.github.io/kiwi
GNU General Public License v3.0

Add new builder for enclaves #2586

Open schaefi opened 1 month ago

schaefi commented 1 month ago

Add a new EnclaveBuilder class which allows building initrd-only image types. The first enclave implementation covers aws-nitro images produced via the eif-cli tooling.

This pull request is open for review but not yet ready for merge because of missing dependencies. The following tasks must be completed prior to merge:

I'd like to gather feedback at an early stage of the implementation. To build the provided integration test, check out the git repository and switch to the nitro_enclaves branch. From the branch, call:

sudo poetry run kiwi-ng system build --description build-tests/x86/tumbleweed/test-image-nitro-enclave/ --target-dir /tmp/mytest --set-repo http://download.opensuse.org/tumbleweed/repo/oss

The results produced by the builder currently consist only of the initrd and the kernel, plus a warning message indicating that the eif-cli command line tool is missing.

Feedback welcome. Thanks in advance

schaefi commented 1 month ago

Questions from my side:

Conan-Kudo commented 1 month ago

I'm working through packaging the tooling in Fedora, currently there are some issues with the crate dependencies that need fixing.

agraf commented 2 weeks ago

Oh, I didn't see the questions before! Let me reply :)

How is the entry point controlled, meaning how do we specify which command is actually invoked and connected to the vsock?

I think for typical kiwi builds, we should just rely on systemd services. So if you want to spawn your own daemon on boot, you just add it as a systemd service and mark it as enabled. Recent versions of systemd also provide a vsock proxy to make it easy to run sshd on vsock: client, server

I'm aiming for a more useful integration test; any idea for a good and simple setup would be great. If we can't find a good example, let's try to make the integration in a way that sshd gets connected to a vsock such that we can debug a shell to the enclave. This will be a relatively big system though

We're working on a QEMU implementation of Nitro Enclaves that should allow us to run the integration test as part of a CI loop anywhere. If we rely on Nitro Enclaves in EC2, you would need to execute the test in EC2 which may be prohibitive for a sensible test loop.

I think an example kiwi file that enables sshd using the proxy above would be ideal, yes. That example file will then also provide a great template for people who want to start from a larger, full-blown OS Enclave.

agraf commented 1 week ago

Using the QEMU fork above, I was able to run a Nitro Enclave using this patch. To do that, I had to manually run eif_build

$ eif_build --kernel *kernel --ramdisk *initrd.xz --cmdline 'reboot=k panic=30 pci=off console=ttyS0 i8042.noaux i8042.nomux i8042.nopnp i8042.dumbkbd random.trust_cpu=on rdinit=/sbin/init' --output img.eif

to create an EIF. The command line above should be part of the default kiwi file. We need rdinit= because the kiwi-generated initramfs contains no /init binary. We want to use systemd, so we have to point the kernel to systemd, which resides at /sbin/init.
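For the kiwi integration, the eif_build call above could be assembled from Python roughly as follows. This is only a sketch: the helper name `eif_build_command` and the constant `DEFAULT_CMDLINE` are hypothetical and not part of kiwi, and the helper expects already resolved file paths instead of the shell globs used above. Only the flags that appear in the command above are used.

```python
from typing import List

# Kernel command line copied from the eif_build call above.
DEFAULT_CMDLINE = (
    "reboot=k panic=30 pci=off console=ttyS0 "
    "i8042.noaux i8042.nomux i8042.nopnp i8042.dumbkbd "
    "random.trust_cpu=on rdinit=/sbin/init"
)


def eif_build_command(
    kernel: str, initrd: str, output: str, cmdline: str = DEFAULT_CMDLINE
) -> List[str]:
    """Return the argument vector for eif_build (hypothetical helper)."""
    return [
        "eif_build",
        "--kernel", kernel,
        "--ramdisk", initrd,
        "--cmdline", cmdline,  # passed as one argument, no shell quoting needed
        "--output", output,
    ]
```

The returned list can be handed to `subprocess.run(..., check=True)`. Passing a list rather than a shell string avoids quoting problems with the multi-word kernel command line.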

After that, I could run the Enclave in QEMU:

$ qemu-system-x86_64 -M nitro-enclave,vsock=c -nographic -kernel ~/kiwi/img/img2.eif -m 4G -chardev socket,id=c,path=/tmp/vhost4.socket
[...]
Welcome to openSUSE Tumbleweed!
[...]
localhost login:

What is still missing?

  1. A vsock ping on 3:9000. We basically need a small startup service that invokes echo -en '\xb7' | socat - VSOCK-CONNECT:3:9000 on boot so that the Nitro Enclave tooling knows the VM is up.
  2. Integration of the eif_build step in kiwi
  3. Provide a default kernel command line in the kiwi template
  4. Automatically load the virtio_mmio kernel module on boot
  5. An easy way to provide an ssh vsock proxy maybe? Not really necessary, but nice to have for debugging.
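
Item 1 could also be done without socat. The following is a minimal Python sketch of the same vsock ping, assuming that connecting to 3:9000 and sending the single byte is all the startup service has to do. The names `send_alive` and `sock_factory` are made up for illustration, and `socket.AF_VSOCK` requires Linux and Python 3.7 or newer.

```python
import socket

PARENT_CID = 3        # CID from the socat call above
ALIVE_PORT = 9000     # port from the socat call above
ALIVE_BYTE = b"\xb7"  # payload from `echo -en '\xb7'`


def send_alive(sock_factory=None) -> None:
    """Connect to 3:9000 over vsock and send the one-byte alive signal.

    sock_factory is a hypothetical hook that lets the function be
    exercised without a real vsock device.
    """
    if sock_factory is None:
        def sock_factory():
            return socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    with sock_factory() as sock:
        sock.connect((PARENT_CID, ALIVE_PORT))
        sock.sendall(ALIVE_BYTE)
```

A systemd service enabled in the image could call `send_alive()` once on boot.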

agraf commented 1 week ago

I've also spotted a minor bug in our Linux kernel loader that makes it reject the openSUSE kernel because of its appended signature. I'll fix this in EC2. Please test with the QEMU target above meanwhile.

agraf commented 1 week ago

I've built a small "I'm alive" package here: https://build.opensuse.org/package/show/home:agraf/nitro-enclave-alive

agraf commented 1 week ago

I've fiddled a bit with the code and extended it to a point where I can successfully build a Nitro Enclave .eif file through kiwi: https://github.com/agraf/kiwi/commit/02deb8c36562bbe4d5fa4d942ddbee1aa311b348

It doesn't make ssh listen on vsock yet, but basic functionality is there. It also requires a tiny bug fix to EC2 which is not deployed to production systems yet. Find a boot log below:

$ sudo nitro-cli run-enclave --debug-mode --memory 4096 --eif-path ~/kiwi/img/*eif --cpu-count 2 --attach-console
Start allocating memory...
Started enclave with enclave-cid: 16, memory: 4096 MiB, cpu-ids: [1, 9]
{
  "EnclaveName": "kiwi-test-image-nitro-enclave.x86_64-1.1.1",
  "EnclaveID": "i-07d7ec2d26b1374ab-enc1916cd3ea6dfa1f",
  "ProcessID": 30872,
  "EnclaveCID": 16,
  "NumberOfCPUs": 2,
  "CPUIDs": [
    1,
    9
  ],
  "MemoryMiB": 4096
}
Connecting to the console for enclave 16...
Successfully connected to the console.
 T1] ledtrig-cpu: registered to indicate activity on CPUs
[   12.260223][    T1] hid: raw HID events driver (C) Jiri Kosina
[   12.260790][    T1] drop_monitor: Initializing network drop monitor service
[   12.261501][    T1] NET: Registered PF_INET6 protocol family
[   12.268032][    T1] Segment Routing with IPv6
[   12.268430][    T1] RPL Segment Routing with IPv6
[   12.268859][    T1] In-situ OAM (IOAM) with IPv6
[   12.269722][    T1] IPI shorthand broadcast: enabled
[   12.272098][    T1] sched_clock: Marking stable (12206672713, 63608442)->(12268614590, 1666565)
[   12.273004][    T1] Timer migration: 1 hierarchy levels; 8 children per group; 1 crossnode level
[   12.273846][    T1] registered taskstats version 1
[   12.274451][    T1] Loading compiled-in X.509 certificates
[   12.274967][    T1] Loaded X.509 cert 'openSUSE Secure Boot Signkey: fd9f2c12e599d67cc7f9067541adf426b712469e'
[   12.278208][    T1] Demotion targets for Node 0: null
[   12.278665][    T1] page_owner is disabled
[   12.279106][    T1] Key type .fscrypt registered
[   12.279522][    T1] Key type fscrypt-provisioning registered
[   12.302533][    T1] Key type encrypted registered
[   12.302962][    T1] AppArmor: AppArmor sha256 policy hashing enabled
[   12.303531][    T1] ima: No TPM chip found, activating TPM-bypass!
[   12.304123][    T1] Loading compiled-in module X.509 certificates
[   12.304680][    T1] Loaded X.509 cert 'openSUSE Secure Boot Signkey: fd9f2c12e599d67cc7f9067541adf426b712469e'
[   12.305552][    T1] ima: Allocated hash algorithm: sha256
[   12.306039][    T1] ima: No architecture policies found
[   12.306524][    T1] evm: Initialising EVM extended attributes:
[   12.307076][    T1] evm: security.selinux
[   12.307435][    T1] evm: security.SMACK64 (disabled)
[   12.307875][    T1] evm: security.SMACK64EXEC (disabled)
[   12.308349][    T1] evm: security.SMACK64TRANSMUTE (disabled)
[   12.308855][    T1] evm: security.SMACK64MMAP (disabled)
[   12.309325][    T1] evm: security.apparmor
[   12.309690][    T1] evm: security.ima
[   12.310018][    T1] evm: security.capability
[   12.310411][    T1] evm: HMAC attrs: 0x1
[   12.438723][    T1] PM:   Magic number: 0:110:269243
[   12.443592][    T1] RAS: Correctable Errors collector initialized.
[   12.456012][    T1] clk: Disabling unused clocks
[   12.456436][    T1] PM: genpd: Disabling unused power domains
[   12.459229][    T1] Freeing unused decrypted memory: 2028K
[   12.461040][    T1] Freeing unused kernel image (initmem) memory: 4244K
[   12.461401][    T1] Write protecting the kernel read-only data: 30720k
[   12.462559][    T1] Freeing unused kernel image (rodata/data gap) memory: 1204K
[   12.462917][    T1] Run /sbin/init as init process
[   12.463185][    T1] Not activating Mandatory Access Control as /sbin/tomoyo-init does not exist.
[   12.537294][    T1] systemd[1]: systemd 256.4+suse.6.g5bba1ebe17 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA -SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBCRYPTSETUP_PLUGINS +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK -XKBCOMMON -UTMP +SYSVINIT +LIBARCHIVE)
[   12.538901][    T1] systemd[1]: Detected virtualization kvm.
[   12.539196][    T1] systemd[1]: Detected architecture x86-64.

Welcome to openSUSE Tumbleweed!

[   12.540086][    T1] systemd[1]: No hostname configured, using default hostname.
[   12.540548][    T1] systemd[1]: Hostname set to <localhost>.
[   12.540918][    T1] systemd[1]: Initializing machine ID from random generator.
[   12.683485][    T1] systemd[1]: bpf-restrict-fs: LSM BPF program attached
[   12.739778][    T1] systemd[1]: Queued start job for default target Graphical Interface.
[   12.764519][    T1] systemd[1]: Created slice Slice /system/getty.
[  OK  ] Created slice Slice /system/getty.
[   12.765629][    T1] systemd[1]: Created slice Slice /system/modprobe.
[  OK  ] Created slice Slice /system/modprobe.
[   12.766675][    T1] systemd[1]: Created slice Slice /system/serial-getty.
[  OK  ] Created slice Slice /system/serial-getty.
[   12.767695][    T1] systemd[1]: Created slice User and Session Slice.
[  OK  ] Created slice User and Session Slice.
[   12.768542][    T1] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[   12.769718][    T1] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[  OK  ] Set up automount Arbitrary Executa…ormats File System Automount Point.
[   12.770846][    T1] systemd[1]: Expecting device /dev/ttyS0...
         Expecting device /dev/ttyS0...
[   12.771505][    T1] systemd[1]: Reached target Local Encrypted Volumes.
[  OK  ] Reached target Local Encrypted Volumes.
[   12.772300][    T1] systemd[1]: Reached target Local Integrity Protected Volumes.
[  OK  ] Reached target Local Integrity Protected Volumes.
[   12.773197][    T1] systemd[1]: Reached target Remote File Systems.
[  OK  ] Reached target Remote File Systems.
[   12.773955][    T1] systemd[1]: Reached target Slice Units.
[  OK  ] Reached target Slice Units.
[   12.774635][    T1] systemd[1]: Reached target Swaps.
[  OK  ] Reached target Swaps.
[   12.775265][    T1] systemd[1]: Reached target Local Verity Protected Volumes.
[  OK  ] Reached target Local Verity Protected Volumes.
[   12.776875][    T1] systemd[1]: Listening on Process Core Dump Socket.
[  OK  ] Listening on Process Core Dump Socket.
[   12.778110][    T1] systemd[1]: Listening on Credential Encryption/Decryption.
[  OK  ] Listening on Credential Encryption/Decryption.
[   12.779072][    T1] systemd[1]: Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket (/dev/log).
[   12.779948][    T1] systemd[1]: Listening on Journal Sockets.
[  OK  ] Listening on Journal Sockets.
[   12.780754][    T1] systemd[1]: Listening on udev Control Socket.
[  OK  ] Listening on udev Control Socket.
[   12.781555][    T1] systemd[1]: Listening on udev Kernel Socket.
[  OK  ] Listening on udev Kernel Socket.
[   12.782852][    T1] systemd[1]: Mounting Huge Pages File System...
         Mounting Huge Pages File System...
[   12.787010][    T1] systemd[1]: Mounting POSIX Message Queue File System...
         Mounting POSIX Message Queue File System...
[   12.811388][    T1] systemd[1]: Mounting Kernel Debug File System...
         Mounting Kernel Debug File System...
[   12.817019][    T1] systemd[1]: Mounting Kernel Trace File System...
         Mounting Kernel Trace File System...
[   12.820391][    T1] systemd[1]: Mounting Temporary Directory /tmp...
         Mounting Temporary Directory /tmp...
[   12.822959][    T1] systemd[1]: Starting Create List of Static Device Nodes...
         Starting Create List of Static Device Nodes...
[   12.827062][    T1] systemd[1]: Starting Load Kernel Module configfs...
         Starting Load Kernel Module configfs...
[   12.830393][    T1] systemd[1]: Starting Load Kernel Module dm_mod...
         Starting Load Kernel Module dm_mod...
[   12.840395][    T1] systemd[1]: Starting Load Kernel Module drm...
         Starting Load Kernel Module drm...
[   12.843110][    T1] systemd[1]: Starting Load Kernel Module efi_pstore...
         Starting Load Kernel Module efi_pstore...
[   12.847314][    T1] systemd[1]: Starting Load Kernel Module fuse...
         Starting Load Kernel Module fuse...
[   12.851903][    T1] systemd[1]: Starting Load Kernel Module loop...
         Starting Load Kernel Module loop...
[   12.855616][    T1] systemd[1]: Clear Stale Hibernate Storage Info was skipped because of an unmet condition check (ConditionPathExists=/sys/firmware/efi/efivars/HibernateLocation-8cf2644b-4b0b-428f-9387-6d876050dc67).
[   12.857576][  T164] device-mapper: uevent: version 1.0.3
[   12.857675][  T164] device-mapper: ioctl: 4.48.0-ioctl (2023-03-01) initialised: dm-devel@lists.linux.dev
[   12.863241][    T1] systemd[1]: Starting Journal Service...
         Starting Journal Service...
[   12.867057][    T1] systemd[1]: Starting Load Kernel Modules...
         Starting Load Kernel Modules...
[   12.875647][    T1] systemd[1]: Starting Remount Root and Kernel File Systems...
         Starting Remount Root and Kernel File Systems...
[   12.880014][    T1] systemd[1]: Starting Load udev Rules from Credentials...
         Starting Load udev Rules from Credentials...
[   12.886445][    T1] systemd[1]: Starting Coldplug All udev Devices...
         Starting Coldplug All udev Devices...
[   12.895541][    T1] systemd[1]: Mounted Huge Pages File System.
[  OK  ] Mounted Huge Pages File System.
[   12.897427][  T171] virtio-mmio: Registering device virtio-mmio.0 at 0xd0000000-0xd0000fff, IRQ 5.
[   12.898002][  T171] virtio-mmio: Registering device virtio-mmio.1 at 0xd0001000-0xd0001fff, IRQ 6.
[   12.898320][  T167] fuse: init (API version 7.40)
[   12.900343][    T1] systemd[1]: Mounted POSIX Message Queue File System.
[  OK  ] Mounted POSIX Message Queue File System.
[   12.903091][  T169] loop: module loaded
[   12.903946][    T1] systemd[1]: Mounted Kernel Debug File System.
[  OK  ] Mounted Kernel Debug File System.
[   12.905028][    T1] systemd[1]: Mounted Kernel Trace File System.
[  OK  ] Mounted Kernel Trace File System.
[   12.906091][    T1] systemd[1]: Mounted Temporary Directory /tmp.
[  OK  ] Mounted Temporary Directory /tmp.
[   12.907280][    T1] systemd[1]: Finished Create List of Static Device Nodes.
[  OK  ] Finished Create List of Static Device Nodes.
[   12.908731][    T1] systemd[1]: modprobe@configfs.service: Deactivated successfully.
[   12.909379][    T1] systemd[1]: Finished Load Kernel Module configfs.
[  OK  ] Finished Load Kernel Module configfs.
[   12.910685][    T1] systemd[1]: modprobe@dm_mod.service: Deactivated successfully.
[   12.911305][    T1] systemd[1]: Finished Load Kernel Module dm_mod.
[  OK  ] Finished Load Kernel Module dm_mod.
[   12.912576][    T1] systemd[1]: modprobe@drm.service: Deactivated successfully.
[   12.913189][    T1] systemd[1]: Finished Load Kernel Module drm.
[  OK  ] Finished Load Kernel Module drm.
[   12.914584][    T1] systemd[1]: modprobe@efi_pstore.service: Deactivated successfully.
[   12.915199][    T1] systemd[1]: Finished Load Kernel Module efi_pstore.
[   12.915578][  T170] systemd-journald[170]: Collecting audit messages is disabled.
[  OK  ] Finished Load Kernel Module efi_pstore.
[   12.920729][    T1] systemd[1]: modprobe@fuse.service: Deactivated successfully.
[   12.921339][    T1] systemd[1]: Finished Load Kernel Module fuse.
[  OK  ] Finished Load Kernel Module fuse.
[   12.922622][    T1] systemd[1]: modprobe@loop.service: Deactivated successfully.
[   12.923232][    T1] systemd[1]: Finished Load Kernel Module loop.
[  OK  ] Finished Load Kernel Module loop.
[   12.927101][    T1] systemd[1]: Started Journal Service.
[  OK  ] Started Journal Service.
[  OK  ] Finished Load Kernel Modules.
[  OK  ] Finished Remount Root and Kernel File Systems.
[  OK  ] Finished Load udev Rules from Credentials.
         Mounting FUSE Control File System...
         Mounting Kernel Configuration File System...
         Starting Apply Kernel Variables for 6.10.5-1-default...
         Starting Rebuild Hardware Database...
         Starting Flush Journal to Persistent Storage...
         Starting Load/Save OS Random Seed...
         Starting Create Static Device Nodes in /dev gracefully...
[  OK  ] Mounted FUSE Control File System.
[  OK  ] Mounted Kernel Configuration File System.
[  OK  ] Finished Apply Kernel Variables for 6.10.5-1-default.
[   12.990790][  T170] systemd-journald[170]: Received client request to flush runtime journal.
         Starting Apply Kernel Variables...
[  OK  ] Finished Load/Save OS Random Seed.
[  OK  ] Finished Flush Journal to Persistent Storage.
[  OK  ] Finished Coldplug All udev Devices.
[  OK  ] Finished Create Static Device Nodes in /dev gracefully.
[  OK  ] Finished Apply Kernel Variables.
         Starting Create System Users...
[  OK  ] Finished Create System Users.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Finished Create Static Device Nodes in /dev.
[  OK  ] Reached target Preparation for Local File Systems.
[  OK  ] Reached target Local File Systems.
[  OK  ] Listening on Boot Entries Service Socket.
[  OK  ] Listening on System Extension Image Management.
         Starting Create System Files and Directories...
[  OK  ] Finished Create System Files and Directories.
         Starting Rebuild Journal Catalog...
[  OK  ] Finished Rebuild Journal Catalog.
[  OK  ] Finished Rebuild Hardware Database.
         Starting Rule-based Manager for Device Events and Files...
         Starting Update is Completed...
[  OK  ] Finished Update is Completed.
[  OK  ] Started Rule-based Manager for Device Events and Files.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Watch for changes in CA certificates.
[  OK  ] Started Discard unused filesystem blocks once a week.
[  OK  ] Started Daily rotation of log files.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Path Units.
[  OK  ] Reached target Timer Units.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on Hostname Service Socket.
[  OK  ] Reached target Socket Units.
[  OK  ] Reached target Basic System.
         Starting D-Bus System Message Bus...
         Starting Restore /run/initramfs on shutdown...
[   13.406414][  T228] NET: Registered PF_VSOCK protocol family
         Starting Apply settings from /etc/sysconfig/keyboard...
[  OK  ] Started Nitro Enclave Alive Service.
         Starting User Login Management...
         Starting Permit User Sessions...
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Finished Restore /run/initramfs on shutdown.
[  OK  ] Found device /dev/ttyS0.
[  OK  ] Finished Permit User Sessions.
[   13.516511][  T239] input: PC Speaker as /devices/platform/pcspkr/input/input0
[   13.531483][  T229] cryptd: max_cpu_qlen set to 1000
[  OK  ] Started Getty on tty1.
[   13.544990][  T250] AVX2 version of gcm_enc/dec engaged.
[   13.545502][  T250] AES CTR mode by8 optimization enabled
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started User Login Management.
[  OK  ] Stopped Apply settings from /etc/sysconfig/keyboard.
         Starting Virtual Console Setup...
[  OK  ] Finished Virtual Console Setup.
         Starting Apply settings from /etc/sysconfig/keyboard...
[  OK  ] Finished Apply settings from /etc/sysconfig/keyboard.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.

agraf commented 1 week ago

Turns out enabling sshd is not terribly difficult either. In Tumbleweed, the systemd-experimental package contains the vsock generator which automatically listens on port 22 for ssh requests. If we also enable sshd (to generate host keys) and allow root logins (to ssh as root), we can use an ssh client config like this to access sshd inside the Enclave: https://github.com/agraf/kiwi/commit/34266f72ebd6090681ad7332f2acee649e4c9ad8

~/.ssh/config

host *.vsock
  ProxyCommand ~/bin/vsock-ssh.sh %h

~/bin/vsock-ssh.sh

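#!/bin/bash
# ProxyCommand helper: takes a host name like "21.vsock" and connects
# stdin/stdout to port 22 of the vsock CID given by the first label.

# Strip everything from the first dot onwards, leaving the CID
CID="${1%%.*}"

socat - "VSOCK-CONNECT:${CID}:22"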
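
Where socat is not available or was built without vsock support, the same ProxyCommand idea can be sketched in Python. This is an illustration only, not something taken from the commit above: the function names are hypothetical, `socket.AF_VSOCK` requires Linux and Python 3.7 or newer, and the relay stops as soon as either side closes.

```python
import os
import selectors
import socket

SSH_PORT = 22  # vsock port used by the socat call above


def cid_from_host(host: str) -> int:
    """Extract the CID from a host name such as '21.vsock'."""
    label = host.split(".", 1)[0]
    if not (label.isascii() and label.isdigit()):
        raise ValueError(f"no numeric CID in host name: {host!r}")
    return int(label)


def write_all(fd: int, data: bytes) -> None:
    """os.write may write fewer bytes than requested, so loop until done."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def relay(sock: socket.socket, stdin_fd: int = 0, stdout_fd: int = 1) -> None:
    """Copy bytes between the socket and stdin/stdout until one side closes."""
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ, "sock")
        sel.register(stdin_fd, selectors.EVENT_READ, "stdin")
        while True:
            for key, _ in sel.select():
                if key.data == "sock":
                    data = sock.recv(65536)
                    if not data:
                        return
                    write_all(stdout_fd, data)
                else:
                    data = os.read(stdin_fd, 65536)
                    if not data:
                        return
                    sock.sendall(data)


def proxy(host: str) -> None:
    """Connect to <CID>:22 over vsock and relay the ssh byte stream."""
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as sock:
        sock.connect((cid_from_host(host), SSH_PORT))
        relay(sock)
```

A small wrapper script that calls `proxy(sys.argv[1])` could then replace vsock-ssh.sh in the ProxyCommand line.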

Test output

$ ssh root@21.vsock
The authenticity of host '21.vsock (<no hostip for proxy command>)' can't be established.
ECDSA key fingerprint is SHA256:ad7fGMQvnb5U38wbIG9O0gXRkg/B2BTZCYtMytoJ00A.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '21.vsock' (ECDSA) to the list of known hosts.
Password:
Have a lot of fun...
localhost:~ # cat /etc/os-release
NAME="openSUSE Tumbleweed"
# VERSION="20240818"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20240818"
PRETTY_NAME="openSUSE Tumbleweed"
ANSI_COLOR="0;32"
# CPE 2.3 format, boo#1217921
CPE_NAME="cpe:2.3:o:opensuse:tumbleweed:20240818:*:*:*:*:*:*:*"
#CPE 2.2 format
#CPE_NAME="cpe:/o:opensuse:tumbleweed:20240818"
BUG_REPORT_URL="https://bugzilla.opensuse.org"
SUPPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Tumbleweed"
LOGO="distributor-logo-Tumbleweed"

schaefi commented 3 days ago

@agraf great investigation, tests, and next-steps list :+1: I will adapt the missing parts once I'm back from vacation next week, after the cloud work that needs to be finished first.