monogon-dev / monogon

The Monogon Monorepo. May contain traces of peanuts and a ✨pure Go Linux userland✨. Work in progress!
https://monogon.tech
Apache License 2.0

//metropolis:launch-cluster and //metropolis:launch regression due to edk2 update #306

Closed leoluk closed 1 month ago

leoluk commented 1 month ago

Reproduced on a standard NixOS setup at commit 1000f7311c2f2af53120858fb68bfa248f504475, inside the nix-shell.

$ bazel run //metropolis:launch-cluster -- --help 
IntelliJ found at /home/leopold/.local/share/JetBrains/IntelliJIdea2024.1, aspect repository already patched.
INFO: Analyzed target //metropolis:launch-cluster (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //metropolis/test/launch/cli/launch-cluster:launch-cluster up-to-date:
  bazel-bin/metropolis/test/launch/cli/launch-cluster/launch-cluster_/launch-cluster
INFO: Elapsed time: 5.187s, Critical Path: 4.47s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/metropolis/test/launch/cli/launch-cluster/launch-cluster_/launch-cluster --help
TT|          test launch ! Node 1 logs at /tmp/cluster-4083377814/node-1.txt
TT|          test launch ! Cluster: Starting node 1...
TT|          test launch ! Cluster: generating node QCOW2 snapshot image: /home/leopold/.cache/bazel/_bazel_leopold/aaaf0322e22d06a18d1422185a3fd3a8/execroot/_main/bazel-out/k8-fastbuild/bin/metropolis/node/image.img -> /tmp/cluster-4083377814/node_state2708915804/image.qcow2
TT|          test launch ! Starting to manufacture TPM for /tmp/cluster-4083377814/node_state2708915804/tpm... (&{Manufacturer:Monogon Version:1.0 Model:TestCluster})
TT|          test launch ! Successfully manufactured TPM for /tmp/cluster-4083377814/node_state2708915804/tpm
TT|          test launch ! Running node0:
TT|                      |   qemu-system-x86_64
TT|                      |   -machine q35
TT|                      |   -accel kvm
TT|                      |   -nographic -nodefaults -m 2048
TT|                      |   -cpu host
TT|                      |   -smp sockets=1,cpus=1,cores=2,threads=2,maxcpus=4
TT|                      |   -drive if=pflash,format=raw,readonly=on,file=/home/leopold/.cache/bazel/_bazel_leopold/aaaf0322e22d06a18d1422185a3fd3a8/execroot/_main/bazel-out/k8-fastbuild/bin/external/edk2/OVMF_CODE.fd
TT|                      |   -drive if=pflash,format=raw,file=/tmp/cluster-4083377814/node_state2708915804/OVMF_VARS.fd
TT|                      |   -drive if=virtio,format=qcow2,cache=unsafe,file=/tmp/cluster-4083377814/node_state2708915804/image.qcow2
TT|                      |   -netdev socket,id=net0,fd=3
TT|                      |   -device virtio-net-pci,netdev=net0,mac=7a:72:58:ae:ac:5d
TT|                      |   -chardev socket,id=chrtpm,path=/tmp/cluster-1394351762/node_sock1894829596/tpm-socket
TT|                      |   -tpmdev emulator,id=tpm0,chardev=chrtpm
TT|                      |   -device tpm-tis,tpmdev=tpm0
TT|                      |   -device virtio-rng-pci
TT|                      |   -serial stdio
TT|                      |   -no-reboot -fw_cfg name=dev.monogon.metropolis/parameters.pb,file=/tmp/cluster-4083377814/node_state2708915804/parameters.pb
TT|                      |   -object filter-dump,id=net0,netdev=net0,file=/tmp/cluster-4083377814/node_state2708915804/net0.pcap
TT|                      |   
TT|          test launch ! Node: Starting...
TT|          test launch ! Nanoswitch logs at /tmp/cluster-4083377814/nanoswitch.txt
TT|          test launch ! Running nanoswitch:
TT|                      |   qemu-system-x86_64
TT|                      |   -nodefaults -no-user-config -nographic -no-reboot -accel kvm
TT|                      |   -cpu host
TT|                      |   -m 1G
TT|                      |   -bios /home/leopold/.cache/bazel/_bazel_leopold/aaaf0322e22d06a18d1422185a3fd3a8/execroot/_main/bazel-out/k8-fastbuild/bin/external/com_github_bonzini_qboot/bios.bin
TT|                      |   -M microvm,x-option-roms=off,pic=off,pit=off,rtc=off,isa-serial=off
TT|                      |   -kernel /home/leopold/.cache/bazel/_bazel_leopold/aaaf0322e22d06a18d1422185a3fd3a8/execroot/_main/bazel-out/k8-fastbuild-ST-844af81647b0/bin/osbase/test/ktest/vmlinux
TT|                      |   -append reboot=t console=hvc0 quiet 
TT|                      |   -initrd /home/leopold/.cache/bazel/_bazel_leopold/aaaf0322e22d06a18d1422185a3fd3a8/execroot/_main/bazel-out/k8-fastbuild/bin/metropolis/test/nanoswitch/initramfs.cpio.zst
TT|                      |   -device virtio-rng-device,max-bytes=1024,period=1000
TT|                      |   -device virtio-serial-device,max_ports=16
TT|                      |   -chardev stdio,id=con0
TT|                      |   -device virtconsole,chardev=con0
TT|                      |   -netdev user,hostfwd=tcp::39491-:6443,hostfwd=tcp::42227-:6444,hostfwd=tcp::38971-:1080,hostfwd=tcp::46489-:7835,hostfwd=tcp::42385-:7837,id=usernet0,net=10.42.0.0/24,dhcpstart=10.42.0.10
TT|                      |   -device virtio-net-device,netdev=usernet0,mac=02:72:82:bf:c3:56
TT|                      |   -netdev socket,id=net0,fd=3
TT|                      |   -device virtio-net-device,netdev=net0
TT|                      |   -netdev socket,id=net1,fd=4
TT|                      |   -device virtio-net-device,netdev=net1
TT|                      |   -netdev socket,id=net2,fd=5
TT|                      |   -device virtio-net-device,netdev=net2
TT|                      |   -object filter-dump,id=usernet0,netdev=usernet0,file=/tmp/cluster-4083377814/nanoswitch.pcap
TT|                      |   
TT|          test launch ! Cluster: retrieving owner certificate (this can take a few seconds while the first node boots)...
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"
TT|          test launch ! Cluster: cluster UNAVAILABLE: Escrow call failed: connection error: desc = "transport: Error while dialing: socks connect tcp localhost:38971->10.1.0.2:7835: dial tcp [::1]:38971: connect: connection refused"

$ cat /tmp/cluster-735349292/node-1.txt | cat -v 
^[[2J^[[01;01H^[[=3h^[[2J^[[01;01H^[[2J^[[01;01H^[[=3h^[[2J^[[01;01HBdsDxe: loading Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x3,0x0)^M
BdsDxe: starting Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x3,0x0)^M
[TRACE]: external/crate_index_efi__uefi-0.24.0/src/fs/file_system/fs.rs@327: Can't open file \EFI\metropolis\loader_state.pb: Error { status: NOT_FOUND, data: () }^M
Unable to load A/B loader state, using default slot A: while reading state file: IO error^M
Booting into Slot A^M
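
The node log above is shown through `cat -v`, which makes the firmware's terminal control sequences visible as `^[[...` and the carriage returns as `^M`. For reading the log rather than inspecting the raw bytes, a small filter can strip them instead. This is only a sketch: `strip_console_log` is a hypothetical helper name, not something in the repository, and it assumes the log contains only simple CSI sequences like the ones seen here.

```shell
# Hypothetical helper (not part of the monorepo): reads a serial console log
# on stdin and writes it to stdout with CSI escape sequences (ESC [ ... letter)
# and trailing carriage returns removed.
strip_console_log() {
  esc=$(printf '\033')   # literal ESC byte
  cr=$(printf '\r')      # literal CR byte
  sed -e "s/${esc}\[[0-9;=?]*[A-Za-z]//g" -e "s/${cr}\$//"
}
```

Usage, with the log path from the transcript above: `strip_console_log < /tmp/cluster-735349292/node-1.txt`.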

$ cat /tmp/cluster-735349292/nanoswitch.txt
          supervisor I supervisor processor started
                root I Starting NanoSwitch, a tiny TOR switch emulator
                root I Assigned interface eth1 to bridge
                root I Assigned interface eth2 to bridge
                root I Assigned interface eth3 to bridge
         dhcp-client I DISCOVERING => REQUESTING
         dhcp-client I REQUESTING => BOUND

$ bazel run //metropolis:launch | cat -v
IntelliJ found at /home/leopold/.local/share/JetBrains/IntelliJIdea2024.1, aspect repository already patched.
Computing main repo mapping: 
Loading: 
Loading: 0 packages loaded
Analyzing: target //metropolis:launch (0 packages loaded, 0 targets configured)
Analyzing: target //metropolis:launch (0 packages loaded, 0 targets configured)
[0 / 1] [Prepa] BazelWorkspaceStatusAction stable-status.txt
INFO: Analyzed target //metropolis:launch (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //metropolis/test/launch/cli/launch:launch up-to-date:
  bazel-bin/metropolis/test/launch/cli/launch/launch_/launch
INFO: Elapsed time: 0.498s, Critical Path: 0.10s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/metropolis/test/launch/cli/launch/launch_/launch
TT|          test launch ! Cluster: generating node QCOW2 snapshot image: /home/leopold/.cache/bazel/_bazel_leopold/aaaf0322e22d06a18d1422185a3fd3a8/execroot/_main/bazel-out/k8-fastbuild/bin/metropolis/node/image.img -> /tmp/node_state2675665491/node_state959694878/image.qcow2
TT|          test launch ! Starting to manufacture TPM for /tmp/node_state2675665491/node_state959694878/tpm... (&{Manufacturer:Monogon Version:1.0 Model:TestCluster})
TT|          test launch ! Successfully manufactured TPM for /tmp/node_state2675665491/node_state959694878/tpm
TT|          test launch ! Running test-node:
TT|                      |   qemu-system-x86_64
TT|                      |   -machine q35
TT|                      |   -accel kvm
TT|                      |   -nographic -nodefaults -m 2048
TT|                      |   -cpu host
TT|                      |   -smp sockets=1,cpus=1,cores=2,threads=2,maxcpus=4
TT|                      |   -drive if=pflash,format=raw,readonly=on,file=/home/leopold/.cache/bazel/_bazel_leopold/aaaf0322e22d06a18d1422185a3fd3a8/execroot/_main/bazel-out/k8-fastbuild/bin/external/edk2/OVMF_CODE.fd
TT|                      |   -drive if=pflash,format=raw,file=/tmp/node_state2675665491/node_state959694878/OVMF_VARS.fd
TT|                      |   -drive if=virtio,format=qcow2,cache=unsafe,file=/tmp/node_state2675665491/node_state959694878/image.qcow2
TT|                      |   -netdev user,net=10.42.0.0/24,dhcpstart=10.42.0.10,hostfwd=tcp::7834-:7834,hostfwd=tcp::7835-:7835,hostfwd=tcp::7837-:7837,hostfwd=tcp::6443-:6443,hostfwd=tcp::6444-:6444,hostfwd=tcp::2345-:2345,hostfwd=tcp::7840-:7840,id=net0
TT|                      |   -device virtio-net-pci,netdev=net0,mac=fe:1b:76:50:51:0f
TT|                      |   -chardev socket,id=chrtpm,path=/tmp/node_sock3584756411/node_sock2907104668/tpm-socket
TT|                      |   -tpmdev emulator,id=tpm0,chardev=chrtpm
TT|                      |   -device tpm-tis,tpmdev=tpm0
TT|                      |   -device virtio-rng-pci
TT|                      |   -serial stdio
TT|                      |   -no-reboot -fw_cfg name=dev.monogon.metropolis/parameters.pb,file=/tmp/node_state2675665491/node_state959694878/parameters.pb
TT|                      |   
TT|          test launch ! Node: Starting...
^[[2J^[[01;01H^[[=3h^[[2J^[[01;01H^[[2J^[[01;01H^[[=3h^[[2J^[[01;01HBdsDxe: loading Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x3,0x0)^M
BdsDxe: starting Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x3,0x0)^M
[TRACE]: external/crate_index_efi__uefi-0.24.0/src/fs/file_system/fs.rs@327: Can't open file \EFI\metropolis\loader_state.pb: Error { status: NOT_FOUND, data: () }^M
Unable to load A/B loader state, using default slot A: while reading state file: IO error^M
Booting into Slot A^M

bazel test //... passes.