rootless-containers / usernetes

Kubernetes without the root privileges
https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2033-kubelet-in-userns-aka-rootless
Apache License 2.0

Exiting early? #310

Closed by vsoch 10 months ago

vsoch commented 10 months ago

Apologies if this is a bad question. I'm building VMs with usernetes and flux, and the VM does seem to build successfully: nothing is missing and everything works. The issue I'm hitting is that the command always exits early. For example, I see:

WARN[0000] `mountType: virtiofs` on Linux is experimental 
INFO[0000] Attempting to download the image              arch=x86_64 digest= location="https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
INFO[0001] Using cache "/home/vanessa/.cache/lima/download/by-url-sha256/6b15519b255a45a238b7a8154cd57da120344ea388143af2821bb790af7fc587/data" 
INFO[0001] [hostagent] hostagent socket created at /home/vanessa/.lima/flux-1/ha.sock 
INFO[0001] [hostagent] Starting QEMU (hint: to watch the boot progress, see "/home/vanessa/.lima/flux-1/serial*.log") 
INFO[0001] SSH Local Port: 43741                        
INFO[0001] [hostagent] Waiting for the essential requirement 1 of 3: "ssh" 
INFO[0011] [hostagent] Waiting for the essential requirement 1 of 3: "ssh" 
INFO[0021] [hostagent] Waiting for the essential requirement 1 of 3: "ssh" 
INFO[0022] [hostagent] The essential requirement 1 of 3 is satisfied 
INFO[0022] [hostagent] Waiting for the essential requirement 2 of 3: "user session is ready for ssh" 
INFO[0022] [hostagent] The essential requirement 2 of 3 is satisfied 
INFO[0022] [hostagent] Waiting for the essential requirement 3 of 3: "the guest agent to be running" 
INFO[0022] [hostagent] The essential requirement 3 of 3 is satisfied 
INFO[0022] [hostagent] Waiting for the optional requirement 1 of 1: "user probe 1/1" 
INFO[0022] [hostagent] Forwarding "/run/lima-guestagent.sock" (guest) to "/home/vanessa/.lima/flux-1/ga.sock" (host) 
INFO[0022] [hostagent] Not forwarding TCP 127.0.0.53:53 
INFO[0022] [hostagent] Not forwarding TCP 0.0.0.0:22    
INFO[0022] [hostagent] Not forwarding TCP [::]:22       
FATA[0601] did not receive an event with the "running" status 

But when I look in the logs, the build is still running. Previously the problem was my probe: it had a timeout that was being hit. Now I have a probe with a generous timeout that waits for a file that isn't generated until the last user provision block:

probes:
- script: |
    #!/bin/bash
    set -eux -o pipefail
    if ! timeout 1200s bash -c "until test -f /tmp/finished.txt; do sleep 10; done"; then
        echo >&2 "build did not finish within 1200s"
        exit 1
    fi
  hint: |
    build is finished.

Is cloud-init running things at the same time? Should I have a specific probe for each block, to ensure that each block finishes? Thanks for the advice!
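For the per-block idea, here is a minimal sketch, assuming each provision block creates its own marker file when it completes. The helper name `wait_for_file` and any marker paths other than `/tmp/finished.txt` are hypothetical and not part of Lima. It may also matter that the FATA line is stamped at 601 seconds, which suggests something gives up after roughly 10 minutes, well before the probe's 1200s timeout could elapse.

```shell
#!/bin/bash
# Hypothetical helper (not a Lima feature): wait until a marker file exists
# or a deadline passes.
# Arguments: <path> <timeout-seconds> <poll-interval-seconds>
wait_for_file() {
    local path="$1" limit="$2" interval="$3"
    # SECONDS is a bash built-in counting seconds since the shell started.
    local deadline=$(( SECONDS + limit ))
    until test -f "$path"; do
        if (( SECONDS >= deadline )); then
            echo >&2 "timed out after ${limit}s waiting for ${path}"
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

Inside a probe script this could be called once per block, for example `wait_for_file /tmp/finished.txt 1200 10 || exit 1`, so that the error message names the marker that never appeared.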

vsoch commented 10 months ago

Oops, I meant to post this on lima. Let me fix that :)