cirruslabs / cirrus-ci-docs

Documentation for Cirrus CI 📚
https://cirrus-ci.org
MIT License

Properly report startup script failures for VM-based tasks #728

Open sio opened 3 years ago

sio commented 3 years ago

Expected Behavior

The CI runner should fail fast with a meaningful error message when the container registry responds with "401 Unauthorized".
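As an illustration only (this is not Cirrus CI's actual code), the fail-fast behavior described above could be sketched in Python as follows. The `fetch` callable, the `PullError` exception, and the set of statuses treated as fatal are all hypothetical names and assumptions made for this example: transient failures are retried, while a response such as 401 stops the loop immediately with a message that names the status.

```python
import time

# Assumption: these statuses will not resolve themselves on retry.
FATAL_STATUSES = {401, 403, 404}


class PullError(Exception):
    """Raised when an image pull cannot succeed (hypothetical helper)."""


def pull_with_retries(fetch, max_attempts=5, delay_seconds=0.0):
    """Call fetch() until it returns 200, failing fast on fatal statuses.

    fetch is a hypothetical stand-in for a registry request and must
    return an HTTP status code. Returns the attempt number that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        status = fetch()
        if status == 200:
            return attempt
        if status in FATAL_STATUSES:
            # Retrying cannot help here, so report the real cause right away.
            raise PullError(
                f"registry responded {status} on attempt {attempt}; not retrying"
            )
        # Anything else is treated as transient: wait, then try again.
        time.sleep(delay_seconds)
    raise PullError(f"giving up after {max_attempts} attempts")
```

With a `fetch` that always returns 401, this sketch raises on the first attempt instead of looping until an outer timeout fires.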

Real Behavior

The CI runner waited and waited, probably retrying multiple times. After 15 minutes the job was cancelled and retried. The job was marked as failed only after 30 minutes, with the message "Agent is not responding!".

This error was entirely my fault, but Cirrus CI should have communicated it better and faster.

I discovered the cause of the error only after trying to run that container on another machine. docker run responded immediately with a helpful "Error response from daemon: Get https://ghcr.io/...: unauthorized." After I changed the container visibility to 'public', the CI pipeline started working as expected.

Link: https://cirrus-ci.com/task/5512720764108800

fkorotkov commented 3 years ago

Is it a privileged/KVM-enabled container?

sio commented 3 years ago

Yes, KVM-enabled

fkorotkov commented 3 years ago

Got it. Yeah, because it starts in a separate VM, we don't have much visibility into whether the startup script failed. Changed the description.