containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

Would be nice to have a different 'buildah pull' exit code for network failures #1499

Closed: debarshiray closed this 5 years ago

debarshiray commented 5 years ago

Description

Every once in a while, I come across people who failed to create a toolbox container because buildah pull errored out for some reason. For example:

$ toolbox create
toolbox: failed to pull base image fedora-toolbox:30

And if you use the --verbose flag you get to see the spew from buildah pull.

$ toolbox --verbose create
toolbox: Fedora generational core is f30
toolbox: base image is fedora-toolbox:30
toolbox: customized user-specific image is fedora-toolbox-jsgrant:30
toolbox: container is fedora-toolbox-jsgrant:30
toolbox: checking if image fedora-toolbox-jsgrant:30 already exists
ERRO[0000] exit status 1                                
toolbox: looking for image localhost/fedora-toolbox:30
Pulling docker://localhost/fedora-toolbox:30
ERRO[0000] exit status 1                                
toolbox: looking for image registry.fedoraproject.org/f30/fedora-toolbox:30
Pulling docker://registry.fedoraproject.org/f30/fedora-toolbox:30
ERRO[0000] exit status 1                                
toolbox: failed to pull base image fedora-toolbox:30

More often than not (almost always?) these are transient network errors. I have seen them myself once or twice, and trying again gets you past the failure. It would be nice if toolbox could show a more specific error message that makes it clear the failure was caused by the network or something similar, instead of a generic string that makes people want to file a bug. :)

A brief reading of cmd/buildah/pull.go and pull.go makes me think that there's no specific exit code to indicate why the buildah pull failed, but I can't be sure because these failures are not deterministic.

Output of rpm -q buildah or apt list buildah:

buildah-1.8-13.dev.git3b497ff.fc29.x86_64

Output of buildah version:

Version:         1.8-dev
Go Version:      go1.11.5
Image Spec:      1.0.0
Runtime Spec:    1.0.0
CNI Spec:        0.4.0
libcni Version:  
Git Commit:      
Built:           Thu Jan  1 01:00:00 1970
OS/Arch:         linux/amd64

Output of cat /etc/release:

NAME=Fedora
VERSION="29 (Workstation Edition)"
ID=fedora
VERSION_ID=29
VERSION_CODENAME=""
PLATFORM_ID="platform:f29"
PRETTY_NAME="Fedora 29 (Workstation Edition)"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:29"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f29/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=29
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=29
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation

Output of uname -a:

Linux kolache 4.20.15-200.fc29.x86_64 #1 SMP Mon Mar 11 16:01:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

[storage]
  driver = "overlay"
  runroot = "/run/user/1000"
  graphroot = "/home/rishi/.local/share/containers/storage"
  [storage.options]
    mount_program = "/usr/bin/fuse-overlayfs"

debarshiray commented 5 years ago

In this particular case, on Fedora 30, the buildah --debug pull output pointed at a possible TLS problem:

pinging docker registry returned: Get https://registry.fedoraproject.org/v2/: local error: tls: unexpected message

Logs in full:

$ buildah --debug pull registry.fedoraproject.org/f30/fedora-toolbox
DEBU[0000] running [buildah-in-a-user-namespace --debug pull registry.fedoraproject.org/f30/fedora-toolbox] with environment [SHELL=/bin/bash HISTCONTROL=ignoredups HISTSIZE=1000 HOSTNAME=deskitute XMODIFIERS=@im=ibus ENV=/usr/share/Modules/init/profile.sh PWD=/home/jsgrant LOGNAME=jsgrant XDG_SESSION_TYPE=tty MODULESHOME=/usr/share/Modules MANPATH=: HOME=/home/jsgrant LANG=en_US.UTF-8 LS_COLORS=rs=0:di=38;5;33:ln=38;5;51:mh=00:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=01;05;37;41:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;40:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.zst=38;5;9:*.tzst=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.wim=38;5;9:*.swm=38;5;9:*.dwm=38;5;9:*.esd=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.mjpg=38;5;13:*.mjpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.m4a=
38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.oga=38;5;45:*.opus=38;5;45:*.spx=38;5;45:*.xspf=38;5;45: SSH_CONNECTION=192.168.1.174 49344 192.168.1.212 22 MODULEPATH_modshare=/usr/share/modulefiles:1:/usr/share/Modules/modulefiles:1:/etc/modulefiles:1 XDG_SESSION_CLASS=user SELINUX_ROLE_REQUESTED= TERM=xterm-256color LESSOPEN=||/usr/bin/lesspipe.sh %s USER=jsgrant MODULES_RUN_QUARANTINE=LD_LIBRARY_PATH LOADEDMODULES= SELINUX_USE_CURRENT_RANGE= SHLVL=1 BASH_ENV=/usr/share/Modules/init/bash XDG_SESSION_ID=5 XDG_RUNTIME_DIR=/run/user/1000 SSH_CLIENT=192.168.1.174 49344 22 XDG_DATA_DIRS=/home/jsgrant/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share PATH=/usr/share/Modules/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin SELINUX_LEVEL_REQUESTED= MODULEPATH=/etc/scl/modulefiles:/usr/share/Modules/modulefiles:/etc/modulefiles:/usr/share/modulefiles DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus MAIL=/var/spool/mail/jsgrant SSH_TTY=/dev/pts/3 MODULES_CMD=/usr/share/Modules/libexec/modulecmd.tcl BASH_FUNC_switchml%%=() {  typeset swfound=1;
 if [ "${MODULES_USE_COMPAT_VERSION:-0}" = '1' ]; then
 typeset swname='main';
 if [ -e /usr/share/Modules/libexec/modulecmd.tcl ]; then
 typeset swfound=0;
 unset MODULES_USE_COMPAT_VERSION;
 fi;
 else
 typeset swname='compatibility';
 if [ -e /usr/share/Modules/libexec/modulecmd-compat ]; then
 typeset swfound=0;
 MODULES_USE_COMPAT_VERSION=1;
 export MODULES_USE_COMPAT_VERSION;
 fi;
 fi;
 if [ $swfound -eq 0 ]; then
 echo "Switching to Modules $swname version";
 source /usr/share/Modules/init/bash;
 else
 echo "Cannot switch to Modules $swname version, command not found";
 return 1;
 fi
} BASH_FUNC_module%%=() {  _module_raw "$@" 2>&1
} BASH_FUNC_scl%%=() {  if [ "$1" = "load" -o "$1" = "unload" ]; then
 eval "module $@";
 else
 /usr/bin/scl "$@";
 fi
} BASH_FUNC__module_raw%%=() {  unset _mlshdbg;
 if [ "${MODULES_SILENT_SHELL_DEBUG:-0}" = '1' ]; then
 case "$-" in 
 *v*x*)
 set +vx;
 _mlshdbg='vx'
 ;;
 *v*)
 set +v;
 _mlshdbg='v'
 ;;
 *x*)
 set +x;
 _mlshdbg='x'
 ;;
 *)
 _mlshdbg=''
 ;;
 esac;
 fi;
 unset _mlre _mlIFS;
 if [ -n "${IFS+x}" ]; then
 _mlIFS=$IFS;
 fi;
 IFS=' ';
 for _mlv in ${MODULES_RUN_QUARANTINE:-};
 do
 if [ "${_mlv}" = "${_mlv##*[!A-Za-z0-9_]}" -a "${_mlv}" = "${_mlv#[0-9]}" ]; then
 if [ -n "`eval 'echo ${'$_mlv'+x}'`" ]; then
 _mlre="${_mlre:-}${_mlv}_modquar='`eval 'echo ${'$_mlv'}'`' ";
 fi;
 _mlrv="MODULES_RUNENV_${_mlv}";
 _mlre="${_mlre:-}${_mlv}='`eval 'echo ${'$_mlrv':-}'`' ";
 fi;
 done;
 if [ -n "${_mlre:-}" ]; then
 eval `eval ${_mlre}/usr/bin/tclsh /usr/share/Modules/libexec/modulecmd.tcl bash '"$@"'`;
 else
 eval `/usr/bin/tclsh /usr/share/Modules/libexec/modulecmd.tcl bash "$@"`;
 fi;
 _mlstatus=$?;
 if [ -n "${_mlIFS+x}" ]; then
 IFS=$_mlIFS;
 else
 unset IFS;
 fi;
 unset _mlre _mlv _mlrv _mlIFS;
 if [ -n "${_mlshdbg:-}" ]; then
 set -$_mlshdbg;
 fi;
 unset _mlshdbg;
 return $_mlstatus
} _=/usr/bin/buildah _BUILDAH_STARTED_IN_USERNS=1 BUILDAH_ISOLATION=rootless], UID map [{HostID:1000 ContainerID:0 Size:1} {HostID:100000 ContainerID:1 Size:65536}], and GID map [{HostID:1000 ContainerID:0 Size:1} {HostID:100000 ContainerID:1 Size:65536}] 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] error parsing image name "registry.fedoraproject.org/f30/fedora-toolbox", trying with transport "docker://": Invalid image name "registry.fedoraproject.org/f30/fedora-toolbox", expected colon-separated transport:reference 
Pulling docker://registry.fedoraproject.org/f30/fedora-toolbox
DEBU[0000] parsed image name "docker://registry.fedoraproject.org/f30/fedora-toolbox" 
DEBU[0000] registry "registry.fedoraproject.org" is not marked as blocked in registries configuration "/etc/containers/registries.conf" 
DEBU[0000] parsed reference into "[overlay@/var/home/jsgrant/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs]registry.fedoraproject.org/f30/fedora-toolbox:latest" 
DEBU[0000] parsed reference into "[overlay@/var/home/jsgrant/.local/share/containers/storage+/run/user/1000:overlay.mount_program=/usr/bin/fuse-overlayfs]registry.fedoraproject.org/f30/fedora-toolbox:latest" 
DEBU[0000] copying "docker://registry.fedoraproject.org/f30/fedora-toolbox" to "registry.fedoraproject.org/f30/fedora-toolbox:latest" 
DEBU[0000] Using registries.d directory /etc/containers/registries.d for sigstore configuration 
DEBU[0000]  Using "default-docker" configuration        
DEBU[0000]  No signature storage configuration found for registry.fedoraproject.org/f30/fedora-toolbox:latest 
DEBU[0000] Looking for TLS certificates and private keys in /etc/docker/certs.d/registry.fedoraproject.org 
DEBU[0000] Error creating parent directories for blob-info-cache-v1.boltdb, using a memory-only cache: mkdir /var/lib/containers: permission denied 
DEBU[0000] GET https://registry.fedoraproject.org/v2/   
DEBU[0000] Ping https://registry.fedoraproject.org/v2/ err Get https://registry.fedoraproject.org/v2/: local error: tls: unexpected message (&url.Error{Op:"Get", URL:"https://registry.fedoraproject.org/v2/", Err:(*net.OpError)(0xc0004f71d0)}) 
DEBU[0000] GET https://registry.fedoraproject.org/v1/_ping 
DEBU[0000] Ping https://registry.fedoraproject.org/v1/_ping err Get https://registry.fedoraproject.org/v1/_ping: local error: tls: unexpected message (&url.Error{Op:"Get", URL:"https://registry.fedoraproject.org/v1/_ping", Err:(*net.OpError)(0xc000300280)}) 
DEBU[0000] error copying src image ["docker://registry.fedoraproject.org/f30/fedora-toolbox"] to dest image ["registry.fedoraproject.org/f30/fedora-toolbox:latest"] err: Error determining manifest MIME type for docker://registry.fedoraproject.org/f30/fedora-toolbox:latest: pinging docker registry returned: Get https://registry.fedoraproject.org/v2/: local error: tls: unexpected message 
1 error occurred:
    * Error determining manifest MIME type for docker://registry.fedoraproject.org/f30/fedora-toolbox:latest: pinging docker registry returned: Get https://registry.fedoraproject.org/v2/: local error: tls: unexpected message

rhatdan commented 5 years ago

@ashley-cui Would you look into this?

rhatdan commented 5 years ago

@QiWang19 Could you look into this?

QiWang19 commented 5 years ago

The error might be returned by buildah/vendor/github.com/containers/image/docker/docker_client.go. Does buildah return different exit codes for different types of errors? I only see exit status 1.

rhatdan commented 5 years ago

Currently no, but the request here is that we do, I believe.

QiWang19 commented 5 years ago

In https://github.com/containers/buildah/issues/1504 we agreed to close this. Reopen if this happens again.

debarshiray commented 4 years ago

This was a feature request for buildah pull to use a different exit code to represent network failures.

#1504 fixes a bug that could be abused to repeatedly trigger a network failure, but it doesn't add any new exit codes to represent those situations.

debarshiray commented 4 years ago

I filed a similar feature request for podman pull too: https://github.com/containers/libpod/issues/6190