pytorch / builder

Continuous builder and binary build scripts for pytorch
BSD 2-Clause "Simplified" License

CentOS Docker failures: mirrorlist.centos.org no longer online #1905

atalman closed this issue 4 days ago

atalman commented 4 days ago

Failure: https://github.com/pytorch/builder/actions/runs/9716034788/job/26898991378#step:4:80

Docker build failing:

#11 [common  2/18] RUN yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm
#11 0.552 Loaded plugins: fastestmirror, ovl
#11 0.658 Determining fastest mirrors
#11 0.671 Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=container error was
#11 0.671 14: curl#6 - "Could not resolve host: mirrorlist.centos.org; Unknown error"
#11 0.673 
#11 0.673 
#11 0.673  One of the configured repositories failed (Unknown),
#11 0.673  and yum doesn't have enough cached data to continue. At this point the only
#11 0.673  safe thing yum can do is fail. There are a few ways to work "fix" this:
#11 0.673 
#11 0.673      1. Contact the upstream for the repository and get them to fix the problem.
#11 0.673 
#11 0.673      2. Reconfigure the baseurl/etc. for the repository, to point to a working
#11 0.673         upstream. This is most often useful if you are using a newer
#11 0.673         distribution release than is supported by the repository (and the
#11 0.673         packages for the previous distribution release still work).
#11 0.673 
#11 0.673      3. Run the command with the repository temporarily disabled
#11 0.673             yum --disablerepo=<repoid> ...
#11 0.673 
#11 0.673      4. Disable the repository permanently, so yum won't use it by default. Yum
#11 0.673         will then just ignore the repository until you permanently enable it
#11 0.673         again or use --enablerepo for temporary usage:
#11 0.673 
#11 0.673             yum-config-manager --disable <repoid>
#11 0.673         or
#11 0.673             subscription-manager repos --disable=<repoid>
#11 0.673 
#11 0.673      5. Configure the failing repository to be skipped, if it is unavailable.
#11 0.673         Note that yum will try to contact the repo. when it runs most commands,
#11 0.673         so will have to try and fail each time (and thus. yum will be be much
#11 0.673         slower). If it is a very temporary problem though, this is often a nice
#11 0.673         compromise:
#11 0.673 
#11 0.673             yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
#11 0.673 
#11 0.673 Cannot find a valid baseurl for repo: base/7/x86_64
#11 ERROR: process "/bin/sh -c yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm" did not complete successfully: exit code: 1

#12 [base 2/9] RUN yum install -y wget curl perl util-linux xz bzip2 git patch which perl zlib-devel
#12 0.333 Loaded plugins: fastestmirror, ovl
#12 0.653 Determining fastest mirrors
#12 CANCELED
------
 > [common  2/18] RUN yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm:
0.673 
0.673      5. Configure the failing repository to be skipped, if it is unavailable.
0.673         Note that yum will try to contact the repo. when it runs most commands,
0.673         so will have to try and fail each time (and thus. yum will be be much
0.673         slower). If it is a very temporary problem though, this is often a nice
0.673         compromise:
0.673 
0.673             yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
0.673 
0.673 Cannot find a valid baseurl for repo: base/7/x86_64
------
Dockerfile:86
--------------------
  85 |     ENV LANGUAGE en_US.UTF-8
  86 | >>> RUN yum install -y \
  87 | >>>         aclocal \
  88 | >>>         autoconf \
  89 | >>>         automake \
  90 | >>>         bison \
  91 | >>>         bzip2 \
  92 | >>>         curl \
  93 | >>>         diffutils \
  94 | >>>         file \
  95 | >>>         git \
  96 | >>>         make \
  97 | >>>         patch \
  98 | >>>         perl \
  99 | >>>         unzip \
 100 | >>>         util-linux \
 101 | >>>         wget \
 102 | >>>         which \
 103 | >>>         xz \
 104 | >>>         yasm
 105 |     RUN yum install -y \
--------------------
ERROR: failed to solve: process "/bin/sh -c yum install -y         aclocal         autoconf         automake         bison         bzip2         curl         diffutils         file         git         make         patch         perl         unzip         util-linux         wget         which         xz         yasm" did not complete successfully: exit code: 1
Error: Process completed with exit code 1.

See: https://serverfault.com/questions/1161816/mirrorlist-centos-org-no-longer-resolve

atalman commented 4 days ago

Resolved with https://github.com/pytorch/builder/pull/1904
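
For readers hitting the same error: the contents of the PR are not reproduced here, so the following is only a sketch of one widely used workaround, not necessarily what the PR does. It assumes the stock CentOS 7 repo files, where each section has an active `mirrorlist=` line and a commented-out `#baseurl=http://mirror.centos.org/...` line, and it assumes GNU `sed` and that vault.centos.org serves the archived packages at the same paths. The function name `point_repos_at_vault` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the PR): make yum stop querying the
# retired mirrorlist.centos.org and use the vault.centos.org archive instead.
# Assumes GNU sed (for -i) and the default CentOS 7 repo file layout.
point_repos_at_vault() {
    # Directory holding the *.repo files; defaults to the standard yum location.
    repo_dir="${1:-/etc/yum.repos.d}"
    for f in "$repo_dir"/*.repo; do
        # Skip the unexpanded glob when the directory has no .repo files.
        [ -f "$f" ] || continue
        sed -i \
            -e 's|^mirrorlist=|#mirrorlist=|' \
            -e 's|^#[[:space:]]*baseurl=http://mirror\.centos\.org|baseurl=http://vault.centos.org|' \
            "$f"
    done
}
```

In a Dockerfile the same two `sed` expressions would need to run in a `RUN` step before the first `yum install`, since that is the step that fails at Dockerfile:86 above.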