actions / runner-images

GitHub Actions runner images

Unable to connect to azure.archive.ubuntu.com flaky failure on ubuntu-latest #675

Closed traversaro closed 4 years ago

traversaro commented 4 years ago

Describe the bug: On some jobs (apparently non-deterministically), commands such as apt-get install fail with the following error:

Get:1 http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu bionic/main amd64 gdb amd64 8.2-0ubuntu1~18.04 [3024 kB]
Get:2 http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu bionic/main amd64 gdbserver amd64 8.2-0ubuntu1~18.04 [292 kB]
Get:3 http://ppa.launchpad.net/ondrej/php/ubuntu bionic/main amd64 libxml2-dev amd64 2.9.10+dfsg-2+ubuntu18.04.1+deb.sury.org+1 [821 kB]
Get:4 http://ppa.launchpad.net/ondrej/php/ubuntu bionic/main amd64 libxml2 amd64 2.9.10+dfsg-2+ubuntu18.04.1+deb.sury.org+1 [726 kB]
Get:5 http://ppa.launchpad.net/ondrej/php/ubuntu bionic/main amd64 libgraphicsmagick-q16-3 amd64 1.3.30+hg15796-1+ubuntu18.04.1+deb.sury.org+2 [1181 kB]
Get:6 http://ppa.launchpad.net/ondrej/php/ubuntu bionic/main amd64 libgraphicsmagick++-q16-12 amd64 1.3.30+hg15796-1+ubuntu18.04.1+deb.sury.org+2 [144 kB]
Err:7 http://azure.archive.ubuntu.com/ubuntu bionic/main amd64 libdouble-conversion1 amd64 2.0.1-4ubuntu1
  Could not connect to azure.archive.ubuntu.com:80 (52.177.174.250), connection timed out
Ign:8 http://azure.archive.ubuntu.com/ubuntu bionic-updates/main amd64 libqt5core5a amd64 5.9.5+dfsg-0ubuntu2.5
Ign:9 http://azure.archive.ubuntu.com/ubuntu bionic-updates/main amd64 libwayland-server0 amd64 1.16.0-1ubuntu1.1~18.04.3
Err:10 http://azure.archive.ubuntu.com/ubuntu bionic-updates/main amd64 libgbm1 amd64 19.2.8-0ubuntu0~18.04.3
  Unable to connect to azure.archive.ubuntu.com:http:
Err:11 http://azure.archive.ubuntu.com/ubuntu bionic-updates/main amd64 libxcb-xfixes0 amd64 1.13-2~ubuntu18.04
  Unable to connect to azure.archive.ubuntu.com:http:
Err:12 http://azure.archive.ubuntu.com/ubuntu bionic-updates/main amd64 libegl-mesa0 amd64 19.2.8-0ubuntu0~18.04.3
  Unable to connect to azure.archive.ubuntu.com:http:
Err:13 http://azure.archive.ubuntu.com/ubuntu bionic-updates/main amd64 libegl1 amd64 1.0.0-2ubuntu2.3

For an example of such a failure, check https://github.com/robotology/idyntree/pull/668/checks?check_run_id=560084383 . I would not be too surprised by CI jobs failing because of network problems, but this specific issue seems to be extremely frequent. I did not collect precise data on how frequently this happens, but my impression is that it happens in roughly 1 out of 10 builds.

This issue is already being discussed on the GitHub Community Forum at https://github.community/t5/GitHub-Actions/Install-dependencies-in-Ubuntu-flakes-several-times-a-wekk/td-p/51785 , but GitHub support suggested that I open an issue here as well.

Area for Triage:

Question, Bug, or Feature?: Bug

Virtual environments affected

Expected behavior: I would expect commands such as apt-get install to work correctly.

Actual behavior: It is not clear what is triggering this behavior, but sometimes apt-get install fails.

MSP-Greg commented 4 years ago

I added -o Acquire::Retries=3, and today it worked, then failed for a while, then started working again.
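A minimal sketch of how that retry setting might be made persistent instead of being passed on each command line. `Acquire::Retries` is the apt option quoted above; the helper name `apt_retries_conf` and the file name `80-retries` used in the note below are assumptions chosen for illustration.

```shell
# Print an apt configuration snippet that sets Acquire::Retries.
# The helper name is hypothetical; the default of 3 mirrors the value
# quoted in this thread.
apt_retries_conf() {
  printf 'Acquire::Retries "%s";\n' "${1:-3}"
}
```

It could be installed with something like `apt_retries_conf 3 | sudo tee /etc/apt/apt.conf.d/80-retries`, after which plain apt-get calls should pick it up. Retries only help with transient failures; they are unlikely to help while a mirror stays unreachable for minutes at a time.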

It may not be a GitHub issue, but they (and especially MSFT) have the resources to mirror azure.archive.ubuntu.com so Actions CI is reliable...

boegel commented 4 years ago

We've been seeing issues like this fairly frequently since Sunday March 29th...

boegel commented 4 years ago

Using -o Acquire::Retries=3 is not helping for us, see for example https://github.com/easybuilders/easybuild-easyblocks/pull/2012/checks?check_run_id=563883264

MCOfficer commented 4 years ago

I mentioned this on the github community post, but you can probably use apt-spy2 to find a working mirror before using apt.

MSP-Greg commented 4 years ago

@boegel

Sorry, I didn't mean to imply that using -o Acquire::Retries=3 was a fix. What I meant was that if one is using it and apt-get still times out, there is really a problem.
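A shell-level retry loop is a different technique from `Acquire::Retries`: it reruns the whole command rather than individual downloads. The sketch below is an illustration only; `retry_cmd` and its arguments are hypothetical names, not something proposed in this thread.

```shell
# Run a command up to MAX_ATTEMPTS times, sleeping DELAY seconds between tries.
# Usage: retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper, shown only to illustrate command-level retries.
retry_cmd() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  while :; do
    # Success: stop retrying.
    if "$@"; then
      return 0
    fi
    # Out of attempts: report and fail.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

A call might look like `retry_cmd 3 30 sudo apt-get update`. If a command still fails after several spaced-out attempts, that points to a real outage rather than a momentary glitch.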

boegel commented 4 years ago

> I mentioned this on the github community post, but you can probably use apt-spy2 to find a working mirror before using apt.

That adds a couple of minutes to each run, but it seems to work well to circumvent the problems, thanks a lot!

For others, see for example https://github.com/easybuilders/easybuild-easyconfigs/pull/10341

znarf commented 4 years ago

On our side, we have been getting 503 errors on almost all runs for the past few hours:

E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/universe/f/fonts-arphic-bkai00mp/fonts-arphic-bkai00mp_2.10-17_all.deb  503  Service Unavailable [IP: 52.177.174.250 80]

deivid-rodriguez commented 4 years ago

I added this to my workflow before apt-get install as a workaround:

sudo sed -i 's/azure\.//' /etc/apt/sources.list
sudo apt-get update
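
To see what the rewrite does before applying it, the same kind of substitution can be run on sample lines instead of the real file. This is a sketch: `preview_mirror_rewrite` is a hypothetical helper, and its pattern is deliberately narrower than the sed command above (it only matches the azure.archive.ubuntu.com host), which is an assumption about what the workaround is meant to change.

```shell
# Read sources.list-style lines on stdin and print them with the
# azure.archive.ubuntu.com host replaced by archive.ubuntu.com.
# Nothing is written to /etc/apt/sources.list.
preview_mirror_rewrite() {
  sed 's|//azure\.archive\.ubuntu\.com|//archive.ubuntu.com|'
}
```

For example, `preview_mirror_rewrite < /etc/apt/sources.list` prints the rewritten list without modifying the file; PPA entries such as ppa.launchpad.net pass through unchanged.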

paoloczi commented 4 years ago

Several workarounds have been proposed, but there are no comments about fixing this for real. This is a serious issue that makes GitHub Actions unusable for anybody working with Ubuntu, and it should be looked at with high priority.

alepauly commented 4 years ago

@traversaro (and everyone else affected) - We've engaged the mirror hosts and they have added capacity to improve its reliability. I'll keep this open until we can confirm the issue is resolved. Please let us know if this continues to occur. Thank you for the reports.

MSP-Greg commented 4 years ago

@alepauly

Thank you.

Last saw a problem at 23:15 GMT (3 hours ago), at which point I changed servers. Did the update/change happen after that?

EDIT: 02:25 GMT - reverted server change, and no timeout errors.

alepauly commented 4 years ago

@MSP-Greg, I don't have the exact time when the fixes were put in place yet, but I believe it was in the last two hours. Please let me know if you see more issues, thanks!

traversaro commented 4 years ago

> @traversaro (and everyone else affected) - We've engaged the mirror hosts and they have added capacity to improve its reliability. I'll keep this open until we can confirm the issue is resolved. Please let us know if this continues to occur. Thank you for the reports.

Thanks a lot @alepauly !

hloeung commented 4 years ago

@alepauly has something changed with GitHub's Actions on the 24th March?

alepauly commented 4 years ago

> @alepauly has something changed with GitHub's Actions on the 24th March?

@hloeung we make improvements constantly so, yes. If you're having a specific problem with the tools installed please file an issue in this repo so we can help.

pestophagous commented 4 years ago

Noting another workaround, since there are many interested parties on this thread:

https://github.com/ros2/rmw_cyclonedds/pull/134/files

I'm not the author of that, but I found it a few days ago when I saw this azure-related outage was happening to me. I took the same approach as the ros2/rmw_cyclonedds team, and I've been quite happy with it.

Based on search results I found on April 1 (before this present issue #675 had been created), this is a recurring problem: https://github.community/t5/GitHub-Actions/sudo-apt-install-fails-with-Unable-to-connect-to-azure-archive/td-p/32154

hloeung commented 4 years ago

@alepauly how can I get in touch with you? Would like to discuss something without having to file an issue.

boegel commented 4 years ago

We switched to using apt-spy2 a couple of days ago, which helped for a while, but today tests started failing again because of what looks like a faulty mirror picked by apt-spy2:

E: Failed to fetch http://mirrors.codec-cluster.org/ubuntu/pool/universe/l/lua-bit32/lua-bit32_5.3.0-3_amd64.deb  403  Forbidden [IP: 65.49.71.107 80]

So, we've disabled the use of apt-spy2 again, based on @alepauly's comment that the issues with azure.archive.ubuntu.com should be largely resolved now... (see https://github.com/easybuilders/easybuild-easyconfigs/pull/10374)

traversaro commented 4 years ago

It is not an extensive test, but personally I did not have any more azure.archive.ubuntu.com-related failures in the last few days, thanks @alepauly .

deivid-rodriguez commented 4 years ago

Neither did we, everything seems fine now :+1:.

miketimofeev commented 4 years ago

@traversaro @deivid-rodriguez thanks for the updates! We're going to close the issue, but feel free to open a new one if there are any similar issues in the future.

akhilerm commented 3 years ago

Hitting the same issue on Ubuntu Bionic when running apt-get install:

E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/main/s/systemd/libudev-dev_237-3ubuntu10.41_amd64.deb  404  Not Found [IP: 52.177.174.250 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

MCOfficer commented 3 years ago

Not the same issue - the mirror is reachable, but doesn't have the file you request. Did you run apt-get update?

akhilerm commented 3 years ago

@MCOfficer Was the package recently removed from azure.ubuntu.com ? Because it was working till yesterday.

Yep, it's working after running apt-get update.

MCOfficer commented 3 years ago

> @MCOfficer Was the package recently removed from azure.ubuntu.com ? Because it was working till yesterday.

When a package is updated, the old version may be removed from the mirror. If you get 404's, 99% of the time apt-get update fixes it by fetching the mirror's new index.
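
The two failure modes seen in this thread can be told apart from the apt error text. A rough sketch, assuming the message formats shown in the logs above; `classify_apt_error` and its output labels are hypothetical names used only for illustration.

```shell
# Classify one line of apt-get error output.
# Prints one of: stale-index, mirror-unavailable, unknown.
classify_apt_error() {
  case "$1" in
    # The mirror answered but no longer has that package version:
    # usually fixed by refreshing the index with apt-get update.
    *404*"Not Found"*)
      echo "stale-index" ;;
    # The mirror did not answer or refused service: the reliability
    # problem this issue was opened for.
    *503*"Service Unavailable"* | *"connection timed out"* | *"Unable to connect to"* | *"Could not connect to"*)
      echo "mirror-unavailable" ;;
    *)
      echo "unknown" ;;
  esac
}
```

Other responses, such as the 403 Forbidden returned by a third-party mirror earlier in the thread, fall through to `unknown` in this sketch.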

jessevdp commented 3 years ago

I can confirm that running apt-get update before any calls to apt-get install on GitHub Actions seems to resolve this issue. We started experiencing this problem just today.

username1565 commented 2 years ago

I see this issue, about azure-pipeline.

I see the following pathway: http://azure.archive.ubuntu.com/ubuntu/ubuntu/ /ubuntu/ubuntu/. Maybe you can fix this, on the server-side, or help to fix. Because I do not know how to contact admin of that server.

hloeung commented 2 years ago

> I see this issue, about azure-pipeline.
>
> I see the following pathway: http://azure.archive.ubuntu.com/ubuntu/ubuntu/ /ubuntu/ubuntu/. Maybe you can fix this, on the server-side, or help to fix. Because I do not know how to contact admin of that server.

What is the issue here? The /ubuntu symlink to . is there by design for those hosting community mirrors where the ubuntu mirror isn't exactly under / or /ubuntu. It's also the same on the main Ubuntu archive mirrors, e.g. http://archive.ubuntu.com/ubuntu/ubuntu/ubuntu/ubuntu/

XVilka commented 2 years ago

Still happens: https://github.com/rizinorg/rz-ghidra/runs/3398556746?check_suite_focus=true#step:6:110

Current runner version: '2.280.3'
Operating System
  Ubuntu
  20.04.2
  LTS
...
Run sudo apt-get install ninja-build libgraphviz-dev bison flex qt5-default
...
Fetched 18.6 MB in 0s (61.0 MB/s)
E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/main/m/mesa/libegl-mesa0_21.0.3-0ubuntu0.2~20.04.1_amd64.deb  404  Not Found [IP: 52.252.75.106 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

catthehacker commented 2 years ago

@XVilka you need to run apt-get update before apt-get install

Cireo commented 2 years ago

This is still happening often enough to notice, at this very moment, last week, in January, and late last year.

Stratus3D commented 2 years ago

Still happening - https://github.com/asdf-vm/asdf-erlang/pull/245

catthehacker commented 2 years ago

@Stratus3D you need to run apt-get update before apt-get install (it's noted here: https://docs.github.com/en/actions/using-github-hosted-runners/customizing-github-hosted-runners#installing-software-on-ubuntu-runners)

Can't believe I said the exact same thing last year: https://github.com/actions/virtual-environments/issues/675#issuecomment-903630257

hloeung commented 2 years ago

Maybe GitHub hosted runners should run or have "apt-get update" built into ubuntu-latest.

Cireo commented 2 years ago

@catthehacker it is true you need to run update before install, and some people may see similar symptoms by failing to do so, but this is a real bug that occurs with steps that are nothing more than

sudo apt update
sudo apt -y install <package>

the timing window can be exacerbated if the script has multiple apt install commands in a row, if some of the original installs take a minute or two.

catthehacker commented 2 years ago

> but this is a real bug that occurs with steps that are nothing more than

yes? If you have issues with the connection timing out, then it is the issue described here. My comment was for the person who thought they had the same issue. I have no idea what you mean or why I was pinged.

> the timing window can be exacerbated if the script has multiple apt install commands in a row

specify multiple packages in a single apt-get install command then
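
A minimal sketch of that advice, combined with the update-before-install rule repeated throughout this thread. `install_packages` is a hypothetical helper name; it assumes sudo and apt-get are available, as on the hosted Ubuntu runners.

```shell
# Refresh the package index, then install all requested packages in a
# single apt-get call so the index is as fresh as possible for every
# download.
# Usage: install_packages PACKAGE [PACKAGE...]
install_packages() {
  if [ "$#" -eq 0 ]; then
    echo "usage: install_packages PACKAGE [PACKAGE...]" >&2
    return 2
  fi
  sudo apt-get update && sudo apt-get install -y "$@"
}
```

A call such as `install_packages ninja-build bison flex` keeps the gap between the index refresh and the downloads short, but it does not remove it entirely, and it does not help when the mirror itself is unreachable.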

muralidar44 commented 1 year ago

I still face the same issue; sudo apt-get update itself doesn't work.

neilyoung commented 1 year ago

E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/main/g/gst-plugins-base1.0/libgstreamer-plugins-base1.0-0_1.16.3-0ubuntu1_amd64.deb  404  Not Found [IP: 20.106.104.242 80]
E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/main/g/gst-plugins-base1.0/gstreamer1.0-plugins-base_1.16.3-0ubuntu1_amd64.deb  404  Not Found [IP: 20.106.104.242 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Error: Process completed with exit code 100.

MCOfficer commented 1 year ago

> E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/main/g/gst-plugins-base1.0/libgstreamer-plugins-base1.0-0_1.16.3-0ubuntu1_amd64.deb  404  Not Found [IP: 20.106.104.242 80]
> E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/main/g/gst-plugins-base1.0/gstreamer1.0-plugins-base_1.16.3-0ubuntu1_amd64.deb  404  Not Found [IP: 20.106.104.242 80]
> E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
> Error: Process completed with exit code 100.

Looks like you too forgot to apt-get update, the correct version would've been 1.16.3-0ubuntu1.1.

neilyoung commented 1 year ago

Thanks. I added this right now to my workflow script:

    steps:
    - name: Update system (Ubuntu)
      if: matrix.os == 'ubuntu-20.04' || matrix.os == 'ubuntu-18.04'
      run: sudo apt update && sudo apt upgrade -y

Let's hope :)

neilyoung commented 1 year ago

@MCOfficer Yepp. That looks good. Thanks for the quick response

maciejlibraryx commented 1 year ago

happening now...

int-72h commented 1 year ago

I'm also running into this issue now - I don't know if it's a problem with my runner config or a github-wide problem. Earlier they'd just hang on the deps stage, now they fail in the same way as this issue describes.

Cireo commented 1 year ago

Still going on, 4 out of 4 failures since yesterday

frenkel commented 1 year ago

Same problem here in the last two days. About 4 out of every 5 builds fail for me due to this.

503 Service Unavailable [IP: 52.252.75.106 80]

douglas-raillard-arm commented 1 year ago

Same here, lots of jobs failed like this one: https://github.com/ARM-software/lisa/actions/runs/3884968922/jobs/6628229248

traversaro commented 1 year ago

@douglas-raillard-arm @frenkel @Cireo in my experience it is easier to track a problem that appeared again with a new issue, see https://github.com/actions/runner-images/issues/6894 for a new issue tracking this problem.

sehz commented 1 year ago

Happening again

sh-TU commented 1 year ago

Got the same problem.

sehz commented 1 year ago

I tried all the fixes mentioned here and in other sources. None of them worked, including going back to ubuntu-18.