docker-library / buildpack-deps

Automated builds intermittently failing on `apt-get install` due to apt cache issues #40

Closed callbacknull closed 6 years ago

callbacknull commented 8 years ago

Approximately one in every eight of my automated builds on Docker Hub fails while installing packages with apt-get install. The error I get, `Error reading from server. Remote end closed connection [IP: 128.31.0.66 80]`, makes it seem like either Docker Hub is dropping the connections to Debian's apt servers or the apt cache of the container is in a weird state.

Build logs: http://pastebin.com/FXxe6XH5

Source image: FROM node:4.2.4-slim

Run line:

RUN apt-get update && apt-get install -y python python-dev python-pip python-virtualenv && apt-get clean && rm -rf /var/lib/apt/lists/*

Yesterday I added a pre-install clean, making that run line:

RUN apt-get clean && apt-get update && apt-get install -y python python-dev python-pip python-virtualenv && apt-get clean && rm -rf /var/lib/apt/lists/*

I am at 20 builds without issue so far after adding that to my Dockerfile.

I checked the node Dockerfile; it doesn't install anything through apt-get, so that leaves the build packs. Is there any reason not to add apt-get clean to the end of each apt-get install? PR is coming.

tianon commented 8 years ago

GitHub's email -> comment system must be having a rough day, I sent this about 15 minutes ago: :disappointed:

https://github.com/docker/docker/blob/630a5a23c73276faefaedd0b639ce1525c2bdc24/contrib/mkimage/debootstrap#L82-L105

This should be happening automatically already. :confused:

tianon commented 8 years ago

If apt-get clean is now doing more than that block of code, we should update that block of code, not every image everywhere. :smile:

callbacknull commented 8 years ago

whoa... @tianon how does that run automatically? Is that wrapped around another apt-get command?

Updating that method with whatever new behavior apt-get clean has certainly would fix it... until the next time apt-get clean changes. Why not just use the default apt-get clean so you don't need to worry about updating a custom implementation? If it's working with docker, that will be more reliable than relying on a custom implementation.

tianon commented 8 years ago

That's run as a hook during internal bits of APT itself, and the reason we don't use apt-get clean directly is embedded directly in the comment in that file:

Ideally, these would just be invoking "apt-get clean", but in our testing, that ended up being cyclic and we got stuck on APT's lock, so we get this fun creation that's essentially just "apt-get clean".

(ie, invoking apt-get clean from within this hook is a no-go because APT is already holding its own lock)
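For readers wondering what "essentially just apt-get clean" amounts to, here is a minimal shell sketch. It is not the hook from the linked script. It assumes the cleanup boils down to deleting downloaded .deb archives and APT's binary package caches under /var/cache/apt, which is roughly what apt-get clean is documented to remove. The function name clean_apt_cache and its ROOT_DIR argument are hypothetical, added so the sketch can be tried against a scratch directory instead of a live system.

```shell
# Hypothetical helper, not the actual hook: removes the files that
# "apt-get clean" would normally remove, using plain rm.
# Usage: clean_apt_cache ROOT_DIR   (ROOT_DIR is "/" on a real system)
clean_apt_cache() {
    root="${1:?usage: clean_apt_cache ROOT_DIR}"
    # Plain rm never takes APT's lock, so it is safe to call while
    # apt-get or dpkg is still running and holding that lock.
    rm -f "$root"/var/cache/apt/archives/*.deb \
          "$root"/var/cache/apt/archives/partial/*.deb \
          "$root"/var/cache/apt/*.bin || true
}
```

The point of the design is the one tianon describes: the cleanup has to avoid calling apt-get itself, because the hook runs while APT is already holding its own lock.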

tianon commented 8 years ago

Hmm:

$ docker run -it --name test buildpack-deps:jessie apt-get clean
$ docker diff test
$ 

tianon commented 8 years ago

Even sane on sid still:

$ docker run -it --name test buildpack-deps:sid apt-get clean
$ docker diff test
C /var
C /var/cache
C /var/cache/apt
C /var/cache/apt/archives
C /var/lib
C /var/lib/apt
C /var/lib/apt/lists
A /var/lib/apt/lists/lock
C /var/lib/apt/lists/partial

tianon commented 8 years ago

$ docker run -it --name test buildpack-deps:trusty apt-get clean
$ docker diff test
$ 

tianon commented 8 years ago

$ docker run -it --name test buildpack-deps:xenial apt-get clean
$ docker diff test
C /var
C /var/cache
C /var/cache/apt
C /var/cache/apt/archives
C /var/lib
C /var/lib/apt
C /var/lib/apt/lists
C /var/lib/apt/lists/lock
C /var/lib/apt/lists/partial

tianon commented 8 years ago

I'm thinking this is something bizarre happening with the mirrors/rotation (which I've seen a lot of in the past).

callbacknull commented 8 years ago

Gotcha, that makes a lot of sense now. Do the Debian mirrors rotate fairly often? I've never noticed how frequently that happens.

What about this solution? It should run the apt clean automatically with an install. http://askubuntu.com/a/389738

tianon commented 8 years ago

Ooh, very interesting -- that warrants more testing! :+1:

If you visit http://httpredir.debian.org/demo.html it should show you which mirrors it's auto-selecting for you (which are usually fine, but occasionally a mirror has issues or gets a little out of sync, which causes things like Hash Sum mismatch).

callbacknull commented 8 years ago

ooo neato - though it's probably different for my docker builds since those happen on docker hub :P

What do you think about maybe just disabling the apt caching entirely?

Dir::Cache::srcpkgcache "";
Dir::Cache::pkgcache "";

I've gotten this to work on my local system - not so much luck with the DSELECT::Clean "always" yet

EDIT: Wait scratch that..... I see that line is in there already.... hmm

md5 commented 8 years ago

@callbacknull That seems like it would break existing images. There are many images out there that do RUN apt-get update and RUN apt-get install -y some-package in separate lines (or separate Dockerfiles).

callbacknull commented 8 years ago

@md5 take a look at the link in the first comment. What I posted is already being done by docker as a dpkg hook.

Are there really that many? I haven't run across any of those yet. It is advised against fairly strongly in docker's best practices docs. Is there an actual use case for separating them into two RUN commands?

md5 commented 8 years ago

@callbacknull I'm aware of that code in debootstrap, but I hadn't looked at it in a while. I see now that it already has the lines you were suggesting. That being said, it looks like even that functionality may break at some point: https://github.com/tianon/docker-brew-debian/issues/27

Also, I was simply confused regarding image breakage. For some reason I was thinking that the changes you suggested would make it necessary to run apt-get update before every apt-get install -y and that the effects of apt-get update would be limited to an image layer, but there's no reason that would be the case.

As for the number of images that have separate RUN lines for apt-get update and apt-get install -y ..., I've seen quite a few. While it isn't a best practice, the most common "use case" I've seen is where someone copies and pastes a set of software installation instructions and prepends each line with RUN.

callbacknull commented 8 years ago

Gotcha. Well, unfortunately having those cache lines still doesn't fix the intermittent build issue we're seeing.

Okay, yeah, that's kind of what I figured would happen. But those people are doing it wrong, and they'll probably break their builds if and when they add packages to their apt-get install. I think it'd almost be better if builds like that failed on the first attempt rather than failing later when a package is added. But I'm also just a big fan of failing fast.

Also, this just happened again in the continuous build for PR #42.

callbacknull commented 8 years ago

And it happened again

yosifkit commented 8 years ago

We've had similar failures on travis builds, but they usually go away when we restart the build. It is not a problem with the image but an intermittent network or apt mirror issue.
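Restarting a failed build by hand can also be approximated inside the build itself. The following is a rough sketch, assuming a plain POSIX shell; the retry function and its arguments are hypothetical and not part of any image discussed here.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# sleeping DELAY_SECONDS between failed attempts.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    # Keep running the command until it succeeds or attempts run out.
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s: $*" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

A call such as `retry 3 5 apt-get update` would try the update up to three times with a five second pause. Inside a Dockerfile the function has to be defined in the same RUN instruction that uses it. This only papers over transient mirror failures and does not address their cause. APT's own Acquire::Retries option may be worth testing as an alternative.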

callbacknull commented 8 years ago

@yosifkit I found that at least 1 in 8 of my builds was failing. Then I changed the first line of the installation to RUN apt-get clean && apt-get update && apt-get install -y, and now about 1 in ~24 builds fails. That's a pretty big impact for me.

An apt-get clean at the end does help mitigate this. When I had the PR open earlier, none of the builds failed. This may just be a non-issue for many people if they're doing their docker builds using a nice CI system that might auto-rebuild on failure. I'm stuck on Docker Hub for the moment, so having 1 in 8 builds fail was a bit of a headache. This very well could become a non-issue for me in the next month or so when I switch over to a nice CI service.

ConorMcGee commented 8 years ago

I'm getting this with every build on a project.

alandotcom commented 8 years ago

> I'm getting this with every build on a project.

I am seeing this occur on nearly every build, too.

bryce-admin commented 8 years ago

Have you tried this?

http://mmbash.de/blog/failed-fetch-with-docker-build-and-apt-get-update/

callbacknull commented 8 years ago

@mcgeeco @lumberj If you guys are looking for a hack in the meantime to reduce this issue, then prepend an apt-get clean to any RUN line that executes an apt-get install. I added this to my company's Dockerfiles around 4 months ago. Just taking a look at our builds now: over the last 3 months, our Node API container has had 93 successful builds with only 3 builds failing due to the apt caching issue.

@bryce-admin I have a feeling that wouldn't work. The cache that's the problem isn't the one that originates from the image you are building, but the one from the image we are all sourcing from (buildpack-deps). Also, if you use Docker Hub to perform builds for you, then you have zero control over the options that are passed to the docker build command.

alandotcom commented 8 years ago

@callbacknull I tried that and it doesn't help

alandotcom commented 8 years ago

@bryce-admin that worked

alandotcom commented 8 years ago

@bryce-admin actually nvm, just saw a failure

alandotcom commented 8 years ago

For what it's worth, it always fails on this same dependency

docker run ...

Step 1 : FROM node:4-slim
 ---> d1e5c6d53894
Step 2 : RUN apt-get clean &&   apt-get update &&   apt-get -y install   sqlite3   libsqlite3-dev   build-essential   python   git   libaio1   unzip &&   apt-get clean && rm -rf /var/lib/apt/lists/*
 ---> Running in ba89639fdb26
E: Failed to fetch http://httpredir.debian.org/debian/pool/main/r/rsync/rsync_3.1.1-3_amd64.deb  Error reading from server. Remote end closed connection [IP: 128.31.0.66 80]

callbacknull commented 8 years ago

@lumberj Interesting. Looking more closely at our failed builds, the package that apt fails on for us with this issue has always been a different package.

If you're interested, you can try out the APT run line that I use in our node dockerfile:

ENV DEPENDENCY_PACKAGES=""
ENV BUILD_PACKAGES="python python-dev python-pip python-virtualenv"
RUN apt-get clean && \
    apt-get update && \
    apt-get install -y $DEPENDENCY_PACKAGES $BUILD_PACKAGES && \
    npm install && \
    apt-get purge -y $BUILD_PACKAGES && \
    apt-get autoremove -y && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

This does the whole install build deps, npm install, remove build deps as one line. Just populate the two env variables with your appropriate packages.

tianon commented 8 years ago

I wonder if these failures are related to https://github.com/tianon/docker-brew-debian/issues/37 :disappointed:

alandotcom commented 8 years ago

@callbacknull I'll give that a shot

@tianon I wonder if specifying a mirror might help, e.g.,

 RUN sed -i  "s/http:\/\/httpredir\.debian\.org\/debian/foo/g" /etc/apt/sources.list
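A slightly more reusable form of that sed idea, as a sketch only: it assumes GNU sed (for `-i` without a suffix) and a sources file that still points at http://httpredir.debian.org/debian. The function name pin_mirror and its arguments are hypothetical, and whichever mirror URL is passed in is the caller's choice, not a recommendation.

```shell
# Hypothetical helper: replace the httpredir redirector with a fixed mirror.
# Usage: pin_mirror SOURCES_FILE MIRROR_URL
pin_mirror() {
    sources_file="$1"
    mirror_url="$2"
    # Using "|" as the sed delimiter avoids escaping the slashes in the URLs.
    # MIRROR_URL must not itself contain "|" or "&".
    sed -i "s|http://httpredir\.debian\.org/debian|${mirror_url}|g" "$sources_file"
}
```

In a Dockerfile this would be called as something like `pin_mirror /etc/apt/sources.list http://ftp.us.debian.org/debian` before apt-get update, with the mirror URL there being only an example.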

alandotcom commented 8 years ago

@callbacknull it didn't help. I'm going to try setting a specific mirror in sources.list

alandotcom commented 8 years ago

@callbacknull Testing this right now

RUN echo \
   'deb ftp://ftp.us.debian.org/debian/ jessie main\n \
    deb ftp://ftp.us.debian.org/debian/ jessie-updates main\n \
    deb http://security.debian.org jessie/updates main\n' \
    > /etc/apt/sources.list
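Because echo handles \n differently from shell to shell, printf may be a more predictable way to write the same three lines. This is a sketch under that assumption; write_sources_list and its TARGET_FILE argument are hypothetical, and the entries are copied unchanged from the snippet above.

```shell
# Hypothetical helper: write a fixed-mirror sources list, one entry per line.
# Usage: write_sources_list TARGET_FILE
write_sources_list() {
    target="$1"
    # printf '%s\n' prints each argument on its own line in any POSIX shell.
    printf '%s\n' \
        'deb ftp://ftp.us.debian.org/debian/ jessie main' \
        'deb ftp://ftp.us.debian.org/debian/ jessie-updates main' \
        'deb http://security.debian.org jessie/updates main' \
        > "$target"
}
```

Pointing TARGET_FILE at /etc/apt/sources.list would reproduce the approach being tested here, without leading spaces on the second and third entries.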

alandotcom commented 8 years ago

https://github.com/docker-library/buildpack-deps/issues/40#issuecomment-215565229 seemed to work

wglambert commented 6 years ago

Looks like the issue is resolved, and since it has been a few years since the last comment, I'm going to prune the issue.