Open ferrarimarco opened 5 years ago
I'm really keen to support this too - we've got several cases internally that'd benefit from this so we've wanted it for a while, too!
It wouldn't be super hard to implement, but it's not a tiny project either. We'd need to parse shell expressions so we could handle things like RUN apt-get update && apt-get install -y foo=1.0.0 (far more complicated examples exist too...!), and we'd need to integrate with the various package registries, ideally detecting the distro and release by looking at the base image (recursively).
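To give a feel for the parsing step, here is a deliberately naive sketch. The function name extract_apt_pins is made up for illustration and is not part of dependabot-core. It only handles pins that sit on the same line as apt-get install, and it ignores line continuations, variables, and quoting, which is exactly where the "far more complicated examples" come in.

```shell
# Hypothetical helper (not part of dependabot-core): print every
# "package=version" pin found on lines that invoke "apt-get install".
# Reads Dockerfile text on stdin.
# Naive by design: no line continuations, no variables, no quoting rules.
extract_apt_pins() {
    grep -E 'apt-get[[:space:]].*install' \
        | grep -oE '[a-z0-9][a-z0-9+.-]+=[^[:space:];&|\\]+'
}
```

Usage would look like extract_apt_pins < Dockerfile, giving one package=version pair per line.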
Unfortunately we won't have capacity to implement this in the near future. If you're really keen for support, we would accept a PR though :-)
The repo is this one, right? https://github.com/dependabot/dependabot-core
Yep!
This would be very helpful!
Would it be easier to code the update if we used files like pip and gem do? For example an apk.txt file. Then use something like xargs to run apk add. That way they don't have to figure out how to parse the packages out of the Dockerfile.
I suppose the parser implementation complexity would be the same, but you'll have the overhead of having to load that file somehow when you build the docker image (since you have to install those packages, don't you?)
We ended up making a quick Python script to roll the pins: https://gist.github.com/mst-ableton/d0b80692571718fcb0a8f3984add9c03. As it uses Python it's not easily upstreamable, but the idea is to run apt-get update inside the container and parse the output of apt-get upgrade -s to see what it would have upgraded to. Because it's doing two docker builds, it may take a while to run. Hope this effort can jumpstart a Dependabot-native implementation in the future.
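The parsing half of that idea can be sketched in a few lines of shell. This is not the linked gist, just an illustration under the assumption that upgradable packages show up in the simulated output as lines of the form Inst NAME [CURRENT] (CANDIDATE ORIGIN [ARCH]). The function name is hypothetical.

```shell
# Hypothetical helper: turn simulated-upgrade output into "package=version"
# pins. Expects the output of `apt-get upgrade -s` on stdin, assuming that
# upgradable packages appear as:
#   Inst <name> [<current>] (<candidate> <origin> [<arch>])
simulated_upgrade_pins() {
    awk '/^Inst / {
        candidate = $0
        sub(/^[^(]*\(/, "", candidate)   # drop everything up to the "("
        sub(/ .*$/, "", candidate)       # keep only the first word after it
        print $2 "=" candidate
    }'
}
```

Inside the container this would be run as apt-get update followed by apt-get upgrade -s | simulated_upgrade_pins.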
Been bashing my head against the wall with this one for https://github.com/ironPeakServices/iron-redis. On one hand you want to pin your package versions, but on the other you can't keep maintaining the package versions manually whenever there is a security fix.
I've taken a look at what would be involved to make this a reality - I almost started a standalone project to do it, but having it part of Dependabot feels more appropriate, plus there's a better code structure already.
Questions/thoughts for any dependabot-core maintainers (@feelepxyz @jurre @greysteil ?):
1. I can see value in reusing the Docker FileFetcher, but having a separate package_manager used for different base OS'es, a la:
docker_alpine
docker_ubuntu
docker_centos
How would you prefer the file hierarchy to look here? Extra top level directories for each package_manager respectively? Or subdirectories within top level docker? Maybe docker/lib/dependabot/docker_(alpine|ubuntu|centos)?
2. There might be some potential for shared/reusable logic in the various FileParser's and FileUpdater's, maybe even a single shared FileParser/FileUpdater, TBD. I think each UpdateChecker will likely be unique, to talk to the different package repositories respectively for each OS. Things like Ubuntu PPA's and the equivalents for other OS'es will be interesting to deal with also... as these cannot be 100% inferred from the contents of the dependency file (Dockerfile) only?
3. I can see a potential need to actually "run" the Docker image with a command to trawl/read the likes of:
/etc/os-release
/etc/apt*
/etc/yum*
for "what package repositories need to be poked" by the UpdateChecker, is there a facility available to do that?
On one hand you want to pin your package versions, but on the other you can't keep maintaining the package versions manually whenever there is a security fix.
@hazcod I think the core principle here, from my standpoint, is for a Dockerfile to produce a deterministic image output. It's tricky, but hard versioning at the OS package level goes a long way towards that working (with the exception of a Linux distribution pulling the rug out from under you and 404'ing the repo URLs for a specific OS release).
3. I can see a potential need to actually "run" the Docker image with a command to trawl/read the likes of:
/etc/os-release
/etc/apt*
/etc/yum*
for "what package repositories need to be poked" by the UpdateChecker, is there a facility available to do that?
https://github.com/dependabot/dependabot-core#setup
To run all of Dependabot Core, you'll need Ruby, Python, PHP, Elixir, Node, Go, Elm, and Rust installed.
No current provision to have Docker installed or accessible as part of the list of helpers specified?
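If Docker were available to the updater, the /etc/os-release part could be as small as the sketch below. Everything here is an assumption for illustration: the function name is made up, and the mapping from the ID field to a package manager family only covers the distros mentioned in this thread.

```shell
# Hypothetical helper: map the contents of /etc/os-release (on stdin) to the
# package manager family an UpdateChecker would need to talk to.
# Assumption: the ID= field is present; values may be quoted.
detect_package_manager() {
    os_id=$(sed -n 's/^ID=//p' | tr -d '"')
    case "$os_id" in
        debian|ubuntu)      echo apt ;;
        alpine)             echo apk ;;
        centos|rhel|fedora) echo yum ;;
        *)                  echo unknown ;;
    esac
}
```

One way to feed it, assuming the image ships cat, would be docker run --rm --entrypoint cat IMAGE /etc/os-release | detect_package_manager.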
I don't maintain Dependabot anymore, but you're in safe hands with @feelepxyz and @jurre. I know they've been swamped in the last few weeks, though, and may be taking some well deserved time off over Christmas.
Appreciate you looking into this @CpuID. I just want to preface this with a note that I'm not sure we will be able to review, merge and support such a contribution in a timely manner right now.
We've paused accepting new ecosystems, and this patch might be of similar proportions.
Having said that, I'll try to answer some of your questions:
how would you prefer the file hierarchy to look here? extra top level directories for each package_manager respectively? or subdirectories within top level docker? maybe docker/lib/dependabot/docker_(alpine|ubuntu|centos)
I imagine that the implementations will be relatively similar, and it feels like it should be part of the docker package_manager.
What I imagine right now (without much context on this, so I may very well be wrong):
docker package_manager
docker/lib/dependabot/docker/update_checkers/alpine_update_checker.rb, docker/lib/dependabot/docker/update_checkers/ubuntu_update_checker.rb etc. This will have to be integrated in the main UpdateChecker
FileUpdater and subsequent steps, as long as it's aware of how to update an OS package and handle the shell parsing etc. It might make sense to pull this out into its own class that we call from the existing FileUpdater.
It's hard to say what it should look like exactly without doing some more investigation though, and I would definitely re-evaluate once we have a better idea of how many parts of the codebase we can reuse and how much we end up having to change.
@jurre thanks for the response :)
I think your suggestion for using docker/lib/dependabot/docker/update_checkers/alpine_update_checker.rb etc makes sense, I'm happy with that filename hierarchy (depending on which class is sharded out respectively, TBD based on findings during implementation). Eg. could be docker/lib/dependabot/docker/file_parsers/alpine_file_parser.rb.
I'll see if I get free cycles to put something together, and see how far I get.
We aim to provide the best user experience possible for each of these, but we have found we've lacked the capacity – and in some cases the in-house expertise – to support new ecosystems in the last year.
@jurre hiring? :)
We are! https://boards.greenhouse.io/github/jobs/2383025 https://boards.greenhouse.io/github/jobs/2384868
If dependencies were stored in a JSON file similar to package.json, jq and xargs could be used to generate the install command and update the versions:
apt.json
{
  "nginx": "1.18.0-0ubuntu1",
  "openssl": "1.1.1f-1ubuntu2.4",
  "ca-certificates": "20210119~20.04.1"
}
Run in Dockerfile:
jq -r 'to_entries | .[] | .key + "=" + .value' apt.json | xargs apt-get install -y
An action can read the version to update the JSON:
apt-cache policy nginx | grep -oP '(?<=Candidate:\s)(.+)'
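As a side note, the lookbehind above needs a grep built with -P support, which may not be the case everywhere. A sketch of the same extraction with plain awk follows. The function name is made up, and it assumes the usual apt-cache policy layout with a "Candidate:" line.

```shell
# Hypothetical helper: print the candidate version from `apt-cache policy`
# output on stdin, without relying on grep -P.
# Assumption: the output contains a line of the form "  Candidate: <version>".
candidate_version() {
    awk '$1 == "Candidate:" { print $2 }'
}
```

Usage: apt-cache policy nginx | candidate_version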
Here's a working example.
A script updates the latest version of packages in the JSON file: https://github.com/wildpeaks/docker-nginx/blob/main/docker/update_dependencies.sh
#!/bin/bash
# Refresh every pin in dependencies.json to the current APT candidate version.
JSON=$( cat dependencies.json )
for PACKAGE in $( echo "$JSON" | jq -r 'keys | .[]' ); do
    # The candidate is the version apt would install right now
    VERSION=$( apt-cache policy "$PACKAGE" | grep -oP '(?<=Candidate:\s)(.+)' )
    JSON=$( echo "$JSON" | jq --arg package "$PACKAGE" --arg version "$VERSION" '.[$package] = $version' )
done
echo "$JSON" | python -m json.tool > dependencies.json
A cron Action runs the update script and creates a matching pull request: https://github.com/wildpeaks/docker-nginx/blob/main/.github/workflows/dependencies.yml
# ...
- name: Update dependencies
  working-directory: docker
  run: |
    sudo apt-get update
    sh update_dependencies.sh
- name: Create PR
  uses: peter-evans/create-pull-request@v3
  with:
    commit-message: "chore(deps): update dependencies.json"
    branch: features/update-dependencies
    title: Update APT packages
    body: Updated dependencies.json
    delete-branch: true
And the Dockerfile uses the JSON file to install pinned versions: https://github.com/wildpeaks/docker-nginx/blob/main/docker/Dockerfile#L7
# ...
COPY dependencies.json /tmp/dependencies.json
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
    && apt-get install -y --no-install-recommends jq \
    && jq -r 'to_entries | .[] | .key + "=" + .value' /tmp/dependencies.json | xargs apt-get install -y --no-install-recommends \
    && rm /tmp/dependencies.json
# ...
Hello! I was wondering if there was any ongoing effort or plan to get this implemented. This feature would be a huge help!
This would be useful for me. I had an issue where the Ubuntu repositories gave me a very old version of a package. I've started pinning my package versions, but now I have increased maintenance overhead.
That works fantastically! Thank you!
Did you ever figure out how to get it to work on non-Ubuntu? E.g. Alpine Docker builds? I saw that you have the repo docker-browser-sync without the dependency updates action.
That works fantastically! Thank you!
Glad it helps :)
Did you ever figure out how to get it to work on non-Ubuntu? E.g. Alpine Docker builds? I saw that you have the repo docker-browser-sync without the dependency updates action.
The browser-sync one didn't need it because it's an npm dependency (so Dependabot is the one updating the JSON file).
As for Alpine, sorry, I never tried, but the main challenge would be to find an Alpine equivalent of the apt-cache policy command, whereas the rest should be similar (afaik jq is also available on Alpine).
Actually there is a similar command to apt-cache policy in Alpine.
It's possible to list the upgradable packages, with new and current versions, with apk -u list.
Update indexes first:
# apk update
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/community/x86_64/APKINDEX.tar.gz
v3.14.0-160-g18a21f8aa5 [http://dl-cdn.alpinelinux.org/alpine/v3.14/main]
v3.14.0-165-g01e8bc9b28 [http://dl-cdn.alpinelinux.org/alpine/v3.14/community]
OK: 15009 distinct packages available
Then we can list only upgradable packages:
# apk -u list
rsync-3.2.3-r4 x86_64 {rsync} (GPL-3.0-or-later) [upgradable from: rsync-3.2.3-r2]
rsync-doc-3.2.3-r4 x86_64 {rsync} (GPL-3.0-or-later) [upgradable from: rsync-doc-3.2.3-r2]
rsync-openrc-3.2.3-r4 x86_64 {rsync} (GPL-3.0-or-later) [upgradable from: rsync-openrc-3.2.3-r2]
krb5-libs-1.18.4-r0 x86_64 {krb5} (MIT) [upgradable from: krb5-libs-1.18.3-r1]
libcurl-7.78.0-r0 x86_64 {curl} (MIT) [upgradable from: libcurl-7.77.0-r1]
apk-tools-doc-2.12.6-r0 x86_64 {apk-tools} (GPL-2.0-only) [upgradable from: apk-tools-doc-2.12.5-r1]
apk-tools-2.12.6-r0 x86_64 {apk-tools} (GPL-2.0-only) [upgradable from: apk-tools-2.12.5-r1]
linux-virt-5.10.43-r0 x86_64 {linux-lts} (GPL-2.0) [upgradable from: linux-virt-5.4.84-r0]
curl-7.78.0-r0 x86_64 {curl} (MIT) [upgradable from: curl-7.77.0-r1]
curl-doc-7.78.0-r0 x86_64 {curl} (MIT) [upgradable from: curl-doc-7.77.0-r1]
I've never used jq, so a bit of help would be greatly appreciated :stuck_out_tongue_winking_eye:
The convenient thing with apt-cache policy is that it provides the version number without having to install the outdated packages first (unlike a list of upgradable packages).
I think this would be a closer equivalent: apk info "PACKAGENAME" | head -1 | cut -d ' ' -f 1
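Either way, what comes back is a single name-version string such as curl-7.78.0-r0, which still has to be split before it can be used as a pin for apk add. A rough sketch of that split is below. The function name is made up, and it assumes versions always start with a digit and end in an -rN release suffix, as in the apk -u list output above.

```shell
# Hypothetical helper: convert apk-style "name-version-rN" strings (first
# field of each stdin line) into "name=version-rN" pins for `apk add`.
# Assumption: the version starts with a digit and ends with "-r<number>".
apk_pins() {
    awk '{ print $1 }' \
        | sed -n 's/^\(.*\)-\([0-9][^-]*-r[0-9][0-9]*\)$/\1=\2/p'
}
```

Usage: apk -u list | apk_pins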
Any news?
This sounds great in theory, but for the vast majority of use cases it's probably a false hope. Why? Debian repos only maintain the latest version of a given package. Unless you are hosting your own package repo, you aren't going to be able to install arbitrary package versions. So the idea of committing a "dependencies.json" file to version control is essentially impossible, at least in the context of building Docker images.
The only exceptions I see are if you host your own package repo or rely on very careful Docker caching to retain an old "pinned" version of a package.
Am I missing something?
The repos can contain a single version sometimes, but not always. You seem to be correct, at least for debian:latest and ubuntu:latest, but a quick check shows that this is not always the case.
debian:10 image
root@d71a1f0c3573:/# apt-cache madison systemd
systemd | 241-7~deb10u8 | http://deb.debian.org/debian buster/main amd64 Packages
systemd | 241-7~deb10u8 | http://security.debian.org/debian-security buster/updates/main amd64 Packages
ubuntu:20.04 image
root@edf9514d8882:/# apt-cache madison systemd
systemd | 245.4-4ubuntu3.16 | http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
systemd | 245.4-4ubuntu3.15 | http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
systemd | 245.4-4ubuntu3 | http://archive.ubuntu.com/ubuntu focal/main amd64 Packages
This is just an example. I have observed the Ubuntu repos giving me a very old package version temporarily, causing one of my images to fail integration tests due to the incompatible package. This has happened only once. I solved this by pinning the version in my Dockerfile; however, maintaining the Dockerfile becomes difficult. The trade-off is worth it for me, but maybe not for everyone.
Additionally, some Dockerfile linters will push you to pin package versions. If you allow the repos to decide which package version you are using, you do lose some control of your image's end state.
I agree that it is generally not an issue, but it can be for some builds.
I am not happy with my solution, but it does work. It is based on some of the feedback in this thread. Basically, I use a "base image" that has the pinned packages that I depend on. This way, I end up building the base image infrequently while building my final image frequently. Dependabot would be a great addition to this workflow, preventing my base image from going stale.
Lastly, it's possible that excellent integration testing of the final image would allow us to always use the latest base image with the latest packages, without relying on Dependabot to handle things. It just depends on how folks wish to do things.
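For anyone who wants to check whether a pin is still installable before a build, the apt-cache madison output shown above is easy to scan. A small sketch, with a made-up function name, assuming the pipe-separated "package | version | source" layout from those examples:

```shell
# Hypothetical helper: succeed if the wanted version ($1) appears in
# `apt-cache madison <package>` output supplied on stdin.
# Assumption: lines look like "name | version | source", as shown above.
pin_available() {
    awk -F'|' -v want="$1" '
        { version = $2; gsub(/[ \t]/, "", version) }
        version == want { found = 1 }
        END { exit !found }
    '
}
```

Usage: apt-cache madison systemd | pin_available 245.4-4ubuntu3 && echo "pin still installable"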
Thanks, that essentially confirms my intuition. Even in the examples you provided, the different package versions are due to them being in different repos, but each repo only contains a single version.
In an ideal world, I think we'd all pin our Apt package versions, but that seems incompatible with the Debian / apt ecosystem which focuses more on preserving backwards compatibility and thus (theoretically) makes pinning unnecessary.
I'm sure there are other use cases for this feature, this is just mine 😄
This sounds great in theory, but for the vast majority of use cases it's probably a false hope. Why? Debian repos only maintain the latest version of a given package. Unless you are hosting your own package repo, you aren't going to be able to install arbitrary package versions. So the idea of committing a "dependencies.json" file to version control is essentially impossible, at least in the context of building Docker images.
The only exceptions I see are if you host your own package repo or rely on very careful Docker caching to retain an old "pinned" version of a package.
Am I missing something?
I think you are approaching the problem from the wrong end.
Of course there are applications whose maintainers use version pinning to build software based on legacy dependencies. I don't think such people find much value in using Dependabot: they know which version of each dependency they use and they know they can hardly upgrade without breaking everything. Dependabot targets software maintainers who want to efficiently keep their software up to date with the latest security fixes.
Take this Dockerfile as an example:
FROM debian:11.4-slim as minifier
RUN apt-get install --yes --no-install-recommends minify=2.7.2-1+b6
Imagine the minify developers make a security fix and distribute a newer 2.7.3 version that makes its way into the Debian repository. I don't get a Dependabot notification regarding the deprecation of 2.7.2. Either I handle the upgrade manually, which is insane when you consider the number of projects times the number of dependencies I have to monitor, or I use some latest-like constraint and rebuild the software periodically. This is so inefficient: most of the builds will result in no change compared to the previous build, and you can expect, on average, half a period between the new version being available and deployed.
Thanks to Dependabot, whenever Debian publishes the next version of their base image, I'll get a notification prompting me to upgrade my base image to debian:11.5-slim. This allows me to immediately build a new image of my software, based on that new image, without wasting computing resources to rebuild my image daily / weekly for nothing.
I wish I had the same feature for my apt packages.
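The edit such a notification would propose is small. As a sketch only: the function name is invented, and the 2.7.3-1 version string used in the usage line is hypothetical, since the exact package revision is not given above. It rewrites one package=version pin in Dockerfile text and leaves everything else alone.

```shell
# Hypothetical helper: rewrite the pin for package $1 to version $2 in
# Dockerfile text read from stdin. Only whole-word "package=" matches are
# touched, so "libminify=..." is left alone when bumping "minify".
bump_pin() {
    new=$2
    # Escape ERE metacharacters that are legal in package names ("." and "+").
    pkg_re=$(printf '%s' "$1" | sed 's/[.+]/\\&/g')
    sed -E "s/(^|[[:space:]])(${pkg_re})=[^[:space:]]+/\1\2=${new}/g"
}
```

Usage: bump_pin minify 2.7.3-1 < Dockerfile > Dockerfile.new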
Thank you for so succinctly describing my precise use case, @ArwynFr. Yep, this is exactly how I want Dependabot to work.
Thanks to Dependabot, whenever Debian publishes the next version of their base image, I'll get a notification prompting me to upgrade my base image to debian:11.5-slim. This allows me to immediately build a new image of my software, based on that new image, without wasting computing resources to rebuild my image daily / weekly for nothing.
I wish I had the same feature for my apt packages.
It also helps us keep track of when specific packages were updated, helping troubleshooting be far faster.
While your docker support is great to keep the FROM directive updated, it could be enhanced by including support for the OS package managers (like APT for Debian and derivatives, APK for Alpine...).
With this addition we could completely rely on Dependabot to keep our Docker images updated, instead of having to keep that manually updated.
Example 1 (APT on Ubuntu):
Example 2 (APK on Alpine):
apk add packagename=1.2.3-suffix
We could even get fancy and support version constraints, like >=.