docker-library / official-images

Primary source of truth for the Docker "Official Images" program
https://hub.docker.com/u/library
Apache License 2.0

Promotion of Fedora base image built in our build system "as is" #527

Open vpavlin opened 9 years ago

vpavlin commented 9 years ago

Hi,

Problem: We build Fedora base images in our Koji build system [1]. They are not just a rootfs but complete (i.e. loadable) Docker images [2]. To get them into the Docker Hub registry as official images, we need to extract layer.tar, upload it to GitHub, and then use a Dockerfile with Automated Builds [3]. This means we lose all metadata added during the build: mainly the image ID at the moment, but in the future it could also be e.g. labels, if the PR gets merged.

It also requires a non-trivial amount of work (most of which could probably be automated, TBH) to get from "hey, there is a new image in Koji" to "hey, there is a new image in Docker Hub", which is mostly done by @lsm5 at the moment.
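The manual step described above (pulling layer.tar out of a loadable image) could be sketched roughly as follows. This is only an illustration, not the script Fedora actually uses: the function name `extract_layer_tars` is hypothetical, and it assumes the legacy `docker save` layout in which each layer lives in its own `<layer-id>/` directory containing a `layer.tar`.

```python
import io
import tarfile


def extract_layer_tars(image_tar_bytes):
    """Map layer ID -> raw bytes of that layer's layer.tar.

    Assumes the legacy `docker save` layout: <layer-id>/layer.tar.
    """
    layers = {}
    # "r:*" lets tarfile detect compression (e.g. xz) transparently.
    with tarfile.open(fileobj=io.BytesIO(image_tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            name = member.name[2:] if member.name.startswith("./") else member.name
            if member.isfile() and name.count("/") == 1 and name.endswith("/layer.tar"):
                layer_id = name.split("/")[0]
                layers[layer_id] = archive.extractfile(member).read()
    return layers
```

Note that everything else in the archive (the per-layer metadata and the image ID) is dropped by this step, which is exactly the loss of metadata described above.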

Proposed solution: One solution we thought about at DockerCon Europe is that we would push our image to a fedora/* repository and send you a link to the Koji build. You could then verify that the two match and promote the pushed image to the official images. Would this work? What challenges would you and we have to deal with?

Is there any other solution you could think of?

Thanks, Vašek

[1] http://koji.fedoraproject.org/koji/ [2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9038227 [3] https://github.com/fedora-cloud/docker-brew-fedora

tiborvass commented 9 years ago

Hmmm, @tianon ?

tianon commented 9 years ago

I'm definitely interested in improving this, especially with the future possibility of having signed artifacts stay signed all the way through.

One major problem I see with accepting these kinds of tarballs directly is that review gets much harder since we're then reviewing a discrete tarball instead of a source repository (which is more manageable when it's just the rootfs, since problems there are usually obvious). Problems like what we discussed on the changes in #497 would have a much higher potential for slipping through since they'd be baked inside said tarball, and then they'd be causing the issues we discussed there out in the wild instead of being caught earlier (which is the whole point of the review process).

vpavlin commented 9 years ago

@tianon I don't think this is an issue. We would definitely provide all the sources: the ks file is already available in Koji, and we could make the metadata available too if that's your requirement. I.e. the review process could stay, it would just happen after the build.

tianon commented 9 years ago

It's more the logistical problem of reviewing/verifying raw binary artifacts than much of anything else. Even if the source is available, we'd need a way to easily verify that the artifacts actually match the source (ie right now, we only review the source because we then build the artifacts from that source, so we have the implicit guarantee that the two match).

As the process stands right now, we can do 95+% of our review of most PRs from the GitHub UI directly, never touching actual artifacts or even the "git" command line except for the build test, which is all scripted. I'm trying to make sure we don't solve this in a way that only works in this one case, where we can be reasonably assured that you guys won't try anything nefarious -- in the future, this pattern may grow, and that assumption comes back to haunt us and make our review process heavier (or we just have to flat out tell folks that try "sorry, I can't trust you not to mislead us about what's in this image").

I'd reiterate that I do want to solve this -- I just want to make sure it's generalized.

vpavlin commented 9 years ago

I understand that you don't want to add manual work to the workflow. What I don't really understand is the statement

"It's more the logistical problem of reviewing/verifying raw binary artifacts than much of anything else."

AFAIK the Fedora base image is (from your POV) a Dockerfile + tar.gz of the rootfs. Do you review the rootfs? If yes, I'm pretty sure we could make it much easier for you with Koji. If not, I don't see a change here. IMO you already have to trust us that we don't provide you anything nefarious in the tarred rootfs.

Am I missing something here? We could add whatever you need to the PR to make it easier for you (Dockerfile, KS file, list of rpms in the image, hashes...)

Anyway, thanks for considering this:)

tianon commented 9 years ago

I mean reviewing the discrete layers of the image, not just the final product -- we don't review the rootfs directly, but we do ensure that it can run some basic scripts via our test suite, which is as much as we're worried about with a rootfs tarball (since in our experience, that's a good enough indicator that the rootfs is created correctly, and this does catch the majority of rootfs creation bugs).

So, even with the source available, something that would be important for us is an easy way to verify that the binary artifact we receive actually does come from the source code provided, and I don't see an easy way to do that without replicating your whole build process from Koji and hoping it's reproducible, right? Maybe @vbatts (since he's crazy enough to be thinking about this kind of stuff) has some fun ideas for extracting the basic Dockerfile instructions out of a "docker save" tarball and making sure that the layer data actually matches, which would help.

Again, right now, this is easy because we create the artifact directly from the "source" (the Dockerfile and metadata being our main concern), so this part is already verified.
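To make the verification problem concrete, here is a minimal sketch of the weakest form of such a check: comparing a received artifact against a checksum published alongside the source. The function names are hypothetical, and the existence of a published SHA-256 is an assumption, not something Koji is stated to provide in this thread.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary stream in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(stream, published_hex):
    """True if the stream's SHA-256 equals the (assumed) published hex digest."""
    return sha256_of_stream(stream) == published_hex.strip().lower()
```

A match only shows that the artifact is the one the builder published. It does not show that the artifact was built from the provided source, which is the guarantee the current build-from-source process gives implicitly.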

tianon commented 9 years ago

In talking about this more with @vbatts on IRC (now that the stale fedora image has managed to capture his attention and make him notice I poked him in this thread :smiling_imp:), I realize that for a base image like this, the important parts for us to verify are mostly the metadata on each layer, and that the number of layers stays reasonable (ie, adding the rootfs and maybe setting a default command to an interactive shell, and that environment variables are sane). These things are easy to verify by reading the repositories file and the individual .json files for each layer. I think I'll try mocking something up today to see how feasible this is.
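A rough sketch of that kind of metadata check, assuming the legacy `docker save` layout (a top-level `repositories` file plus one `<layer-id>/json` file per layer). The function names and the `max_layers` threshold are hypothetical and chosen only for illustration; this is not the mock-up mentioned above.

```python
import io
import json
import tarfile


def summarize_saved_image(image_tar_bytes):
    """Return (repositories, layers) read from a legacy `docker save` tarball."""
    repositories = {}
    layers = {}
    with tarfile.open(fileobj=io.BytesIO(image_tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            name = member.name[2:] if member.name.startswith("./") else member.name
            if name == "repositories":
                repositories = json.load(archive.extractfile(member))
            elif name.count("/") == 1 and name.endswith("/json"):
                meta = json.load(archive.extractfile(member))
                config = meta.get("config") or {}
                layers[name.split("/")[0]] = {
                    "parent": meta.get("parent"),
                    "comment": meta.get("comment"),
                    "cmd": config.get("Cmd"),
                    "env": config.get("Env"),
                }
    return repositories, layers


def review_problems(repositories, layers, max_layers=2):
    """List simple red flags; max_layers is an assumed, arbitrary limit."""
    problems = []
    if len(layers) > max_layers:
        problems.append(f"{len(layers)} layers, expected at most {max_layers}")
    tagged = {layer_id for tags in repositories.values() for layer_id in tags.values()}
    for layer_id in sorted(tagged - set(layers)):
        problems.append(f"tag points at missing layer {layer_id}")
    return problems
```

A reviewer could then read the `cmd` and `env` values for each layer by eye, much as a Dockerfile is reviewed today.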

(Meanwhile, @vbatts is working on how we can get an updated fedora image right now.)

vbatts commented 9 years ago

for the sake of this conversation, i've just pushed out a tool to do this inspection. Install it like go get github.com/vbatts/docker-utils/cmd/docker-save-dockerfile, read more info here https://github.com/vbatts/docker-utils#docker-save-dockerfile

tianon commented 9 years ago

Lol, using https://github.com/vbatts/docker-utils/pull/8:

root@f558daf3dfd2:/go/src/github.com/vbatts/docker-utils# curl -fsSL 'http://ftp.usf.edu/pub/fedora/linux/releases/test/22_Beta/Docker/x86_64/Fedora-Docker-Base-22_Beta-20150415.x86_64.tar.xz' | xz -d | docker-save-dockerfile
INFO[0000] using stdin ...                              
INFO[0045] Wrote {"Fedora-Docker-Base-22_Beta-20150415.x86_64" "latest" "cf2be2d9b10445e2a829b418d864a311e40ec37ee113bea16e3c2faeebde6392"} to "/tmp/docker-save-dockerfile.849914116/Dockerfile.cf2be2d9b10445e2a829b418d864a311e40ec37ee113bea16e3c2faeebde6392.044579475" 
root@f558daf3dfd2:/go/src/github.com/vbatts/docker-utils# cat /tmp/docker-save-dockerfile.849914116/Dockerfile.cf2be2d9b10445e2a829b418d864a311e40ec37ee113bea16e3c2faeebde6392.044579475
## RECREATED FROM IMAGE ON 2015-05-14T22:17:57Z
## Fedora-Docker-Base-22_Beta-20150415.x86_64:latest (cf2be2d9b10445e2a829b418d864a311e40ec37ee113bea16e3c2faeebde6392)

# Created: 2015-04-16 04:48:04 +0000 UTC; ID: cf2be2d9b10445e2a829b418d864a311e40ec37ee113bea16e3c2faeebde6392; Comment: "Created by Image Factory"
FROM cf2be2d9b10445e2a829b418d864a311e40ec37ee113bea16e3c2faeebde6392

vbatts commented 9 years ago

_very_ useful

Vogtinator commented 6 years ago

I'll try to resurrect this discussion now. This issue is the main reason the opensuse images in the library are no longer the primary source.

If you can provide a way to submit built images/layers to the library in a fully automatable way (preferably not GitHub PRs) it would be amazing and we would mirror opensuse/leap and maybe tumbleweed again here.

tianon commented 5 months ago

Way too many years later, I've finally got some good news to report! We have a new(ish?) oci-import builder type that is finally working well enough to be usable for this use case, if you're interested! Using it, you'll be able to ensure that the arch-specific manifest digests of fedora are identical to the images published elsewhere. :bow:

(for implementing, I think https://github.com/docker-library/bashbrew/pull/61 is probably the most helpful)

For reference, both ubuntu and busybox are now using this, if you want some concrete example implementations. :+1:
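As an illustration of what "identical arch-specific manifest digests" means in practice, here is a small sketch that lists the per-platform manifest digests in an OCI image layout directory and checks that each blob really hashes to its digest. This is not the bashbrew implementation; the function names are hypothetical, and it assumes a standard OCI layout on disk (`index.json` plus `blobs/sha256/<hex>`).

```python
import hashlib
import json
from pathlib import Path

# Media types that denote a nested index (multi-arch list) rather than a manifest.
INDEX_TYPES = {
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
}


def read_blob(layout_dir, digest):
    """Read a blob from the layout and verify its content against its digest."""
    algorithm, _, hex_value = digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    data = (Path(layout_dir) / "blobs" / algorithm / hex_value).read_bytes()
    if hashlib.sha256(data).hexdigest() != hex_value:
        raise ValueError(f"blob content does not match {digest}")
    return data


def platform_digests(layout_dir):
    """Map "os/arch[/variant]" -> manifest digest, following nested indexes."""
    index = json.loads((Path(layout_dir) / "index.json").read_text())
    found = {}
    pending = list(index.get("manifests", []))
    while pending:
        descriptor = pending.pop()
        data = read_blob(layout_dir, descriptor["digest"])
        if descriptor.get("mediaType") in INDEX_TYPES:
            pending.extend(json.loads(data).get("manifests", []))
            continue
        platform = descriptor.get("platform") or {}
        parts = (platform.get("os"), platform.get("architecture"), platform.get("variant"))
        key = "/".join(part for part in parts if part)
        found[key or "unknown"] = descriptor["digest"]
    return found
```

Running `platform_digests` on the layout published by the distribution and on the one consumed by the official image, then comparing the two dictionaries, would show whether the per-architecture digests are the same.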

I'm happy to help with whatever implementation details / discussion is necessary. :heart:

vbatts commented 5 months ago

lololol 💜💜💜 Vašek and I have both moved on from the roles we had when this came up. Maybe @cgwalters can direct it to whoever builds the Fedora images now?

tianon commented 5 months ago

Oh, my bad - I think that's @cverna and @siddharthvipul now :bow:

cverna commented 5 months ago

Cool, I'll take a look at this, thanks for the ping.