mozilla-services / Dockerflow

Cloud Services Dockerflow specification
Apache License 2.0

/app/version.json in reproducible docker images #37

Closed garbas closed 4 years ago

garbas commented 6 years ago

In mozilla-releng/services we are trying to become "Dockerflow compliant" (https://github.com/mozilla-releng/services/pull/1183), whether we run services on Docker Hub or elsewhere. Standards are good! :) We are not there yet, but we are getting there.

Our Docker images are built reproducibly, and in the future we would like to continuously check for binary reproducibility.

The requirement for a JSON version object breaks the promise of reproducibility, since you don't know at build time if the artifact is going to be released.

Sure, we can rebuild the Docker image just before pushing to Docker Hub, but then it is not really the same artifact that we tested in the staging environment.

I'm opening this ticket to get some advice on how to proceed. In the worst case we can rebuild the Docker image before pushing, but maybe there is a way to keep this JSON version object outside the Docker image, as metadata of the running container.

Thank you in advance, Rok.

mostlygeek commented 6 years ago

Dockerflow does not have requirements for reproducibility. It is a standard for packaging software in a consistent way so we can reuse deployment tooling and pipelines.

milescrabill commented 6 years ago

I agree that we should strive for reproducible builds and binary reproducibility.

The requirement for a JSON version object breaks the promise of reproducibility, since you don't know at build time if the artifact is going to be released.

I'm confused about what you mean here. At build time, you can determine whether the build was for a git tag (version) or not. If you build in CI, you can take a value for build from the environment or construct one using a canonical URI format. You can use the built images wherever you please.
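
One way to check for a tag at build time, sketched here in Python, is to ask git whether HEAD sits exactly on a tag. The helper name `exact_tag` is hypothetical and not part of any Dockerflow tooling; it only wraps `git describe --tags --exact-match`.

```python
import subprocess


def exact_tag(repo_dir="."):
    """Return the tag that HEAD points at exactly, or None if there is none."""
    try:
        result = subprocess.run(
            ["git", "describe", "--tags", "--exact-match", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # git is not installed on this machine.
        return None
    # git exits non-zero when HEAD is untagged or repo_dir is not a repository.
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

A build script could call `exact_tag()` and use the result for the version field, falling back to some other value when it returns `None`.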

My impression is that source, version, and commit should not be a problem - these are all properties of the code repository at build time. build is slightly problematic because it implies that each build should have a different build value specified in version.json.
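
A minimal sketch of that distinction, assuming a CI system exposes these values as environment variables. The variable names `SOURCE_URL`, `GIT_TAG`, `GIT_COMMIT` and `BUILD_URL` and both function names are hypothetical; only the four field names (source, version, commit, build) come from this thread.

```python
import json
import os


def build_version_info(env):
    """Collect the four version.json fields from a CI-style environment mapping."""
    return {
        # Properties of the code repository at build time.
        "source": env.get("SOURCE_URL", ""),
        "version": env.get("GIT_TAG", ""),
        "commit": env.get("GIT_COMMIT", ""),
        # Identifies one CI run, so it changes on every rebuild.
        "build": env.get("BUILD_URL", ""),
    }


def render_version_json(info):
    """Serialize with sorted keys so identical inputs give identical bytes."""
    return json.dumps(info, sort_keys=True, indent=2) + "\n"


if __name__ == "__main__":
    print(render_version_json(build_version_info(os.environ)), end="")
```

With this layout, two builds of the same commit produce files that differ only in the build field, which is the part that conflicts with byte-for-byte reproducibility.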

Cloudops has tooling that uses version.json to verify that an image was built from a trusted source, i.e. a CircleCI build for the project, which publishes build artifacts that can be verified at deploy time. We use build for that purpose.

What about having a version.json in your docker images is stopping you from using the same images in staging and production?

garbas commented 6 years ago

@mostlygeek I kinda understood that as well, since there is no mention of reproducibility anywhere. But I also understand that nothing is set in stone and that the only certain thing is change :)

@milescrabill in our case source, version and commit are not properties of the build. A simple example: if I change README.rst (or any other file that is not part of the deployment), the commit changes and a rebuild is needed, although our deployment artifact didn't change, just the "metadata" (I hope you don't mind me calling version.json metadata). Another example is having multiple projects in one GitHub repository, like we have in https://github.com/mozilla-releng/services.
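
The README example can be sketched as a content hash over only the files that are deployed. This is an illustration under simplifying assumptions, not how Nix or Docker actually derive their hashes; the function name `artifact_digest` and the file names are hypothetical.

```python
import hashlib


def artifact_digest(files, deployed_paths):
    """Hash only the files that end up in the deployed artifact.

    `files` maps a repository path to its bytes; `deployed_paths` lists the
    paths that are part of the deployment.
    """
    digest = hashlib.sha256()
    for path in sorted(deployed_paths):
        content = files[path]
        # Length-prefix each part so the hashed byte stream is unambiguous.
        for part in (path.encode("utf-8"), content):
            digest.update(len(part).to_bytes(8, "big"))
            digest.update(part)
    return digest.hexdigest()


before = {"app/main.py": b"print('hi')\n", "README.rst": b"Old docs\n"}
after = {"app/main.py": b"print('hi')\n", "README.rst": b"New docs\n"}
deployed = ["app/main.py"]

# The README edit creates a new commit, but the deployed content hashes the same.
print(artifact_digest(before, deployed) == artifact_digest(after, deployed))  # prints True
```

Once a commit value is written into a file inside the artifact, the two digests diverge even though no deployed code changed.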

I see the value of being able to verify where the source is coming from (it is of huge importance, and we should not drop or remove this feature). What I'm asking you to reconsider, and would like advice on, is the location of the metadata (version.json). While I agree this metadata is needed, I don't think it should be part of the deployment artifact. It should be possible to look up the metadata for a given deployment artifact in some way, without it being part of the artifact. That is only my opinion since I noticed it.

Also please know that this is not something that is blocking us, we can always hack around it. But since tools that we use usually shape the view of our world around us, you might find it useful to know what the usage of Nix exposed. Maybe consider a spec change or just a note in future discussions.

I would love to discuss more about this if possible, especially in person (eg. All Hands) if you think we both would benefit from it, I know I would.

mostlygeek commented 6 years ago

... While I agree this metadata is needed, I don't think it should be part of the deployment artifact. It should be possible to look up the metadata for a given deployment artifact in some way, without it being part of the artifact.

The __version__ HTTP endpoint returns the version.json data. Baking it in means no external metadata DB.
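
A minimal sketch of such an endpoint using only the Python standard library. The names `load_version` and `make_handler` are hypothetical, and the file path is passed in as a parameter, so the same handler works whether the file is baked into the image (for example at /app/version.json, the path in this issue's title) or supplied some other way.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_version(path):
    """Return the parsed content of a version.json file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def make_handler(version_path):
    """Build a request handler that serves version_path at /__version__."""

    class VersionHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/__version__":
                self.send_error(404)
                return
            try:
                body = json.dumps(load_version(version_path)).encode("utf-8")
            except (OSError, ValueError):
                # Missing file or invalid JSON.
                self.send_error(500, "version.json is missing or invalid")
                return
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return VersionHandler
```

To try it locally, run `HTTPServer(("0.0.0.0", 8000), make_handler("/app/version.json")).serve_forever()`; port 8000 is an arbitrary choice.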

The Dockerflow design is about the reusability of containers.

What are the goals / desires for reproducibility?

sciurus commented 4 years ago

Closing due to inactivity.