osrf / docker_images

A repository to hold definitions of docker images maintained by OSRF
Apache License 2.0

Preserve snapshot version of images? #197

Open 130s opened 5 years ago

130s commented 5 years ago

Has there been any demand or plan to preserve "snapshot" images, i.e. images taken at a certain point in time that will essentially never be updated?

This would allow us to control the versions of all packages in the container.

I believe those who build products on top of Docker images somehow maintain the versions of the packages inside the image and/or the base image version.

On Ubuntu's hub there are such images. E.g. xenial-20180808.

(CC @AustinDeric, ros-industrial/docker repo maintainer.)

sloretz commented 5 years ago

I'm not sure if there has been discussion about adding tags with a date in the name. I would recommend tagging the image that should remain frozen instead.

sudo docker pull osrf/ros2:bouncy-core
sudo docker tag osrf/ros2:bouncy-core yourdockerhubusername/ros2:bouncy-core-20181008
sudo docker push yourdockerhubusername/ros2:bouncy-core-20181008
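
For teams that freeze images regularly, the three commands above can be wrapped in a small helper. This is only a sketch under stated assumptions: the function names snapshot_ref and freeze_image are hypothetical, the source image must already include a tag, and docker is called without sudo (prepend it if your setup needs it).

```shell
#!/bin/sh
# Sketch only: snapshot_ref and freeze_image are hypothetical helper names.

# snapshot_ref NAMESPACE SOURCE_IMAGE [YYYYMMDD]
# Prints NAMESPACE/<repo>:<tag>-<date>. The date defaults to today's UTC date.
# Assumes SOURCE_IMAGE carries an explicit tag (e.g. osrf/ros2:bouncy-core).
snapshot_ref() {
    namespace=$1
    source_image=$2
    stamp=${3:-$(date -u +%Y%m%d)}
    repo=${source_image##*/}   # drop the upstream namespace: ros2:bouncy-core
    printf '%s/%s-%s\n' "$namespace" "$repo" "$stamp"
}

# freeze_image NAMESPACE SOURCE_IMAGE [YYYYMMDD]
# Pulls, retags and pushes the image. Requires Docker and a registry login.
freeze_image() {
    target=$(snapshot_ref "$@") || return 1
    docker pull "$2" &&
    docker tag "$2" "$target" &&
    docker push "$target"
}
```

For example, snapshot_ref yourdockerhubusername osrf/ros2:bouncy-core 20181008 prints yourdockerhubusername/ros2:bouncy-core-20181008, the same name used in the commands above.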

130s commented 5 years ago

@sloretz Yes, in my team we're actually already tagging and pushing certain snapshot images to our own registry. Lacking a mechanism for more complete coverage, we only have snapshots of very specific versions.

I am wondering whether it makes sense for a common repository (like this repo) to maintain such a mechanism to achieve broader snapshot version coverage. Maybe taking a snapshot per every public sync of each distro and push the image to the public registry. Having this will:

Building and tagging images by external trigger does not seem complicated on hub.docker.com (e.g. this SO article), although I haven't tried it yet.

Having said that, I'm not sure how cost-effective such a feature would be, but if there's enough support for the idea I'm willing to contribute.

ruffsl commented 5 years ago

@130s, I suspect you would like to pin the image by its digest, or immutable identifier:

Pull an image by digest (immutable identifier)

In some cases you don’t want images to be updated to newer versions, but prefer to use a fixed version of an image. Docker enables you to pull an image by its digest. When pulling an image by digest, you specify exactly which version of an image to pull. Doing so, allows you to “pin” an image to that version, and guarantee that the image you’re using is always the same.

See this related comment here for more details: https://github.com/osrf/docker_images/issues/104#issuecomment-343583136

If you want to dig through the archives to trace back any particular build (or "snapshot"), you may want to check out the Extended Information repo for the docker library: docker-library/repo-info.

Specifically, for ros:

https://github.com/docker-library/repo-info/tree/master/repos/ros

Maybe taking a snapshot per every public sync of each distro and push the image to the public registry.

In theory, I think the remote folder in the repo above should only receive new commits when the automated CI generates a new report after a tagged image in the library is updated. Aside from triggers from upstream debian or ubuntu images, new images should only be pushed to the library when we update the versions of the metapackages, which our own CI monitors. New releases traditionally bump these metapackage versions, no?

Using the repo above, you should be able to recover the image digest that corresponds to a particular metapackage version release. E.g. I can see from the commit history that:

release of ros-melodic-ros-core=1.4.1-0 for arm64v8 points to ros@sha256:87b306bae05d99f6cf2e5d02134764bc8a86d7a1c2ca6208c1605e6b76a40f00
Or release of ros-melodic-ros-core=1.4.0-0 for arm64v8 points to ros@sha256:630a918dc8d1498fb1d8a18b64c5b51bd59c5c017489b7e0eca33cbfe77804d0
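
As a rough illustration of pinning to one of these digests, the sketch below builds a name@digest reference and pulls it. The helper names digest_ref and pull_pinned are hypothetical, and the format check (sha256: followed by 64 lowercase hex characters) is a simplifying assumption. The digests listed above are for arm64v8.

```shell
#!/bin/sh
# Sketch only: digest_ref and pull_pinned are hypothetical helper names.

# digest_ref REPO DIGEST
# Prints REPO@DIGEST after checking that DIGEST looks like sha256:<64 hex chars>.
digest_ref() {
    repo=$1
    digest=$2
    hex=${digest#sha256:}
    # Reject a missing "sha256:" prefix or a wrong length.
    if [ "$hex" = "$digest" ] || [ "${#hex}" -ne 64 ]; then
        echo "not a sha256 digest: $digest" >&2
        return 1
    fi
    # Reject anything that is not lowercase hexadecimal.
    case $hex in
        *[!0-9a-f]*) echo "not a sha256 digest: $digest" >&2; return 1 ;;
    esac
    printf '%s@%s\n' "$repo" "$digest"
}

# pull_pinned REPO DIGEST
# Pulls exactly the image identified by DIGEST. Requires Docker.
pull_pinned() {
    ref=$(digest_ref "$1" "$2") || return 1
    docker pull "$ref"
}
```

Calling pull_pinned ros sha256:87b306bae05d99f6cf2e5d02134764bc8a86d7a1c2ca6208c1605e6b76a40f00 would fetch the ros-melodic-ros-core=1.4.1-0 image listed above. The same name@digest reference can also be used on a Dockerfile FROM line.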

Using this method, you can also pull only up to a specific set of layers, e.g. dropping any number of layers that you want to avoid in the tagged image, although I wouldn't recommend operating at that low a level with your parent image declarations.

I suspect you would still argue for time-based tags like ubuntu's (xenial-20180808); however, ROS and Gazebo images are not FROM scratch images per se, given that they build/trigger from upstream ubuntu and debian images, and thus we are not in complete control of our binary image-to-tag correspondence. Using the underlying digest is perhaps the most thorough and robust method of pinning/designating the exact intended image.

ruffsl commented 5 years ago

@130s, please check out https://github.com/osrf/docker_images/pull/204. This change should help spur our CI to churn the official docker image upon each sync. The images won't be tagged with timestamps; the digest I described above would be more appropriate as a unique identifier. However, there should be a one-to-one mapping of syncs to digests, assuming we PR upstream to the library in pace with repo syncs.

130s commented 5 years ago

Looks like the unique identifier (https://github.com/osrf/docker_images/issues/197#issuecomment-428749917) is good enough, at least for our use cases. But I'm also interested in a more organized solution, e.g. #204 (hence my comment https://github.com/osrf/docker_images/pull/204#issuecomment-440477448), as the unique ID above isn't a tag, so it's not super user-friendly.

Do you think it's a good idea to document this workaround? If so, is wiki.ros.org/docker/Tutorials a good place to add that?

ruffsl commented 5 years ago

Do you think it's a good idea to document this workaround? If so, is wiki.ros.org/docker/Tutorials a good place to add that?

Yeah, sounds like a fine idea. Ping me if you'd like me to take a look at it.

ruffsl commented 2 years ago

Revisiting this idea, we could simply append a date timestamp tag (like the one already used for ubuntu) to the existing ones, updated in step with the ros sync dates. E.g.

Tags:

This timestamp could be derived from rosdistro, as referenced here:

https://discourse.ros.org/t/new-packages-for-ros-2-rolling-ridley-2022-01-28/24067/5

Pro:

Cons:

Questions:

cc @nuclearsandwich

nuclearsandwich commented 2 years ago

I still have general misgivings about providing named images that are guaranteed to be stable. I understand the drive to maintain control over pipelines but the way I see it, it makes it way too easy to rely on stale images.

The base image is of greater concern than the ROS overlay. I don't see how any kind of pinning in the Dockerfile would allow us to update the base image while keeping the ROS packages pinned since the main ROS repositories only store the latest version of packages.

For LTS ROS distributions which have snapshots published it would be possible to use the snapshots repository to build snapshot based images but that does not currently include Rolling or non-LTS releases of ROS 2. We may update the policy for non-LTS releases in the future but I'm pretty firmly against snapshots for Rolling given the amount of churn in the release policies by design.

Ryanf55 commented 10 months ago

Instead of pinning by date/time, a simple option I proposed internally is tagging by Docker image hash. This works with existing infra without any code, and the hash is unique for each pushed image.

docker images -q ros:humble
>>> d5146462a4c1

Our team would like a way to reproduce previous deployments including all base packages from multiple different syncs.
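
One possible way to turn the image ID shown above into a tag is sketched below. The helper names id_tag and tag_by_id are hypothetical, and the IMAGE-ID naming scheme is an assumption, not something proposed in this thread. Note that docker images -q prints the local image ID; the registry digest is listed by docker images --digests.

```shell
#!/bin/sh
# Sketch only: id_tag and tag_by_id are hypothetical helper names.

# id_tag IMAGE IMAGE_ID
# Prints IMAGE-IMAGE_ID, e.g. ros:humble-d5146462a4c1.
id_tag() {
    printf '%s-%s\n' "$1" "$2"
}

# tag_by_id IMAGE
# Looks up the local image ID and adds an ID-suffixed tag. Requires Docker.
tag_by_id() {
    image_id=$(docker images -q "$1" | head -n 1)
    [ -n "$image_id" ] || { echo "image not found locally: $1" >&2; return 1; }
    docker tag "$1" "$(id_tag "$1" "$image_id")"
}
```

With the ID from the example above, id_tag ros:humble d5146462a4c1 prints ros:humble-d5146462a4c1. The new tag would still need to be pushed to a registry you control before it can be used to reproduce a deployment elsewhere.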