Automate and streamline docker image generation

baentsch commented 4 months ago

Upon merge to "main" or when triggered by upstream build/release events, this project generates ready-made docker images at docker hub for users interested in "just using" OQS.

Currently, when a docker image has been created by CircleCI, tagging it for a specific oqs-provider/liboqs release requires manual intervention at the exact moment downstream CI completes. If this moment is missed, the image for that version has to be rebuilt by hand, with the desired oqsprovider/liboqs version inserted manually into the Dockerfile(s), because subsequent CI runs overwrite the "latest" docker image. With GitHub CI, only a simple re-labeling of the correct commit is required, as this commit ID is stored at docker hub (right now only for "openquantumsafe/openssl3").

This issue is to suggest streamlining and fully automating this:

Generate docker images for all still supported OQS integrations via GH CI, moving away from CCI
Upon an upstream GH release event, automatically tag the generated docker images with the upstream version ID

Tagging @ryjones fyi as I learned from @dstebila that you seem to know "everything github" and this issue may be straightforward for you to implement.

baentsch commented 3 months ago

Also tagging @ajbozarth as this may be another area where an update as part of taking "a fresh look at things" could be beneficial to the project -- besides, all automation helps a team with too few people contributing.

ryjones commented 3 months ago

What needs to be done is to create a workflow for releases that includes a tagging component. I will look for a sample.

planetf1 commented 3 months ago

I'm trying to understand how down/upstream are used in this context, which events occur, and what deliverable is required.

Triggers:

When liboqs/oqsprovider gets released? Anything else? This could utilize the release workflow of liboqs etc. Alternatively, a polling mechanism could be used (less efficient, though simpler and fine as a quick test).
How about any other public dependencies? (dependabot can help here in some cases)

Actions:

Is the requirement to rebuild Docker images, say of each demo?

Tagging:

What naming convention is desired for the image tags? The Dockerfiles will change as the related project evolves, as will its dependencies.

baentsch commented 3 months ago

Triggers: At any "meaningful" change, not just releases -- we want to know before releases if we goofed somehow/somewhere in our stack for the integrations.

Actions: The images (of each supported integration) only need to be tagged to a specific liboqs release if/as such release occurs. Otherwise, "latest" (image) refers to just that -- bleeding edge liboqs (+oqsprovider if applicable)

Tagging: Good question. We have 2-3 version IDs at play: a) upstream, b) liboqs, (possibly c) oqsprovider). So far, I have only ever tagged (manually :( ) to the liboqs version, as that's the most "interesting/different" thing about these images. The oqsprovider release can either be inferred or read from the running image, just like the main app's version; the liboqs version typically can no longer be read out manually, hence it's in the tag. Any other proposals welcome.
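
The trigger/action/tagging answers above could be sketched roughly as follows. This is only an illustration, not the project's CI: the `event` labels and the `image_refs` helper are hypothetical, and the image name is simply the one mentioned in the issue text.

```python
from typing import List, Optional

# Image name taken from the issue text; other integrations would use their own.
IMAGE = "openquantumsafe/openssl3"


def image_refs(event: str, liboqs_version: Optional[str] = None) -> List[str]:
    """Return the image references one CI run should push.

    `event` is a hypothetical label for what triggered the run:
    "release" for a liboqs release, anything else for an ordinary change.
    """
    if event == "release":
        if not liboqs_version:
            raise ValueError("a release build needs the liboqs version")
        # Release builds get a fixed tag that later runs never overwrite.
        return [f"{IMAGE}:{liboqs_version}"]
    # Every other meaningful change only refreshes the moving "latest" tag.
    return [f"{IMAGE}:latest"]


print(image_refs("push"))              # ['openquantumsafe/openssl3:latest']
print(image_refs("release", "0.9.0"))  # ['openquantumsafe/openssl3:0.9.0']
```

The point of the split is that "latest" stays a moving target while a release tag, once pushed, is never touched again by later runs.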

ryjones commented 3 months ago

@baentsch It looks like you use both tags and releases.

What would your ideal process look like? You could have multiple tags in a release, each one being descriptive:

oqsdemos-snapshot-2023-10 liboqs-0.9.0 oqs-provider-0.5.2 oqs-openssh-OQS-OpenSSH-snapshot-2023-10, for example, for the latest release

Alternatively, you could use an annotated tag:

git tag -a 0.9.0 -m "oqsdemos-snapshot-2023-10 liboqs-0.9.0 oqs-provider-0.5.2 oqs-openssh-OQS-OpenSSH-snapshot-2023-10"

baentsch commented 3 months ago

@ryjones Thanks for this proposal. If I get it right, it refers to the git tag(s) of the project, correct? This indeed could be a worthwhile enhancement to the GH referencing of releases. Please feel free to apply such tags with your admin rights; I'm tired of running into the "LF wall of distrust" (trying out things in GH only to discover I don't have the necessary permissions any more).

Anyway, a) this issue is about the tags of the resultant (various, different, per-integration) docker images under which they are listed on docker hub and b) I ceased doing GH releases due to lack of external interest & contribution in this GH project (and, admittedly, due to time "siphoned off" commenting on LF documents and proposals).

This issue is primarily about correctly and automatically tagging the output of this project, i.e., the docker images, for releases of the GH projects still minimally "looked after" (wouldn't really dare call it "maintained" any more) despite the many unknowns and hurdles created, i.e., liboqs and oqsprovider.

The rationale behind the issue: With a suitably tagged docker image to reference, user bug reports can be triaged (differentiating between a user's local build problems and real OQS code problems) by asking the bug reporter to reproduce the bug (or not) with the proper docker image.

ryjones commented 3 months ago

@baentsch : you have write access, which gives you the power to Create and edit releases

planetf1 commented 1 month ago

An auto-generated tag which includes a version (auto incremented), sha, or date, could be used in CI automatically.

We can also use 'latest' if we want, but it may be less helpful.

I've seen some images have a variety of tags like:

test latest-release v3.9 v3

Many may point to the same place, but we just use the tag for intent (as @baentsch refers to)
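
As a rough sketch of the auto-generated and alias tags described above (assuming a version string with at least a major and minor part; the `alias_tags` helper and the `sha-` prefix are made up for illustration):

```python
from datetime import date
from typing import List


def alias_tags(version: str, sha: str, build_date: date) -> List[str]:
    """Derive several tags that all point at the same image.

    Assumes `version` looks like "MAJOR.MINOR[.PATCH]".
    """
    major, minor = version.split(".")[:2]
    return [
        f"v{version}",                    # exact version
        f"v{major}.{minor}",              # moves with every patch release
        f"v{major}",                      # moves with every minor release
        f"sha-{sha[:7]}",                 # short commit ID of the build
        build_date.strftime("%Y-%m-%d"),  # build date
    ]


print(alias_tags("3.9.1", "0123456789abcdef", date(2024, 1, 15)))
# ['v3.9.1', 'v3.9', 'v3', 'sha-0123456', '2024-01-15']
```

Each resulting tag would then be applied to the same image ID, so the tag only expresses intent.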

It's also possible to include additional metadata in the image, whether as per OCI or custom. These can include build related values, and be queried when the container is used if needed for debugging.
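
A minimal sketch of how such metadata might be passed at build time, assuming `docker build --label` is used. The `org.opencontainers.image.*` keys are pre-defined OCI annotation keys; the `org.openquantumsafe.liboqs.version` key, the values, and the `label_args` helper are hypothetical.

```python
from typing import Dict, List


def label_args(labels: Dict[str, str]) -> List[str]:
    """Turn a label mapping into `--label key=value` arguments for docker build."""
    args: List[str] = []
    for key in sorted(labels):  # sorted so the command line is reproducible
        args += ["--label", f"{key}={labels[key]}"]
    return args


labels = {
    "org.opencontainers.image.version": "0.9.0",     # illustrative value
    "org.opencontainers.image.revision": "0123abc",  # illustrative commit ID
    "org.openquantumsafe.liboqs.version": "0.9.0",   # hypothetical custom key
}
print(label_args(labels))
```

The labels can be read back later with `docker inspect --format '{{json .Config.Labels}}' <image>`, which is what would make them usable for debugging a user's report.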

planetf1 commented 1 month ago

Taking the latest curl PR as an example, we do have a number of variable components as mentioned - including liboqs, alpine, openssl versions, as well as build variations.

Since these are demos and not expected to be used in real systems by non-developers, a question I'd ask is, do we actually need to publish multiple variations to dockerhub or other image repositories? Could we just go with 'latest available, tested', i.e. an opinionated selection? We could then either go with a simple semantic version, or adopt a pattern that does include important dependencies, but there it's more about easy identification. This may overlap with text we can inject as additional metadata.

CI is another matter: there we should have more combinations that can be built using args and tested automatically. That version matrix should be documented for users (even if it means pointing to a fragment of the CI scripts).
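
For illustration, such a version matrix could be expanded from a small table of build-arg options. The arg names and values below are assumptions, not taken from the project's Dockerfiles, and `build_matrix` is a hypothetical helper.

```python
from itertools import product
from typing import Dict, List


def build_matrix(options: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Expand per-arg option lists into every combination of build args."""
    names = list(options)
    combos = product(*(options[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Hypothetical build args and values, for illustration only.
matrix = build_matrix({
    "LIBOQS_TAG": ["0.9.0", "main"],
    "OPENSSL_TAG": ["openssl-3.2.0", "master"],
})
for entry in matrix:
    print(entry)
```

Printing the expanded matrix (or committing it as a table) is one cheap way to give users the documentation mentioned above.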

Another possible value-add is image scanning. One lightweight approach I used on a previous project was to publish to quay.io as well as dockerhub as this offers free image scanning. (if relevant can open as a new issue)

planetf1 commented 1 month ago

Added issue on image scanning

baentsch commented 1 month ago

> we do have a number of variable components as mentioned - including liboqs, alpine, openssl versions, as well as build variations

That has been true for many years: we have had these many (build) parameters for the different components pretty much from day 1; the latest curl PR is not special in this regard, merely adding one more for OpenSSL and changing some other values. Hence,

> Since these are demos and not expected to be used in real systems by non-developers, a question I'd ask is, do we actually need to publish multiple variations to dockerhub or other image repositories?

I wouldn't think so. (Accordingly? :) we never did. All we retained is one labelled image per key integration, tagged to a specific old/downlevel liboqs version (basically manually tagged at release time of the lib -- the automation of which is the very purpose of this issue). The whole purpose of retaining those was to allow people to play with "downlevel" code or old algs OQS decided to drop from continued maintenance: This was a mechanism allowing the small OQS team to more easily "let go" of supporting code in liboqs that's no longer relevant (e.g., for algs dropped out of the NIST competition). If many more people were now actively joining the project, this could be revisited -- but I don't see this happening; the opposite actually -- plus there are many more urgent issues to address...

ajbozarth commented 1 month ago

Following up on the discussion at this week's OQS call, as a subset of this issue, specifically looking at how we tag docker images.

My PR #298 will start pinning the versions of libraries like liboqs, oqs-provider, openssl rather than using main/master. This will change the latest docker image for that demo to use specific tested versions instead of pulling the most recent code as of a CI run.

I agree with @baentsch's point above on keeping it simple with just one docker image published, but I do want to raise the idea of building a second image using main/master for trying out dev features. If this second image interests the community, we would want to decide how to label it. I would pitch *-dev, but that naming is actually already used in some Dockerfiles as a target.

baentsch commented 1 month ago

> I do want to raise the idea of building a second image using main/master for trying out dev features.

This is a great idea!! It is the current state of things, and I have asked to maintain it since our very first discussion several months ago and keep repeating that constantly, e.g., here.

Technically this is IMO trivial to do as all "pinned versions" are just changes to docker build arguments. All CI needs to do is run docker build with different args. One more line of code.
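
To make that concrete, here is a hedged sketch of building the same Dockerfile twice with different arguments. The build-arg names (`LIBOQS_TAG`, `OQSPROVIDER_TAG`), the local image name `demo`, and the `docker_build_cmd` helper are assumptions for illustration; the pinned values are the versions mentioned earlier in this thread.

```python
from typing import Dict, List


def docker_build_cmd(image: str, tag: str, build_args: Dict[str, str],
                     context: str = ".") -> List[str]:
    """Compose a `docker build` command line for one set of build args."""
    cmd = ["docker", "build", "-t", f"{image}:{tag}"]
    for name, value in build_args.items():
        cmd += ["--build-arg", f"{name}={value}"]
    cmd.append(context)
    return cmd


# Hypothetical arg names; one pinned build and one tracking main.
PINNED = {"LIBOQS_TAG": "0.9.0", "OQSPROVIDER_TAG": "0.5.2"}
MAIN = {"LIBOQS_TAG": "main", "OQSPROVIDER_TAG": "main"}

for tag, args in (("pinned", PINNED), ("main", MAIN)):
    print(" ".join(docker_build_cmd("demo", tag, args)))
```

Passing each list to `subprocess.run(cmd, check=True)` would run the actual build; the second loop iteration is the "one more line" canary build.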

Reminder if it got lost over time: One key purpose of these integrations is to serve the OQS core team as a "canary" (as they have been doing until you wanted to pin stuff to downlevel versions), so such builds are a necessity. I understood though you didn't want to do that in https://github.com/open-quantum-safe/oqs-demos/pull/298 but rather in further PRs.

Again, for the record, I don't like the approach of "many PRs for the same purpose" as it creates "PR fatigue" and consequent risk of sloppiness on the side of reviewers. When I changed baseline features like this I applied them to all demos in one go to allow reviewers to understand and spot mistakes more easily (and save time for both reviewers and myself: Please be considerate of that issue: I'm not working on a fixed salary but need to be efficient in my use of time).

Whether to also store such "main tracker" images on docker hub is a separate question. I tend to think that's not necessary as anyone willing to work with bleeding edge software should also be able to run docker build.

> I would pitch *-dev but that naming is actually already used in some dockerfile as a target.

And also published as such on docker hub. And yes, the purpose of this was exactly to allow the few guys to have a "bleeding edge" image. I'm unaware whether it's been used by anyone, ever.

ajbozarth commented 1 month ago

> When I changed baseline features like this I applied them to all demos in one go

@baentsch I'm willing to leave #298 open and add the updates for the rest of the demos to it if that's what you prefer as the primary reviewer. I'm flexible with 1 large PR vs a series of PRs. If so, I would just add each demo update as I do them, and I could ping you for review once they're all in.

> I tend to think that's not necessary as anyone willing to work with bleeding edge software should also be able to run docker build.

This lines up with my personal opinion as well

open-quantum-safe / oqs-demos

Automate and streamline docker image generation #284