DigitalSlideArchive / digital_slide_archive

The official deployment of the Digital Slide Archive and HistomicsTK.
https://digitalslidearchive.github.io
Apache License 2.0

Version control #258

Open andreped opened 1 year ago

andreped commented 1 year ago

As DSA works wonders on our test servers, we were interested in deploying it on a larger server, where security and robustness become more relevant.

As a pilot, we wanted to deploy it on a HUNT Cloud (HC) lab (see here for more info about HC). But after a meeting with HC, we became aware that there did not seem to be much version control, which is a big issue.

We were checking the docker repositories, and noticed that there only really seems to be a latest tag for the dsa_common repo: https://hub.docker.com/r/dsarchive/dsa_common/tags

However, the HistomicsTK docker repository seems to have better version control, likely because it is also a PyPI package: https://hub.docker.com/r/dsarchive/histomicstk/tags

As Docker Hub does not really have any way of reporting changes between tags, it would make sense to make a new GitHub release for each tag. However, it does not seem that GitHub releases are being made anymore: https://github.com/DigitalSlideArchive/digital_slide_archive/releases

Any comments regarding this, @manthey and @dgutman?

cc: @matuskosut from HC

manthey commented 1 year ago

You make a good point. The CI is set up to make tagged docker releases any time we add a tag or release to GitHub. We haven't been doing so, mainly because the main funders of the project all use the latest version and there hasn't been demand for it.

I'd really like to set up semantic release for this (and a few other non-js repos) so that we can use commit messages to trigger when tagging is performed. The main drawback to semantic release is that it adds a burden on anyone making a PR to follow the appropriate labelling format.
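To illustrate how commit messages could drive tagging, here is a minimal sketch. It assumes the Conventional Commits style of header (`feat:`, `fix:`, a `!` or a `BREAKING CHANGE:` line for breaking changes); the actual labelling format this repo would adopt is not decided in this thread. The function names `bump_for_message` and `next_version` are hypothetical and not part of any semantic release tool.

```python
import re

# Assumed header shape (Conventional Commits style): type(scope)!: subject
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<breaking>!)?: .+")

# Ranking used to pick the largest bump requested by a set of commits.
ORDER = {None: 0, "patch": 1, "minor": 2, "major": 3}


def bump_for_message(message):
    """Return 'major', 'minor', 'patch', or None for one commit message."""
    lines = message.strip().splitlines()
    if not lines:
        return None
    match = HEADER.match(lines[0])
    if match is None:
        # Messages that do not follow the format never trigger a release.
        return None
    has_breaking_footer = any(
        line.startswith("BREAKING CHANGE:") for line in lines[1:]
    )
    if match.group("breaking") or has_breaking_footer:
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") in ("fix", "perf"):
        return "patch"
    return None


def next_version(current, messages):
    """Apply the largest bump found in `messages` to an 'X.Y.Z' version."""
    bump = max(
        (bump_for_message(m) for m in messages), key=ORDER.get, default=None
    )
    major, minor, patch = (int(part) for part in current.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # nothing release-worthy: no new tag
```

This also shows the drawback mentioned above: a commit such as "update readme" that ignores the format is simply skipped, so contributors have to follow the convention for a tag to be produced.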

There is a weekly CI cron job that republishes the dockers when it runs. This picks up changes in downstream packages. If the last commit is tagged, we'll have to make sure it doesn't repush to the tagged docker images (since that defeats the versioning of the tags).

In the short term, I can tag a release (but see the caveat in the previous paragraph).
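The caveat about the weekly cron job can be sketched as a small decision rule. This is only an illustration, not the existing CI configuration: the function name `docker_tags_to_push` and its parameters are hypothetical, and it assumes the scheduled rebuild should keep refreshing `latest`.

```python
def docker_tags_to_push(git_tag, is_scheduled_run):
    """Return the docker tags a CI run is allowed to push.

    git_tag: the git tag on the commit being built, or None if untagged.
    is_scheduled_run: True for the weekly cron rebuild, False for a build
        triggered by a push or by creating a tag.
    """
    # Every build refreshes 'latest' (assumption).
    tags = ["latest"]
    # A versioned image is pushed only when the tag itself triggers the
    # build. The weekly rebuild must not overwrite it, even if the most
    # recent commit happens to be tagged.
    if git_tag and not is_scheduled_run:
        tags.append(git_tag)
    return tags
```

With this rule, a versioned image is built once, at tagging time, and later changes in the packages it pulls in only reach `latest`.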

andreped commented 1 year ago

> In the short term, I can tag a release (but see the caveat in the previous paragraph).

But would you need to make a new tag for every single commit? Isn't it better to only update the tag when a new stable release is ready?

Anyways, just having some sense of versioning, which would enable us to know which versions of the different docker repos and dependencies we are using at all times, would be very reassuring. In particular, if we want to move to a different server over time, it would be great to know that the same version of DSA is being used.

manthey commented 1 year ago

Because this is largely collecting and building dockers based on down-stream packages, the commits in this repo don't reflect all of the differences that can occur in a build. If we tag this repo and the as-then-built docker images, then that pair gets a consistent result -- that is, if you ask for the specific tag for the docker image in docker-compose, though you'd also want to specify the tags for other docker images: mongodb, memcached, and rabbitmq.
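As a rough sketch of what "specify the tags for every image" means in practice, the check below reports services whose image is not pinned. It works on the `services` mapping of an already-parsed docker-compose file; the helper names and the example tags are hypothetical and are not actual published versions.

```python
def image_tag(image):
    """Return the tag part of an image reference, or None if it has none."""
    # Drop any digest suffix, then look only at the part after the last '/'
    # so a registry port (host:5000/repo) is not mistaken for a tag.
    reference = image.split("@", 1)[0]
    last_part = reference.rsplit("/", 1)[-1]
    if ":" in last_part:
        return last_part.split(":", 1)[1]
    return None


def unpinned_services(services):
    """Return sorted names of services that use no tag or 'latest'.

    Services without an 'image' key are reported as well.
    """
    result = []
    for name, service in sorted(services.items()):
        image = service.get("image", "")
        if "@sha256:" in image:
            continue  # pinned by digest
        if image_tag(image) in (None, "latest"):
            result.append(name)
    return result


# Example input; the tags here are made up for illustration.
EXAMPLE_SERVICES = {
    "girder": {"image": "dsarchive/dsa_common:1.0.0"},
    "mongodb": {"image": "mongo:latest"},
    "memcached": {"image": "memcached"},
    "rabbitmq": {"image": "rabbitmq:3.12"},
}
```

Calling `unpinned_services(EXAMPLE_SERVICES)` would flag `memcached` and `mongodb`, which are the kind of supporting images that also need explicit tags for a deployment to be reproducible.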

In some ideal world, when we decide the constellation of packages and images is "stable enough" for a release, we would tag the release and generate a pinned docker-compose example. But our funded use cases don't currently need that, so putting in the infrastructure to do it has no priority. Auto-tagging on commits would put tagged versions on Docker Hub. Otherwise, old versions are only available as CircleCI artifacts until they expire.