Closed by keighrim 1 year ago
Talked about this yesterday with @marcverhagen and @kelleyl, and we all agreed on the problem and the solution. We also agreed that the version injection should happen at runtime, specifically in `__init__()` of the `ClamsApp` abstract base class.
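A minimal sketch of what runtime injection in `__init__()` could look like. Everything here is hypothetical and for illustration only: `VersionedApp` is a stand-in, not the real `ClamsApp`, and `build_metadata` and `version_resolver` are made-up names.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional


class VersionedApp(ABC):
    """Hypothetical stand-in for the SDK base class (not the real ClamsApp)."""

    def __init__(self, version_resolver: Callable[[], Optional[str]]):
        # Collect whatever metadata the app developer wrote by hand.
        self.metadata = self.build_metadata()
        # Overwrite app_version at runtime, so a stale hard-coded value
        # in the subclass never reaches the user.
        self.metadata["app_version"] = version_resolver() or "unknown"

    @abstractmethod
    def build_metadata(self) -> dict:
        """Return app metadata; any app_version set here gets replaced."""


class ExampleApp(VersionedApp):
    def build_metadata(self) -> dict:
        # The developer forgot to bump this after the last release.
        return {"name": "Example App", "app_version": "0.0.1-stale"}


app = ExampleApp(version_resolver=lambda: "v1.2.0")
print(app.metadata["app_version"])
```

Passing the resolver in as a callable is just one design choice; it keeps the base class independent of how the version string is actually obtained (git, environment variable, build step).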
Marc suggested using the `git describe` command to generate version strings.
git describe
This will work for local development environments, but might not be best for production images.
It would require `.git` information to be copied into the image.
`APP_VERSION=x.y.z`
Then in something like `version.py` or `config.py`:

    from os import environ

    # Defaults to None if the variable is not set
    APP_VERSION = environ.get('APP_VERSION')
`git describe ...` here?
`version.py`
`{{ github.ref }}` (or other tag info)
`git describe`
There are automation tools to help with this, something like `versioneer` (I don't have any experience with this tool, just found it with a quick Google search).
Very open to other options, but automation or environment variables have my vote so far.
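For illustration, the two options in this comment could be combined: prefer an `APP_VERSION` environment variable (set when the production image is built), and fall back to `git describe` for local checkouts. This is only a sketch under assumptions: the function names are hypothetical, and the `--tags --always --dirty` flags are one possible choice, not something settled in this thread.

```python
import os
import subprocess
from typing import Optional


def version_from_env(name: str = "APP_VERSION") -> Optional[str]:
    """Read the version injected at build time; empty counts as unset."""
    return os.environ.get(name) or None


def version_from_git(repo_dir: str = ".") -> Optional[str]:
    """Ask `git describe` for a version string, or return None."""
    try:
        out = subprocess.run(
            ["git", "describe", "--tags", "--always", "--dirty"],
            cwd=repo_dir, check=True, capture_output=True, text=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or repo_dir has no usable .git information
        return None
    return out or None


def app_version(repo_dir: str = ".") -> Optional[str]:
    """Environment variable first (production), then git (local dev)."""
    return version_from_env() or version_from_git(repo_dir)
```

With this ordering, an image built without `.git` still reports a version as long as the build sets `APP_VERSION`, while a developer's working copy needs no extra setup.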
tl;dr This conversation made me question the usefulness (or harmfulness) of the `app_version` in the app metadata. Scroll down to see my proposal (note: don't confuse `app_version` and `analyzer_version`).

Some background
The `app_version` and `analyzer_version` fields in the app metadata were directly adopted from `version` and `toolVersion` in the lapps service metadata specification. There, `-SNAPSHOT` versions work as a catch-all version between proper releases. E.g. all compilations between `v1.0.0` and, say, `v1.1.0` will use the `v1.1.0-SNAPSHOT` version number.

Now, the problem
In clams apps, 1) we don't have the concept of "snapshot" (or "nightly" or "bleeding-edge" or whatever you call it), and the `app_version` value in the app metadata is 2) manually maintained by the app developer. I think this is an evil practice, as any source code pulled from the `develop` branch of an app will report a false `app_version` number (most likely from the previous stable release, if there was a merge from the stable branch) to the users.

Proposal
App developers keep using git tags for stable releases of apps as we have been doing. However, developers should not maintain the `app_version` manually hard-coded in the source code (usually in `app.py`). Instead, the `app_version` value should be injected programmatically by either the `clams-python` SDK (at runtime) or the build process (at build time).

Implementation
As we are all using git for managing codebases of clams apps, and also expecting future apps to use git (and github/gitlab) as well, I think we can write simple logic that first looks for a git tag on the current source code tree (for stable versions) and, when none is found, falls back to the commit hash. This will provide more fine-grained traces of which source code was used to produce certain annotations in the resulting MMIF. When there's not even a commit hash (i.e. the code is running in a directory that does not have a `.git` directory), it finally falls back to some string (a short one like `unknown` or a longer one like `this-app-is-running-without-version-control-information-so-the-pipeline-cannot-guarantee-its-reproducibility`) so that users can recognize the "un"-reproducibility of the pipeline.

Looking forward to hearing from others.
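The tag, then commit, then fallback-string logic from the Implementation section might be sketched as follows. This is not code from `clams-python`; `run_git`, `pick_version`, and `resolve_app_version` are hypothetical names, and using `git describe --tags --exact-match` and `git rev-parse --short HEAD` is an assumption about how the lookups would be done.

```python
import subprocess
from typing import List, Optional

FALLBACK_VERSION = "unknown"  # the short fallback string from the proposal


def run_git(args: List[str], cwd: str = ".") -> Optional[str]:
    """Return the stripped stdout of a git command, or None on any failure."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, check=True,
            capture_output=True, text=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git missing, cwd not a repository, or nothing matched the query
        return None
    return result.stdout.strip() or None


def pick_version(exact_tag: Optional[str], commit: Optional[str],
                 fallback: str = FALLBACK_VERSION) -> str:
    """Prefer a tag on the current commit, then the commit hash, then fallback."""
    if exact_tag:
        return exact_tag
    if commit:
        return commit
    return fallback


def resolve_app_version(cwd: str = ".") -> str:
    # A tag pointing exactly at HEAD means a stable release.
    tag = run_git(["describe", "--tags", "--exact-match"], cwd)
    # Otherwise identify the source tree by its abbreviated commit hash.
    commit = run_git(["rev-parse", "--short", "HEAD"], cwd)
    return pick_version(tag, commit)
```

Keeping `pick_version` free of any git calls makes the precedence rule easy to check on its own, e.g. `pick_version(None, None)` returns the fallback string.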