Open teutoburg opened 1 year ago
In either way, I propose to standardize on the following structure for the names of the tags: pkg_name.YYYY-MM-DD
, since that matches the names of the zip files on the server. So, e.g. MICADO.2023-07-11
Proposal for test and for standardized tags accepted.
I've created tags for MORFEO / MICADO / METIS on a963211 , which is on dev_master
, which is a bit annoying, but well, better that than to not have tags.
Some more weekend-thoughts on IRDB versioning and workflow. Disclaimer: The following is meant to be read as a formalized train of thought from A to B, hence the "draft", "conflict", "solution", "updated draft". Hope that makes sense...
pkg_name
refers to the name of each package, e.g. MICADO
, ELT
, Armazones
release="stable"
must always point to the most recent version without a .dev
suffix.release="latest"
must always point to the most recent version, regardless of any suffix.IRDB uses Calender Versioning to mark different versions for instrument packages, making it easy to keep track of changes across time. The scheme for the version number is pkg_name.YYYY-0M-0D
, using the notation from CalVer.
Square boxes indicate a commit that sets the version number in the version.yaml
file. The package is built and pushed to the server using the version in this commit. A tag is created on this commit with the package name and matching new version number.
gitGraph
commit type: HIGHLIGHT tag: "2023-04-20"
branch feature_1
checkout feature_1
commit
commit type: HIGHLIGHT tag: "2023-07-05.dev"
commit
commit type: HIGHLIGHT tag: "2023-07-10.dev"
checkout main
merge feature_1
commit type: HIGHLIGHT tag: "2023-07-11"
branch feature_2
checkout feature_2
commit
checkout main
merge feature_2
commit type: HIGHLIGHT tag: "2023-07-18"
branch feature_3
checkout feature_3
commit
commit type: HIGHLIGHT tag: "2023-07-25.dev"
An issue occurs when e.g. the following all happens on a single day:
gitGraph
commit type: HIGHLIGHT tag: "2023-04-20"
branch feature_1
checkout feature_1
commit
commit type: HIGHLIGHT tag: "2023-07-11.dev"
commit
commit type: HIGHLIGHT tag: "2023-07-11.dev??"
checkout main
merge feature_1
commit type: HIGHLIGHT tag: "2023-07-11"
branch feature_2
checkout feature_2
commit
checkout main
merge feature_2
commit type: HIGHLIGHT tag: "2023-07-11.dev??"
branch feature_3
checkout feature_3
commit
commit type: HIGHLIGHT tag: "2023-07-11.dev??"
How should the version numbers be assigned, such that the requirements outlined above are met at any point in this process?
A possibility could be to append an alphabetic _MICRO
to the version number, if a release with the same date already exists, regardless of dev
or not. The example above could then look like this:
gitGraph
commit type: HIGHLIGHT tag: "2023-04-20"
branch feature_1
checkout feature_1
commit
commit type: HIGHLIGHT tag: "2023-07-11.dev"
commit
commit type: HIGHLIGHT tag: "2023-07-11_a.dev"
checkout main
merge feature_1
commit type: HIGHLIGHT tag: "2023-07-11_b"
branch feature_2
checkout feature_2
commit
checkout main
merge feature_2
commit type: HIGHLIGHT tag: "2023-07-11_c"
branch feature_3
checkout feature_3
commit
commit type: HIGHLIGHT tag: "2023-07-11_d.dev"
IRDB uses Calender Versioning to mark different versions for instrument packages, making it easy to keep track of changes across time. The scheme for the version number is pkg_name.YYYY-0M-0D_MICRO
, using the notation from CalVer, where the final part _MICRO
is appended only if a release with the same date already exists, regardless of dev
or not. This _MICRO
suffix is alphabetical ([a-z]). As an example, assuming a package of the name mypackage
, a version number mypackage.2023-07-11_a
would be created, if a release mypackage.2023-07-11
or mypackage.2023-07-11.dev
already exists. Equally, a version number mypackage.2023-07-11_a.dev
would be created, if a release mypackage.2023-07-11
or mypackage.2023-07-11.dev
already exists.
In other words, the version number must match the regex ^([A-Za-z]*\.\d{4}\-(0[1-9]|1[012])\-(0[1-9]|[12][0-9]|3[01]))(\_[a-z])?(\.dev)?$
and the substring between the two dots must be unique across all versions of that package.
This is still compatible with my previous comment, so would not break what we did in the last few days.
Seems like a great plan! And, wow, these mermaid graphs look cool. 1 question with some suggestions and three minor comments.
Three comments first:
main
in the figures: maybe ditching dev_master
would be a great moment to ditch master
as well and go for main
. That might even make things less confusing. What do you think?irdb/packages.yaml
completely right?Now for the question: What does the version.yaml
contain for the circular commits? I suppose it is the same as the last square commit, and thus the previous tag. That old version could become a problem if someone somehow ends up using a non-tagged commit as an instrument package, and they then encounter a bug. E.g. someone checking out feature_1
when it was still at 1-48a2da3
(the first commit in feature_1
, before the .dev
release), or a supervisor unzipping a tarball sent by a student. Then that user could complain that version 2023-04-20
is broken, because that is the version in their version.yaml
, but in reality they are not running that stable version at all, but a modified version.
There are several ways to 'fix' that outdated-version problem:
version.yaml
) automatically when publishing the package. E.g. if there is a tag, the zip file gets a version.yaml
with the tag, and if there is no tag, the version will be based on the commit hash. This would keep the commit graph as above.main
branch or on the feature branches. That way the main
branch will never have an actual version number in the main
or feature branches, just a placeholder like develop
or so. The first commit graph below describes this option.-alpha
suffix. The second commit graph below describes this option.Options 2 and 3 above both have exactly one commit in the repository with a specific version in version.yaml
, namely those with the tags. The other commits just have a placeholder. Option 1 above has no commit with any version in it whatsoever. So it becomes more complicated to have a local directory with a version.yaml
that contains an incorrect version. My preference is probably the last option, but I can be persuaded otherwise. I currently feel I would like to have the outdated-version problem addressed one way or the other though. What do you think @teutoburg, is this a problem at all, and if so, (how) should we address it?
Below are two commit graph, mainly so I could exercise using mermaid.
The commit graph of your first figure, adapted for suggestion 2. This looks horrible in mermaid, but all the 'release branches' are supposed to be dead ends, so multiple of them could be on the same height to make the figure easier to read. I also tried to put the release branches for stable versions above the line for the main branch, but that doesn't seem to be possible.
gitGraph
commit
branch "2023-04-20"
commit type: HIGHLIGHT tag: "2023-04-20"
checkout main
branch feature_1
checkout feature_1
commit
branch "2023-07-05.dev"
commit type: HIGHLIGHT tag: "2023-07-05.dev"
checkout feature_1
commit
branch "2023-07-10.dev"
commit type: HIGHLIGHT tag: "2023-07-10.dev"
checkout main
merge feature_1
branch "2023-07-11"
commit type: HIGHLIGHT tag: "2023-07-11"
checkout main
branch feature_2
checkout feature_2
commit
checkout main
merge feature_2
branch "2023-07-18"
commit type: HIGHLIGHT tag: "2023-07-18"
checkout main
branch feature_3
checkout feature_3
commit
branch "2023-07-25.dev"
commit type: HIGHLIGHT tag: "2023-07-25.dev"
Commit graph for option 3. Here I used the REVERSE
type to indicate that the version is reset to the placeholder. (So in fact, it does actually reverse the previous commit.)
gitGraph
commit type: HIGHLIGHT tag: "2023-04-20"
commit type:REVERSE
branch feature_1
checkout feature_1
commit
commit type: HIGHLIGHT tag: "2023-07-05.dev"
commit type:REVERSE
commit
commit type: HIGHLIGHT tag: "2023-07-10.dev"
commit type:REVERSE
checkout main
merge feature_1
commit type: HIGHLIGHT tag: "2023-07-11"
commit type:REVERSE
branch feature_2
checkout feature_2
commit
checkout main
merge feature_2
commit type: HIGHLIGHT tag: "2023-07-18"
commit type:REVERSE
branch feature_3
checkout feature_3
commit
commit type: HIGHLIGHT tag: "2023-07-25.dev"
commit type:REVERSE
After bit more thought, I'm now in favor of option 1, never include a version in version.yaml
in the repository, but generate it from the tag on release.
Whatever scheme we do, I'm assuming it will be primarily automated. As in, all three options I give are complex enough to become error-prone when done manually. But I think complexity should only matter w.r.t. whether people can understand our flow. I'd rate the clarity of my three options from easy to hard as Option 1, Option 3, Option 2, simply by looking at the figures. So that would be a plus for option 1.
I realized there will always be other places with the version number than version.yaml
. E.g. the changelog, or notes in the documentation like added in version x.y
, deprecated in version x.y
, etc. And just before a release, these numbers can refer to the soon-to-be-released version. So option 1 above (with no version in version.yaml
in any commit), would still require a commit that puts the correct version number in place in the documentation.
My main reluctance to advocate for option 1 was that this will tie us to git: e.g. if you'd do a git archive
, it would be quite hard to figure out the version of that archive. But that is just not true: the changelog should have a top entry that summarizes the changes for the upcoming version (using a placeholder for the version string), and the next entry in the changelog should list the current-stable version. So you can trivially pinpoint the version of such a (hypothetical) git archive
. So that is a point in favor of option 1 for me.
Other projects seem to go for option 1 as well.
https://github.com/astropy/astropy/blob/main/astropy/version.py
Comments on your comments etc.:
Three comments first:
- Seeing
main
in the figures: maybe ditchingdev_master
would be a great moment to ditchmaster
as well and go formain
. That might even make things less confusing. What do you think?
That was my idea as well. I don't want to get into the whole main
vs. master
argument (frankly I don't really care). But I think it will be a good way to emphasize that we switch to a different workflow. And if it has the side-effect of being more "modern" with main
, all the better!
- I suppose we can ditch
irdb/packages.yaml
completely right?
Indeed. As mentioned (I think I mentioned it, but who knows) I've been working on a revised implementation of the package upload in the background, which doesn't use that file anymore. That's basically done except for tests. But I guess there might be a few modifications needed as a result of this discussion (which is good). Although I'm reluctant to deleting packages.yaml
from the server yet, because some users might still use a download function from a previous version of ScopeSim, that would break without that file. Some kind of warning would be worth it maybe...
- My approach to dealing with multiple releases in one day was to cheat and just use days from the future 😇. We could even go beyond the 31th of a month. Your plan is better.
Invalid dates might produce an error in sim.download_packages()
in scopesim>=0.6.0
...
Now for the question: What does the
version.yaml
contain for the circular commits? I suppose it is the same as the last square commit, and thus the previous tag. That old version could become a problem if someone somehow ends up using a non-tagged commit as an instrument package, and they then encounter a bug. E.g. someone checking outfeature_1
when it was still at1-48a2da3
(the first commit infeature_1
, before the.dev
release), or a supervisor unzipping a tarball sent by a student. Then that user could complain that version2023-04-20
is broken, because that is the version in theirversion.yaml
, but in reality they are not running that stable version at all, but a modified version.
I was working under the assumption that the enduser will only use packages installed from a .zip
file on the server, where the problem would not occur in that way, if the process is done (automated I should say) correctly. However, considering that sim.download_packages()
explicitly and on purpose supports installing directly from GitHub (which is the test that regularly throws 403s), that assumption might be problematic. It would be nice to know how many users actually do that though. Anyway, it shouldn't matter! So on to the solutions...
There are several ways to 'fix' that outdated-version problem:
- Not include the version number in the source code at all, and generate the version (that is,
version.yaml
) automatically when publishing the package. E.g. if there is a tag, the zip file gets aversion.yaml
with the tag, and if there is no tag, the version will be based on the commit hash. This would keep the commit graph as above.- Create the square commits each on their own branch, instead of on the
main
branch or on the feature branches. That way themain
branch will never have an actual version number in themain
or feature branches, just a placeholder likedevelop
or so. The first commit graph below describes this option.- Add a new commit immediately after the square commits, with an untagged and unreleased version number. That's kinda what I tried to do with ScopeSim etc by incrementing the patch version and adding the
-alpha
suffix. The second commit graph below describes this option.
When I started reading this, my first "gut feeling" made me gravitate towards option 1. Reading your next comment reassured me in that. Option 3 would be next-best for me.
I'm reluctant towards option 2, mostly because we would pollute our repo with an ever-growing number of branches. Even though those are dead-ends, if I understand correctly they would not be deleted, right? Maybe that's just personal preference, but I like keeping the number of branches low. For tags, it's a different story though.
I realized there will always be other places with the version number than
version.yaml
. E.g. the changelog, or notes in the documentation likeadded in version x.y
,deprecated in version x.y
, etc. And just before a release, these numbers can refer to the soon-to-be-released version. So option 1 above (with no version inversion.yaml
in any commit), would still require a commit that puts the correct version number in place in the documentation.
Most notably, the version number also appears in the name of the .zip
file on the IRDB server. Those are used by sim.download_packages()
(0.6.0
onwards) to resolve release="stable"
and release="latest"
(the download function being ignorant of what's going on with versions and tags in the repo). Which was a part of what motivated me to start this whole process that resulted in this discussion 🙃
But I'm confident we'll find a robust solution for e.g. version numbers in docs. That seems to be the smaller issue.
My main reluctance to advocate for option 1 was that this will tie us to git: e.g. if you'd do a
git archive
, it would be quite hard to figure out the version of that archive. But that is just not true: the changelog should have a top entry that summarizes the changes for the upcoming version (using a placeholder for the version string), and the next entry in the changelog should list the current-stable version. So you can trivially pinpoint the version of such a (hypothetical)git archive
. So that is a point in favor of option 1 for me.
Sounds ..... good?
I also tried to put the release branches for stable versions above the line for the main branch, but that doesn't seem to be possible.
That is possible, I've done it recently while pondering about the workflow for ScopeSim (also with the release branch^^). Although it ends up looking weird sometime, mostly in how the lines split and merge. I think that while it's possible, it's not really meant to be done I guess.
There used to be a test checking the version information in each package against the information in the
packages.yaml
file (at least for all packages present in that file). Since we deprecated the use of this file, more or less, these tests now fail when a package version is changed. In #115, I disables the test temporarily, but some form of it should probably still exist.My idea would be to use git tags for this. There will now be a separate tag for each package released in this "Summer 2023" release, and there should be new tags for any new release. The test could then:
a. read all the existing tags for each package, pick the newest, and check if the version set in that package's
version.yaml
file matches; or b. read the version info in each package'sversion.yaml
file, and check if there's a matching git tag.Additionally, the test could verify if the tag points to a commit that actually changed the
version.yaml
file, to make sure we're pointing to the right thing.@hugobuddel, thoughts?