petersilva opened this issue 5 years ago
We would have to define what is considered stable for Sarracenia. Currently, when the flow test passes in one or more environments, we consider (as an act of faith) that Sarracenia is stable.
But is that really the case? Does it guarantee that no user will experience a major problem in their use case scenario? What are those use cases? How many do we support? Is it possible to list them all, so we can control what is supported and what is not? If we can't, can we profile our typical users from our major clients so we get such a list? Then we may be able to define the metrics and rules that determine what a stable Sarracenia would be.
Basically, the more good and bad feedback we get from users that we can analyse and consolidate against those functionalities, the better we will be able to measure which version is stable.
For now, I'm sorry to say I only have unanswered questions on this subject; it will take more interaction with users for me to understand their real needs and to judge what is stable.
Also, what quality attributes is Sarracenia trying to fulfill in terms of data exchange? Is it Reliability, Performance, Availability, Modularity, Testability...? Also, how do we measure the goals we want to achieve with those, and what are the priorities? Once we have a clear understanding of that, we will know whether our architecture answers all those needs and what needs to be fixed at this higher level. Then we will know without a doubt whether or not we have a stable version.
At least, this is how I learned to evaluate a project in software engineering, but this is not a one-man job. Defining all this would involve the developer team, management, the clients, the contributors, ... So, where do we start?
background and current testing strategy:
developers make changes in issue branches, and never commit directly to master.
the developers run the flow_test, which includes all available unit tests, as well as end-to-end integration testing of a reproducible network of all existing components in a variety of configurations. A perfect result on the flow test (all tests passed) is a condition for acceptance into the master branch. The developer should pass the test on one laptop, and a second person on a different machine must reproduce the flow_test results as a pre-condition to a merge to master.
developers are invited and very welcome to add unit tests and more configurations to the flow test to enhance coverage.
the analysts who deploy generally will not look at development snapshots.
there are px-dev machines, a cluster with some development configurations, that, I think, Noureddine has set up to pull daily snapshots from master.
at some point, Peter decides a release is appropriate, and he goes through the procedure described in the Developers Guide to create it. This usually happens once or twice a month.
On the Wednesday following a new release, Noureddine has a procedure that updates the hpfx machines (science and collab) to the latest release. Note: this is currently broken because we wanted an explicit manual install to cover the change amqplib --> amqp, so we used the chance to make the package name more compliant (python3-metpx-sarracenia -> metpx-sarracenia). So we have a one-time need for a manual install.
users on hpfx can try things out before the version goes further.
from that point, analysts manually select the version to use on systems that are gradually more critical: more dev systems, then staging systems, and eventually operational systems.
It is typically months after a release is created that analysts doing operational deployments are comfortable. The operational deployments are typically meant for government-wide mission critical usage, and so caution is expected and prudent.
Once analysts are comfortable, they start recommending that version, and that is the answer sought to the "which version is stable?" question.
There was a major refactoring done at the end of 2017, basically completed by Jan. 2018, and releases after that point have essentially been bug-fixes. Configurations for versions prior to that point have incompatibilities with >=2.18.01 (releases on or after January 2018), so the move to a current version needs care. Once at a recent version, all upgrades should be seamless. The only impact of upgrades should be getting bugfixes, and there is no reason for fear, but the analysts see the schism between before and after the re-factor, and it results in a great deal of caution.
In other words, the analysts don't necessarily believe the releases are just bugfixes, and the only thing that will convince them is time and making good releases. My guess is that whatever investment we make in unit tests and the flow test will improve release quality and give analysts confidence in the released versions, but that takes time to prove.
ddsr*.cmc is the most critical cluster, and it is running a version from >= 2.18.10 (2018/October), having been upgraded at least once in 2018. So it is on the post-2018 bandwagon, and since it has hundreds of configurations operating at very high volume, the version running there is likely the one we have the most confidence in.
For now, the only thing I can suggest is that we have a tag (that moves) and analysts vote on a stable version... we just move the pointer as analysts' opinions change. On the other hand, we have been in beta for a year or more, and at some point we will probably just declare it stable, and the releases should be rarer.
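If we go that route, the mechanics are just ordinary git. A minimal sketch (the version shown is illustrative; "stable" is the proposed tag name):

```bash
# create or move a lightweight "stable" tag to whichever release the analysts endorse
git tag -f stable v2.18.10

# push it; --force is needed once the tag already exists on the remote
git push --force origin stable

# analysts (or install scripts) can then just check out whatever "stable" points to
git checkout stable
```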
My perception is that the overriding primary concern of analysts trying to deploy is a fear of regressions and changes that affect their configurations (because testing for them is hard, potentially involving months of stabilizing.) Analysts will comment: but there is all sorts of new stuff in every release... yes, but those changes are either:
new features to address issues encountered in operations, ( #80, #106, #140, and some others with only internal issues ), and in the last three months, much more thorough testing on windows, and
to make usage more consistent and obvious ( #80 again, #25, #92, #31 ) almost always without changing anything existing in use.
work to address additional use cases ( #54 (for DMS), in February the v03 work to permit wider adoption and compatibility with MQTT.) The v03 work, for example, should have no effect on the operational flows, which are entirely v02. Other contributions are in the form of plugins to address additional use cases, which only affect those use cases (as no-one else is using those plugins.)
So the basic idea is that there are no changes that will affect existing configurations, except where such a change was explicitly requested by operational analysts ( #80 is, I think, the only case of a change in config behaviour since Jan. 2018. )
So the new stuff is very conservative, and the analysts' main concern is regressions. We do have an example of a regression, in v2.19.01b1, where in some cases remove does not work. The regression was introduced by a bugfix gone wrong, so there is still reason for analyst caution. That would appear to be the sole such example. The type of breaking changes the analysts are looking for is documented in doc/UPGRADING.rst, and there is little such information for versions in 2018 precisely because very few versions had any sort of breaking change.
another regression in v2.19.04 releases... fixed by v2.19.09
another regression was the timeout in accelerator plugins in v2.19.09b1 fixed by v2.19.09b2.
another regression is #268, introduced in 2.19.09b1... fixed in git (not released.) It got accepted even though python3.4 was failing on travis.com. As we get more confident in travis, we should heed it more.
The current stable version is 2.20.02b1. It has no known regressions and is now widely deployed in critical and complex configurations.
I just added a stable tag pointing to v2.20.02b1. We can move it whenever it makes sense.
I just moved the stable tag to point to v2.20.02b3. b1 actually has a bad bug, #313, likely present since 2.19.04, but testing changes prevent a git bisect.
sigh... because of #318 we are moving the stable pointer back to 2.18.10b2...
OK, in light of the start of v3, and in consideration of the regressions seen, we have created a new strategy, described here:
development with same QA tests still occurs on master branch.
master branch feeds Daily repository as before.
there is a new Pre-Release repository, also based on the master branch, that should be used on systems that are in use but not as sensitive as other systems... they can tolerate some testing.
the old stable repository is now based on a new branch, called v2_stable. This branch is updated from the master branch using release tags, so it merely promotes the version already tested in pre-release (see the sketch below).
more information here: https://github.com/MetPX/sarracenia/blob/master/doc/Dev.rst#repositories
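For illustration, promoting a release that has baked in pre-release could look something like the following; the branch name is the one mentioned above, the release tag is illustrative, and the recorded procedure is the one in Dev.rst, not this sketch:

```bash
# fast-forward v2_stable to a release tag already tested via the pre-release repository
git checkout v2_stable
git merge --ff-only v2.20.08
git push origin v2_stable
```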
current stable version is v2.20.08p1 (or post1... a slight error in the release results in pypi using post1, and debian p1)
discussion:
Currently discussion is around sr3... and sr3 has not reached production maturity yet. We are working towards a stable release, which means a rapid deployment process as issues are discovered. Later there should be a slower rhythm of releases. So, for now, we should be using pre-release everywhere... until we get to a real stable version.
further work:
"rc1" version suffix seems to do the right thing for pre-releases on pypi, so used for launchpad and github as well.
v3 should be considered stable at this point, and v2 should be considered legacy.
v3.0.56 is the current stable version. For sarrac (package name metpx-sr3c), it is 3.24.11.
as of 2024/11/20:
Current python stable version: 3.0.56
Current C stable version: 3.24.11
v2 is considered legacy. Anyone updating a v2 configuration is encouraged to migrate to sr3.
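For anyone verifying what is installed locally, a quick sketch (assuming a pip install of the python package, here guessed as metpx-sr3 on pypi, and the debian package metpx-sr3c named above for the C implementation):

```bash
# python (sr3) implementation installed from pypi
pip show metpx-sr3

# C implementation (sarrac) installed from the debian package
dpkg -s metpx-sr3c
```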