pllim commented 4 years ago

Blocked by:

[ ] Stakeholder surveys (see comments)
[x] #97

Suggested by: @eteq. Expanded by: @pllim (feel free to edit if I got things wrong)

Motivations:

Lower the stakes if a high-priority PR or a killer feature did not make the current release deadline; i.e., users do not have to wait as long as 6 months to get them in the next release.
More frequent deadlines for contributors and maintainers who are primarily motivated by deadlines; i.e., instead of a big spike on feature freeze week, we turn it into smaller but more frequent spikes.

Possible new cadence (currently every 6 months):

Every month.
Every 2 months.
Every 3 months.

Requires:

Changes to APE 2.
Changes to core astropy release calendar.
Partial automation of the release process (see Risks).

Risks:

1. Increased burden for release manager. -- Can be partially solved by automation of the process.
2. Increased cadence lowering the importance of a feature freeze, encouraging inaction; e.g., "Oh, I can just wait till the next one."
3. The perception that we are "turning into pytest" with frequent releases and scaring pipeline people, who value stable packages.
4. Added effort on downstream packages and/or OS packagers to test the release candidate. (See https://github.com/astropy/project/issues/95#issuecomment-628775375)

astrofrog commented 4 years ago

I would be in favor of this but only if we are willing to embrace a significant level of automation in releases. I've thought about this a lot in the context of other projects, and I think many of the steps could be automated with a little work.

taldcroft commented 4 years ago

Big :+1: from my own perspective, and I definitely feel the 6-month wait is a real driver for the "I really want to push this last PR in" mentality. I don't think that risk 2 is very serious, especially if the new cadence is release per 3 months.

About pipeline people, they do have the option of LTS, or else just skipping intermediate releases. I know in my work we basically pin all packages until such time as we are ready to upgrade and do the appropriate testing.

Maybe adding to risks is added effort on downstream packages and/or OS packagers to test the release candidate.

pllim commented 4 years ago

> Maybe adding to risks

@taldcroft , done!

saimn commented 4 years ago

3 months seems a good idea to reduce the peak of activity before a release, but I'm not sure about the impact on users. That means more versions to test and "validate" to make sure there are no regressions, and there can be a fear of upgrading because of the breaking changes. So this could cause more users to stay on the LTS version to avoid too-frequent upgrades? One option could be to have LTS versions every year instead of every two years, and see intermediate releases as "unstable" ones, or more developer-oriented. But that would mean more backporting work as well. It could be useful to run a survey on the mailing list?

hamogu commented 4 years ago

I think an LTS has to be available for more than a year. Unless we want to have multiple LTS versions around at the same time, we should not increase the LTS frequency.

I'm not sure how users upgrade astropy currently. For my non-development work, I typically either (a) make a new conda environment with current released versions of all packages that I anticipate using (i.e. I don't pin versions, so I just get the most recent ones) or (b) reuse some existing environment (i.e. an environment that I used for a previous project with similar requirements).

I think case (b) might be the most common. In that case, I only update astropy if I need a feature that I see in docs.astropy.org but don't have in my version, or if I install some other package and conda or pip upgrades astropy for me.

Looking (in a non-statistically-representative way) at the co-workers in my group who are not developing code, workflow (b) is typical. Two examples: (1) CIAO (the Chandra analysis code) is released annually. People install CIAO and then put astropy into that environment and don't touch it again until the next CIAO is released. (2) I was recently asked why a certain example in the astropy docs did not work for them. It turned out that they were running astropy 1.0, which had all they needed (io.fits - I know users who don't use anything else from astropy, and io.fits has not changed much in years).

I think the careful curation of versions and "users upgrading when a new version is available" is mostly limited to developers (who can deal with frequent releases) and curated institutional environments like instrument pipelines (that's what LTS is for).

So, this is to say: The astropy release schedule does not matter to most users and in fact more frequent releases might be slightly better, because users just get the "most recent" when they install.

I'm describing this in this much detail here because I think a survey on the mailing list will get answers very biased towards developers and software-interested people, who behave differently from the bulk of our users. Anecdotal evidence from more "typical" users around us is probably the best source for how "normal" users install and upgrade.

(Before you try to educate me: my workflow is different in environments where I do development for astropy or some other package.)

astrofrog commented 4 years ago

Just to add to what I said before about automation - I think I would like to first see how we can make the current releases more automated, so that we get to the level of 'oh that was easy, why don't we do it more often', then consider changing the cadence :)

pllim commented 4 years ago

Yes, perhaps the automation part should be discussed separately in #97. I will update my post above to indicate that this is now blocked by #97.

taldcroft commented 4 years ago

About a survey, if it is posted to facebook / twitter we should get a more representative sample that includes typical astronomer users who are not devs. Agreed that astropy-dev / astropy are skewed. Also agreed to make sure more frequent releases are feasible from the dev perspective before making any noise.

bsipocz commented 4 years ago

I'm -100 on having more frequent releases, as it creates more problems than it solves while not really providing much for the users either:

The same rush of half-baked PRs will be pushed through on release day, no matter whether the cycle is 6 months or 3 months long. PRs will be merged with a caveat in the last comment that the outstanding 2-3 review comments and docs additions can be fixed in a follow-up PR, but that given the impending feature freeze the PR has to be merged "right now". Call me sceptical, but this was always the case even though we have talked about it a lot on telecons and at meetings.
We are supposed to be moving in the direction of a coordinated package framework. This coordination is currently not automated, and while many parts of it could be automated, a 3-month release cycle is very short in this regard, putting unfair pressure on the maintainers and developers of the coordinated packages and on the maintainers of this ecosystem coordination.
RC testing is a significant endeavour. We cannot realistically expect our downstream developers and packagers to get it done in a few days; having multiple RCs means a minimum of a month or more from FF to the actual release. This can be cut back significantly, but only if the dev branch is well tested. That branch is only well tested if there isn't a huge rush of features prior to the FF deadline. (One especially cannot expect a short RC cycle if PRs merged at the last minute are not even compatible with each other, as was the case in the 4.1 feature freeze.)
Building the release and pushing to PyPI is the most minor part of making a release. Following up threads (those "docs/fix to known bug in this shiny new feature is coming in a follow-up PR") and getting decisions from maintainers is more significant, and cannot be automated.
Most users don't care about new releases, and very happily use year-old versions. I don't see that providing a 3-month cycle would improve anything for them except broadcasting the message that astropy is not stable. We could certainly do more bug fix releases, but that would require focusing on fixing bugs rather than adding new features.
Any survey on Twitter/Facebook feels like an extremely bad approach, as the status of the cutting edge, backwards compatibility, etc. are mostly relevant for developers (mostly developers of downstream libraries, packages, and pipelines, and maybe admins of departmental servers, science platforms, etc.), and not for the random but very vocal grad students and postdocs who overwhelm the social media sample and who most probably have no insight into or appreciation of the dev cycle of a large and complex library.

=========

Solutions:

Have regular work-together days. These can imitate the teamwork feeling of the feature freeze days while avoiding the pitfalls of an actual feature freeze.
Have regular triage days. 5 maintainers sweeping through the issues, regularly closing obsolete ones, opening bug fixes, and most importantly making decisions rather than postponing them on the basis of unstated arguments, would work wonders.
Have regular telecons. We are supposed to have them, but in practice we don't even have them doodled. We did have regular work together hours but it was very very poorly attended. Have a very concrete agenda, for an extremely well working example see the regular numpy calls.

==========

As for the release/FF, I have a controversial approach in mind, but maybe some of you would like it. The real issue with the FF is not even the last-minute rush. It's a problem too; see the conflicting merges and the need for constant rebasing on the last day. But the real issue is that hardly anything merged on FF day has had any chance to be tested downstream before branching and making an RC. And that's a huge issue, as it contradicts our communication towards downstream packages, where we ask them to test against master with the false promise that this is good practice to discover compatibility issues early and avoid the need for firefighting when the new astropy release is out.

So my suggestion is to have the branching day ~1 month before FF. New features have the chance to be ported to the new branch if they are merged and tested before FF day. If they are not in that category, it means they'll be in the next release.
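
To make the rule concrete, here is a minimal sketch in Python. It is not an existing astropy tool: the function name, the 30-day gap between branching and FF, and the choice to count a merge on FF day itself as "in time" are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical gap between branching day and feature freeze (FF);
# the proposal only says "~1 month".
BRANCH_LEAD = timedelta(days=30)


def target_release(merged_on, tested_downstream, feature_freeze):
    """Return "current" if a feature lands in this release, else "next"."""
    branch_day = feature_freeze - BRANCH_LEAD
    if merged_on <= branch_day:
        # Merged before the branch was cut, so it is already on the branch.
        return "current"
    if merged_on <= feature_freeze and tested_downstream:
        # Merged after branching but by FF day (assumed inclusive) and
        # tested downstream: eligible to be ported to the release branch.
        return "current"
    # Everything else waits for the next release.
    return "next"
```

With a feature freeze on an arbitrary example date such as `date(2020, 10, 1)`, a PR merged on 20 September goes into the current release only if it has already been tested downstream; otherwise it is deferred.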

With this approach developers can deliver features on their own timelines and deadlines (see the spectralcoord case in 4.1) while ensuring that the release is not held up by the need for downstream testing and coordination. This approach would also remove the blanket policy that never really worked and was a source of a large amount of frustration. I always felt very confident merging a few very last-minute PRs when I had already seen that they were well tested and also integrated downstream at the time the PR was opened, even if they were a day late, while I had no way to exclude the rushed-through ones that were knowingly causing trouble the minute they were merged but were nevertheless merged to meet the deadline.

==

And to reiterate how an ideal dev cycle would work, let's remember the case of last summer, when both the big overhaul of modeling and a large number of improvements to units/table were merged in the middle of the dev cycle, meaning that neither caused any issues at release time but both provided well-tested new/reworked functionality. Let's aim to find a framework where this approach is valued by all maintainers to the level that they are willing to follow it themselves.

embray commented 4 years ago

I think if nothing else, more frequent bug fix releases could be made feasible with, at a minimum:

Better automation of releases--it should be possible to automatically cut a bug fix release directly from a CI pipeline when run on tags. A successful CI build of a bugfix tag should result automatically in a release.
The above should include testing against affiliated packages--bugfix releases obviously should not break affiliated packages.
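
A minimal sketch of the gating logic described in the two points above, in Python. The tag format `vMAJOR.MINOR.MICRO`, the function names, and the two boolean CI results are assumptions for illustration; this is not how any existing astropy pipeline is implemented.

```python
import re

# Assumed tag format "vMAJOR.MINOR.MICRO"; a real project may use another scheme.
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def is_bugfix_tag(tag):
    """True when the tag looks like a bugfix release, i.e. MICRO > 0."""
    match = TAG_PATTERN.match(tag)
    return bool(match) and int(match.group(3)) > 0


def should_auto_release(tag, core_ci_passed, affiliated_ci_passed):
    """Release automatically only for bugfix tags whose CI runs all passed.

    `affiliated_ci_passed` stands in for the result of testing against
    affiliated packages, so a bugfix release cannot break them unnoticed.
    """
    return is_bugfix_tag(tag) and core_ci_passed and affiliated_ci_passed
```

Under these assumptions a tag like `v4.0.2` with both CI runs green would be released automatically, while `v4.1.0` or a release-candidate tag would still go through the manual process.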

With such improvements in place it might be possible to extend this to other types of releases, but I would focus only on more frequent bugfix releases first.

Still, making the above work reliably, while possible, will require a few non-trivial changes to the development process (some of which are good to do anyway, such as improving management of the changelog). I am skeptical as to whom exactly more frequent releases (without some specific institutional need) would benefit.

pllim commented 4 years ago

Did you say change log? Please see astropy/astropy#10334 😉

astrofrog commented 4 years ago

I think what would definitely be beneficial is more frequent/automated bug fix releases - probably an easier place to start and clearer benefits.

pllim commented 4 years ago

Just in case you want to revisit the idea of GitHub Action, I stumbled upon https://github.com/pytest-dev/pytest/blob/master/scripts/release-on-comment.py 😉

pllim commented 3 years ago

Since this was opened, @astrofrog and @Cadair implemented automated releases using Azure Pipelines. Not sure how relevant this still is.

bsipocz commented 3 years ago

Packaging is still not the biggest burden of a release; all the testing and downstream integration is.

pllim commented 1 year ago

Superseded by:

https://github.com/astropy/astropy-APEs/pull/82

astrofrog commented 1 year ago

I wouldn't say this is superseded, as even with APE 21 we could still have more frequent releases - in fact APE 2 already says that we can have more frequent releases if we agree to it.

astropy / astropy-project

More frequent releases for core astropy #95
