jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Time to release v0.2? #933

Closed choldgraf closed 5 years ago

choldgraf commented 5 years ago

I feel like we have added a lot of improvements over the last several months, and while many of those are not part of the official v0.2 milestone, I kinda feel like we should make a new release just to help us document the new things that have been added. What do other folks think about:

Any major "about to be merged" thing that we should include in v0.2 is fine w/ me too

betatim commented 5 years ago

Fine with me. If we cut a release we should check that all the dependencies (z2jh chart version, default repo2docker version) are set to the latest version.

More generally I wonder if there is any point to making releases as we seem to be doing very well with our rolling updates via the helm charts we publish (a bit like web browsers that just constantly update). I think OVH, GKE, Gesis and Pangeo (the big and active public BinderHubs) all have a form of henchbot. The less active BinderHubs probably just stick to the helm chart version that was current when they installed it? I know one private BinderHub at a large financial institution where someone over coffee said "we tend to keep up, maybe with a week or so lag". So I don't know anyone who is unhappy.

If we want to make releases I'd adopt a time-based release schedule: we make a release (say) every three months. What is in is in, what isn't isn't. I think enough changes in 3 months that we can "justify" making a release. Or we go to the other extreme and tag a release every few PRs. Maybe figuring out what the goal of making a release is will help decide this.

manics commented 5 years ago

The benefit of a release is it clearly states to people "this is stable and tested, upgrading won't break anything". If this is the case for the master branch then it makes sense to get rid of releases and maybe update the docs to state this? Or alternatively do what some other projects do and have micro-releases where every PR or few PRs is a release.

Otherwise the burden is on every admin of a binderhub to follow the repo closely so they know whether an update is a relatively safe minor change or if it could break their deployment.

choldgraf commented 5 years ago

I agree w/ both of you - because of the nature of the helm charts etc, most folks are sticking pretty close to master. That said, a couple of thoughts:

Given that, what if we went with something like:

4 month release cycles (so about 3 per year), bumping minor versions each time until we are ready for a 1.0 release. A release cycle basically looks like:

  1. Add issues that we think should go into the next 4 months of work to the cycle.
  2. 4 months pass
  3. After roughly 4 months, we do a quick triage to decide which final issues we wanna bang out, and anything that can't be finished in a few days we push to the next release.
  4. Write a changelog etc, maybe a blog post, and make an "official" release.

betatim commented 5 years ago

I think adjusting our docs to make whatever policy we choose an officially stated policy is a no-brainer, we should get onto that.

For PR purposes we can and should write a 3 or 4 monthly retrospective of what has been going on. I think we should do that even in the absence of releases.

I am less convinced that we should switch to N monthly releases with a promise of backwards compatibility. I'd rather declare each PR merge a mini release and provide tooling to help people upgrade "instantly". Providing a way to upgrade your deployment from 8 months ago without trouble is mega difficult to pull off. If we wanted to have a chance we'd have to run this upgrade process in our CI all the time to check that it is possible and if not revert the change or fix it. I'm not sure we have the resources to do that :-/

I think investing in additional tooling that helps keep everyone together and in a small window of versions will make our life easier as well as that of those running binderhubs. The reason is that when something breaks, the set of changes that could have caused the breakage is much smaller than N months of changes.

For planning purposes I think 4 months is too long a horizon. No one knows what the world will be like at the start of 2020 or what the priorities will be then (this is if we started now with 4-monthly planning). I think it is impossible to set anything more than broad goals as a plan for such a long time frame. But we should have a broad plan, no more than one paragraph long.

What makes it even harder is that there isn't anyone who works on Binder who can be told what to work on. Instead, trying to set a direction for the next few weeks by declaring "I will work on X, here are the first few steps, here are some more steps for after, join in if you find this interesting" has a higher likelihood of people joining in and the work actually getting done. So I would prefer it if those who work semi-regularly on Binder related software announced what they are focussing on for the next few weeks, shared their plan, steps, issues and invited others to join them.

If we start doing this consistently and find people are working towards opposite/incompatible goals then we can meet and reconcile the plans. My feeling is that to have a meeting where we set the topics to work on for the next N months we each need to have a personal plan that is used as input to the planning. I'd claim that right now none of us have such a personal plan or at least we haven't shared it.


A case in point is the repo2docker roadmap. There was a lot of discussion about the meta of the roadmap and very little input after maybe the first iteration. Even during the first iteration there was plenty of work that got done that wasn't on the roadmap or on items that were explicitly listed as "to be done later". It hasn't been updated in a long time and yet no one has ever asked why. Overall I think for projects constrained by how many people are actually contributing changes/moving the project around we can use very lightweight structures. Only once we see people sitting around bored with nothing to do or working against each other do we need more process around planning and then assigning the work.

PS: if you do have spare time and are wondering what to contribute, let's chat in gitter or add an agenda item at the next team meeting "where would I be most useful to the project?"

choldgraf commented 5 years ago

I'd be fine with just writing an update every four months in that case. I'm not really thinking of this for the purpose of people wanting to only upgrade on a stable branch, but for the sake of signaling what the project has been working on (e.g. next week we need to give an update about "what binder has done in the last 6 months and what it's thinking of doing in the next 6" and I realized the only way to do this is by reading between the lines of issues, PRs, and blog posts). To me the version numbers are just a convenient way to have checkpoints that trigger writing and reflection.

betatim commented 5 years ago

I think we've agreed to write a blog post every 3-4 months about what has been happening as a way forward. Going off topic from "releases or not" now.

My answer to "the next six months" for a talk would be:

This is my personal priority list. I have no idea what others think about it or if this conflicts with what they want to do :-/

I think we should create an issue in the team-compass repo to discuss what you can/should say in particular about "the next six months". The situation you find yourself in is why I think we need to keep working on https://jupyterhub-team-compass.readthedocs.io/en/latest/talking.html because right now it suggests "only talk about future items if we've written them down somewhere".

choldgraf commented 5 years ago

I'll go hop on the other thread you opened to discuss there. I guess on this particular issue, we should close it if folks aren't 👍 about creating new 'official' releases...