jupyter / nbclient

A client library for executing notebooks. Formerly nbconvert's ExecutePreprocessor.
https://nbclient.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

1.0 Release Goals #69

Open MSeal opened 4 years ago

MSeal commented 4 years ago

I wanted to open up a discussion for what we consider needed for a 1.0 release. I think the core code is shaping up to be solid and getting tested well in a number of environments. Is there anything folks see as "should have" or "must have" for 1.0?

My starting note would be to make sure we do an interface pass over flags and old control planes that we maybe don't want to support, carried over from how-it-was in nbconvert. Essentially, check whether any config options don't make sense or should be added / removed.

Additionally, I think interface frictions like what came up in https://github.com/nteract/papermill/issues/490 should be made less annoying and handled more automatically.

kevin-bates commented 4 years ago

@MSeal - Thanks for opening this discussion and including me.

My involvement with nbclient came entirely from the work @golf-player has been doing to make EG's RemoteKernelManager independent, so that applications like nbclient can leverage remote kernels distributed across resource-managed clusters. As noted in #64, nbclient needs to give the kernel manager a crack at a graceful shutdown as well (not just the kernel client). This allows the various resource managers to "finish the application" before they, too, enter their kill sequences via immediate shutdowns (as necessary).
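For illustration only, here is a minimal sketch of that graceful-then-forced shutdown order, assuming only jupyter_client's `KernelManager.shutdown_kernel(now=...)` API; the helper name is hypothetical and not part of nbclient:

```python
# Hypothetical helper (not nbclient API): give the kernel manager a chance at a
# graceful shutdown before falling back to an immediate kill sequence.
from jupyter_client import KernelManager


def shutdown_with_grace(km: KernelManager) -> None:
    try:
        # Graceful path: ask the kernel manager to shut the kernel down cleanly,
        # so any resource manager behind it can "finish the application".
        km.shutdown_kernel(now=False)
    except Exception:
        # Fallback: force an immediate shutdown.
        km.shutdown_kernel(now=True)
```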

So, regarding this issue, I'd love to see #64 in 1.0 and think it would be great to offer that functionality, but don't really have experience with other aspects of nbclient for other insights.

NBClient seems very useful! Thank you (and the others) for maintaining it.

choldgraf commented 4 years ago

My thought here is not specific to any one tech, but I think is important nonetheless:

I feel like we need more time for downstream projects to use, depend on, and break nbclient in ways that we don't expect. Many of the (often significant) bugs we've run into came from folks using nbclient in new ways and surfacing those issues. This has been super helpful in hardening the various edge cases we can handle, but I wouldn't be comfortable releasing a 1.0 until the rate at which we uncover these bugs decreases (there's no hard-and-fast rule here, it's just a feeling I've got).

What I'd do is something like:

  1. Once we're happy with it, make a big announcement about nbclient but make clear that it's pre-1.0. Advertise it to other projects more heavily.
  2. See what kinds of bugs surface
  3. After we go for a month or so without a major bug popping up, consider releasing 1.0

does that make sense?

SylvainCorlay commented 4 years ago

Absolutely, I think that once we have a stable release of nbconvert and Voilà using this without major issues for a little bit, we can bump the major version to 1.0.

MSeal commented 4 years ago

These are all good suggestions! Thanks everyone for adding thoughts here. We'll want to create issues and tag them with the 1.0 milestone for anything that doesn't already have such an issue or PR up.

We can collect these for a while and then do a drive to 1.0 once most or all have been available and working for a bit. Does that sound like a reasonable plan?

davidbrochart commented 4 years ago

I agree to everything that has been said here, and don't have anything else to add :smile:

palewire commented 4 years ago

I'd put my +1 toward a simple CLI tool that lets users run notebooks from the terminal with zero hassle and configuration.
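For context, here is a rough sketch of what such a zero-configuration CLI could look like, using only nbformat and nbclient's `NotebookClient`; the script and its flag are hypothetical, not an existing nbclient command:

```python
# Hypothetical "run a notebook from the terminal" script; not part of nbclient.
import argparse

import nbformat
from nbclient import NotebookClient


def main() -> None:
    parser = argparse.ArgumentParser(description="Execute a notebook in place")
    parser.add_argument("notebook", help="Path to the .ipynb file to run")
    parser.add_argument("--timeout", type=int, default=600,
                        help="Per-cell timeout in seconds")
    args = parser.parse_args()

    nb = nbformat.read(args.notebook, as_version=4)
    NotebookClient(nb, timeout=args.timeout).execute()
    nbformat.write(nb, args.notebook)


if __name__ == "__main__":
    main()
```

Something along these lines could be installed as a console entry point (e.g. a hypothetical `nbclient run notebook.ipynb`).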

choldgraf commented 4 years ago

@palewire wanna give your thoughts over in here: #4 ?

SylvainCorlay commented 4 years ago

The 0.4.0a1 release appears to be doing OK for Voilà.

Any objection to tagging a 0.4.0 proper?

MSeal commented 4 years ago

I was just thinking the same thing last night. I'm good with a proper 0.4.0 release.

choldgraf commented 4 years ago

+1 from me - I do think that the ipywidgets issue is now resolved (at least I got it working in jupyter-book!)

davidbrochart commented 4 years ago

:+1:

MSeal commented 4 years ago

Done

choldgraf commented 4 years ago

wahooo! good job everybody :-)

SylvainCorlay commented 4 years ago

Quick note for releases:

I think that we should rather have a fast-forward commit dedicated to the release and labelled "Release M.m.p". Recently, we have been tagging merge commits, commits in PRs, etc...

SylvainCorlay commented 4 years ago

> I think that we should rather have a fast-forward commit dedicated to the release and labelled "Release M.m.p". Recently, we have been tagging merge commits, commits in PRs, etc...

This occurred in another repository today. I actually think we should make it a policy for release commits to be fast-forward commits, and not in PRs.

choldgraf commented 4 years ago

I have really enjoyed using GitHub actions and the "publish to PyPI" action for this. In the Jupyter Book repos, we've got it set up to trigger a PyPI publish any time someone creates a tagged release on GitHub. (e.g.: https://github.com/executablebooks/sphinx-book-theme/blob/master/.github/workflows/tests.yml#L59)

That way the steps to publish a new release are:

  1. Git pull latest changes from master
  2. Remove dev0 from version, commit with a release tag "RLS: M.m.p"
  3. Add back dev0 and bump the version
  4. Push to master
  5. Create a GitHub release that points to the "RLS: M.m.p" commit

and then all the rest is automated. I've found that it makes us much more likely to generate incremental and patch releases. If that would be helpful for nbclient I'm happy to add the action boilerplate. The one thing we'd need is to generate a PyPI secret key and add it to this repo.

SylvainCorlay commented 4 years ago

I have been bitten by broken automated release scripts in the past, and still have some scar tissue. IMO, we should be able to inspect the wheel and source tarball before uploading, as a last sanity check.

choldgraf commented 4 years ago

Yeah - I think it is a trade-off between "ability to inspect each release" and "ease and frequency of releases". For our projects, since they don't require complex builds, it's worth it to give up that control to the GitHub action because it means that many, many more releases happen, which has really helped. We haven't run into any issues with builds... but then again, if we did, it's just a 5-step process to cut a new patch release.

That said, I think it also depends on the kinds of users we'd have, and I could see nbclient being used in far more conservative dependency chains (e.g., within companies) than jupyter book. So I could see us deciding the "manual" approach is better.

MSeal commented 4 years ago

> Quick note for releases:
>
> I think that we should rather have a fast-forward commit dedicated to the release and labelled "Release M.m.p". Recently, we have been tagging merge commits, commits in PRs, etc...

I believe we're normally doing this via bumpversion. The 0.4.0 release was a little special because we didn't have a bumpversion rule for going to/from alpha releases, so I did that one by hand. In general I've only had an issue with tags on non-dedicated commits in really big projects with lots of long-running PRs that have multi-file merge conflicts, so I tend not to worry about it as long as there is a tag correctly marking which commit represents the version. I think moving forward the standard bumpversion pattern will be the norm, regardless of whether releases are automated or not.

> I have really enjoyed using GitHub actions and the "publish to PyPI" action for this.

I'm neutral on whether we have this or not. One thing to note is that I make more human mistakes (even with a script to follow) than the automation does once it's set up correctly, so it can give a sense of assurance that the release process will work as expected. But as with most of the low-level Jupyter libraries, nbclient is used widely in production code chains, so having a speed bump that requires some thought isn't a bad thing either.

choldgraf commented 4 years ago

In our case, I would say that using the GitHub action has reduced the number of mistakes associated with the release process, largely because the environment in which the package gets built is now deterministic and consistent (e.g. if two different people cut a release, the process is exactly the same and independent of either person's local environment). The other benefit is that you can make that job run only after tests pass on CI/CD, so you at least know that you're not pushing buggy code. I know all of those are things we should do manually each time, but in my experience, in practice people either take short-cuts or releases happen much less frequently.