ReviewNB / support

Issues and feature requests for ReviewNB
https://reviewnb.com

CI Pipeline + test framework for Jupyter Notebooks #19

Open amit1rrr opened 5 years ago

amit1rrr commented 5 years ago

Problem

Jupyter Notebooks have neither a dedicated test framework nor a CI pipeline, which results in:

Users tend to jump around Notebook cells while developing, which leaves the kernel in an unpredictable state. Because Notebooks are prone to this kind of muddled state, testing and continuous integration matter even more for them than for ordinary code.

Solution

Here are the bare minimum things we need:

All of the above is possible today by tinkering with separate tools such as doctest, unittest, papermill, etc. We need to combine them, fill the gaps, and build an open source testing framework dedicated to Jupyter Notebooks.

As a last step, this testing framework can then be integrated into the ReviewNB interface to run tests after each commit is pushed, for users who wish to enable a CI pipeline on their repos.
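To make the idea concrete, here is a rough sketch of one check such a framework could perform: run every code cell in order, from a fresh state, and report the first failure. It is an illustration only, not how any existing tool works. The function name `run_notebook_cells` is hypothetical, it uses only the Python standard library, and it assumes plain Python cells (IPython magics such as `%matplotlib` would fail under `exec`).

```python
import json


def run_notebook_cells(notebook_json: str) -> list:
    """Execute code cells top to bottom in a single fresh namespace.

    Hypothetical helper for illustration. Returns a list of
    (cell_index, error) tuples; an empty list means every cell ran cleanly.
    """
    notebook = json.loads(notebook_json)
    # A fresh namespace stands in for a freshly restarted kernel.
    namespace = {"__name__": "__notebook__"}
    failures = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb stores source as a list of lines or a string
            source = "".join(source)
        try:
            exec(compile(source, f"<cell {index}>", "exec"), namespace)
        except Exception as error:
            failures.append((index, repr(error)))
            break  # later cells depend on earlier state, so stop at the first failure
    return failures
```

A notebook that only works when cells are run out of order would fail this check, which is exactly the muddled-state problem described above.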


Feel free to upvote/downvote the issue to indicate whether you think this is a useful feature. I also welcome additional questions, comments, and discussion on the issue.

amit1rrr commented 5 years ago

The test framework is now available here: https://github.com/ReviewNB/treon

nickponvert commented 5 years ago

Hi, any prediction for when this will be a part of ReviewNB? I am trying to decide whether to set up a different CI system to run treon or wait for the integration. Thanks! Awesome work!

amit1rrr commented 5 years ago

@nickponvert I would recommend setting up your CI to use treon for now, since realistically I'm a few months away from delivering a CI pipeline in ReviewNB. There are still some open questions in delivering this feature:

  1. How should we enable CI for all repositories (including free open source ones) when the compute cost would be significant?
  2. How do we detect and prevent abuse of our resources as part of CI (e.g. someone mining bitcoins in their notebooks)?

I've considered making it a premium-only feature or BYOC (Bring Your Own Compute), but I'm not particularly happy with either of these. One viable path is to restrict free open source repositories to one build at a time (no parallel builds), with a timeout and a cool-off period.
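As a rough illustration of that last option, the sketch below gates builds per repository: one build at a time, a timeout after which a build counts as killed, and a cool-off period before the next build may start. Everything here is hypothetical. The class name `BuildGate`, its methods, and the default durations are assumptions made for the example, not part of ReviewNB or treon.

```python
import time


class BuildGate:
    """Admit at most one build per repo, with a timeout and a cool-off period.

    Hypothetical sketch; the default durations are placeholders, not proposed values.
    """

    def __init__(self, timeout_s=600.0, cooloff_s=300.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.cooloff_s = cooloff_s
        self.clock = clock    # injectable so the policy can be tested without sleeping
        self._started = {}    # repo -> start time of the running build
        self._finished = {}   # repo -> end time of the most recent build

    def try_start(self, repo):
        """Return True and record the build if the repo may start one now."""
        now = self.clock()
        started = self._started.get(repo)
        if started is not None:
            if now - started < self.timeout_s:
                return False  # a build is already running for this repo
            # The running build exceeded its timeout: treat it as killed at the deadline.
            self.finish(repo, at=started + self.timeout_s)
        finished = self._finished.get(repo)
        if finished is not None and now - finished < self.cooloff_s:
            return False      # still inside the cool-off window
        self._started[repo] = now
        return True

    def finish(self, repo, at=None):
        """Mark the repo's running build as done, starting its cool-off window."""
        if self._started.pop(repo, None) is not None:
            self._finished[repo] = self.clock() if at is None else at
```

A gate like this limits how much compute a single free repository can consume, but it does not by itself detect abusive workloads; a miner could still use the full timeout on every permitted build.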

If you or anyone from the community has input on this, I'm all ears!

amit1rrr commented 5 years ago

Saturncloud is running into the exact same problem here: https://discourse.jupyter.org/t/bitcoin-mining-abuse-security/1367

There are some solid recommendations for best practises in that thread.