treebeardtech / nbmake

📝 Pytest plugin for testing notebooks
https://pypi.org/project/nbmake/
Apache License 2.0

Support for hidden cells executed only at test time #72

Closed adamjstewart closed 2 years ago

adamjstewart commented 2 years ago

Is your feature request related to a problem? Please describe.

Many notebooks are extremely time intensive to run. In machine learning, notebooks may involve downloading very large datasets or training a model over hundreds of epochs. A single notebook may take hours to run, but we would still like to test these notebooks quickly in CI.

Describe the solution you'd like

nbmake already supports skipping cells. It would be useful to also be able to add hidden cells that are executed only by nbmake. With this, I could have a normal cell like:

download = True
num_epochs = 100

that is executed by all users, then a hidden cell containing:

download = False
num_epochs = 1

that is only executed by nbmake. This would keep testing time down while keeping the notebook as simple as possible.

Hidden cells definitely aren't the only way to implement this. For my use case, I really just need to change the values of a couple of variables. If nbmake offered a way to do this in a configuration file, where it automatically replaced variables with these values, that would also suffice. Instead of a configuration file, I would be fine with including this metadata in the notebook itself.

My goal here is that the notebook appears as simple as possible and doesn't include any visible code that is specific to nbmake testing. Any solution that offers this is equally valid to me, and I definitely welcome alternative proposals.

Describe alternatives you've considered

Some alternatives I've come across:

Additional context

Some examples of these alternatives in the wild:

alex-treebeard commented 2 years ago

Hi Adam, this makes sense as an ask.

Do you have a preference as to how you would like to specify these hidden cells?

E.g. we could (a) inject some expressions into metadata tags, or (b) reference another notebook on the filesystem to include.

adamjstewart commented 2 years ago

The former sounds simpler to me.

alex-treebeard commented 2 years ago

Ok, I'm having a think about this one for now.

Increasingly, I can see that it's necessary for nbmake users to customise execution in order to mock dependencies/config and assert things.

As a result, I'm considering exposing the nbclient instance so that you can execute python code during the test.

From your perspective, this may mean you tag a cell with my_module.set_epochs so you can invoke a python function to run this code.
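The tag-to-function idea above could be sketched roughly as follows. This is only an illustration of resolving a dotted tag to a callable, not something nbmake is known to ship: `resolve_hook` is a hypothetical helper, and `my_module.set_epochs` from the comment is replaced by a standard-library function so the snippet runs on its own.

```python
import importlib
from typing import Any, Callable


def resolve_hook(tag: str) -> Callable[..., Any]:
    """Resolve a dotted tag such as 'my_module.set_epochs' to a callable.

    Hypothetical helper: the part before the last dot is treated as the
    module path, the part after it as the attribute to look up.
    """
    module_name, _, attr = tag.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {tag!r}")
    module = importlib.import_module(module_name)
    hook = getattr(module, attr)
    if not callable(hook):
        raise TypeError(f"{tag!r} does not name a callable")
    return hook


# my_module is not a real module here, so a stdlib function stands in for it.
sqrt = resolve_hook("math.sqrt")
result = sqrt(9.0)
```

A test runner built this way would look up the tag on each cell and call the resolved function at the appropriate point, keeping the test logic outside the notebook.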

Input welcome!

adamjstewart commented 2 years ago

I'm not familiar with nbclient, so I can't offer much of an opinion on whether this is the right thing to do. Is this akin to a hidden cell that runs only during testing, or more like an alternative cell that runs instead of the one seen by users?

alex-treebeard commented 2 years ago

We would write test code outside of the ipynb file that you are testing (e.g. in a python script or another notebook).

My assumption is that your users may want to run the ipynb file after seeing the docs, therefore it should not contain test code.

adamjstewart commented 2 years ago

Yes, the ipynb file should be able to run normally without the test code.

alex-treebeard commented 2 years ago

@adamjstewart I have released v1.3.3a1 which allows you to mock variables after they are defined in a cell.

To do so, please add cell metadata to the cell which defines the variables. Nbmake will apply your mocks after the cell succeeds.

Example:

{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
  "nbmake": {
   "mock": {
    "x": 2,
    "y": "fish",
    "z": {
     "x": 42
    }
   }
  }
 },
 "outputs": [],
 "source": [
  "x = 5\n",
  "y = 'y'"
 ]
}
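To make the ordering concrete, here is a minimal model of "run the cell, then apply the mocks", using the values from the example. It is not how nbmake is implemented (nbmake executes cells in a Jupyter kernel); `run_cell_with_mocks` is a hypothetical function that only mimics the described behaviour with plain `exec`.

```python
from typing import Any, Dict


def run_cell_with_mocks(
    source: str, mocks: Dict[str, Any], namespace: Dict[str, Any]
) -> None:
    """Execute a cell's source, then overwrite variables with mock values.

    Hypothetical model only: if the cell raises, the exception propagates
    and the mocks are never applied, mirroring "after the cell succeeds".
    """
    exec(source, namespace)
    namespace.update(mocks)


namespace: Dict[str, Any] = {}
cell_source = "x = 5\ny = 'y'"
mocks = {"x": 2, "y": "fish", "z": {"x": 42}}

run_cell_with_mocks(cell_source, mocks, namespace)
# Later cells would now see the mocked values instead of x = 5 and y = 'y'.
```

Note that in this model a mocked name such as `z` does not have to be defined by the cell at all; whether nbmake behaves the same way for undefined names is not stated in this thread.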

Please provide feedback! It will determine if we continue in this direction.

adamjstewart commented 2 years ago

Will test this out when I get a chance!

alex-treebeard commented 2 years ago

Awesome, I've added docs in the readme and made this feature available in 1.3.3
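For the original use case (`download` and `num_epochs`), the metadata can be added by hand in a notebook editor or with a small script. The sketch below uses only the standard `json` module and an in-memory notebook dictionary; `add_mock` is a hypothetical helper, and the metadata layout simply follows the example earlier in the thread.

```python
import json
from typing import Any, Dict


def add_mock(notebook: Dict[str, Any], cell_index: int, mocks: Dict[str, Any]) -> None:
    """Attach nbmake mock metadata to one code cell of a notebook dict."""
    cell = notebook["cells"][cell_index]
    if cell.get("cell_type") != "code":
        raise ValueError("mocks only make sense on code cells")
    nbmake_meta = cell.setdefault("metadata", {}).setdefault("nbmake", {})
    nbmake_meta.setdefault("mock", {}).update(mocks)


# A tiny stand-in notebook; a real one would be loaded with json.load().
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["download = True\n", "num_epochs = 100"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

add_mock(notebook, 0, {"download": False, "num_epochs": 1})
serialized = json.dumps(notebook, indent=1)  # ready to write back to the .ipynb file
```

Users who open the notebook still see only `download = True` and `num_epochs = 100`, since cell metadata is not shown in the rendered cell body.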