adamjstewart closed 2 years ago
Hi Adam, this makes sense as an ask.
Do you have a preference as to how you would like to specify these hidden cells?
E.g. we could (a) inject some expressions into metadata tags, or (b) reference another notebook on the filesystem on include.
The former sounds simpler to me.
Ok, I'm having a think about this one for now.
Increasingly, I can see that it's necessary for nbmake users to customise execution in order to mock dependencies/config and assert things.
As a result, I'm considering exposing the nbclient instance so that you can execute Python code during the test.
From your perspective, this may mean you tag a cell with my_module.set_epochs so you can invoke a Python function to run this code.
Input welcome!
I'm not familiar with nbclient, so I can't offer much of an opinion on whether this is the right thing to do. Is this akin to a hidden cell that runs only during testing, or more like an alternative cell that runs instead of the one seen by users?
We would write test code outside of the ipynb file that you are testing (e.g. in a python script or another notebook).
My assumption is that your users may want to run the ipynb file after seeing the docs, therefore it should not contain test code.
Yes, the ipynb file should be able to run normally without the test code.
@adamjstewart I have released v1.3.3a1, which allows you to mock variables after they are defined in a cell.
To do so, please add cell metadata to the cell which defines the variables. Nbmake will apply your mocks after the cell succeeds.
Example:
{
  "cell_type": "code",
  "execution_count": null,
  "metadata": {
    "nbmake": {
      "mock": {
        "x": 2,
        "y": "fish",
        "z": {
          "x": 42
        }
      }
    }
  },
  "outputs": [],
  "source": [
    "x = 5\n",
    "y = 'y'"
  ]
}
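To make the behaviour described above concrete, here is a minimal sketch of what "apply your mocks after the cell succeeds" means for the example cell. It only imitates the effect with plain `exec()`. The helper `run_cell_with_mocks` is hypothetical and is not how nbmake is implemented.

```python
# Illustrative only: imitates the described behaviour with exec();
# run_cell_with_mocks is a hypothetical helper, not part of nbmake.
cell = {
    "cell_type": "code",
    "metadata": {"nbmake": {"mock": {"x": 2, "y": "fish", "z": {"x": 42}}}},
    "source": ["x = 5\n", "y = 'y'"],
}


def run_cell_with_mocks(cell: dict, namespace: dict) -> dict:
    """Execute the cell source, then overwrite variables with the mocked values."""
    # Run the cell exactly as a normal user would see it.
    exec("".join(cell["source"]), namespace)
    # Only after the cell has run without error are the mocks applied.
    mocks = cell.get("metadata", {}).get("nbmake", {}).get("mock", {})
    namespace.update(mocks)
    return namespace


namespace = run_cell_with_mocks(cell, {})
print(namespace["x"], namespace["y"], namespace["z"])
```

In this sketch, later cells would see `x == 2` and `y == "fish"` rather than the values assigned in the cell source, and `z` would exist even though the cell never defines it.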
Please provide feedback! It will determine if we continue in this direction.
Will test this out when I get a chance!
Awesome, I've added docs to the README and made this feature available in 1.3.3.
Is your feature request related to a problem? Please describe.
Many notebooks are extremely time intensive to run. In machine learning, notebooks may involve downloading very large datasets or training a model over hundreds of epochs. A single notebook may take hours to run, but we would still like to test these notebooks quickly in CI.
Describe the solution you'd like
nbmake already supports skipping cells; it would be interesting to be able to add hidden cells that are executed only by nbmake. With this, I could have a normal cell like:
that is executed by all users, then a hidden cell containing:
that is only executed by nbmake. This would keep testing time down while keeping the notebook as simple as possible.
Hidden cells definitely aren't the only way to implement this. For my use case, I really just need to change the values of a couple of variables. If nbmake offered a way to do this in a configuration file, where it automatically replaced variables with these values, that would also suffice. Instead of a configuration file, I would be fine with including this metadata in the notebook itself.
My goal here is that the notebook appears as simple as possible and doesn't include any visible code that is specific to nbmake testing. Any solution that offers this is equally valid to me, and I definitely welcome alternative proposals.
Describe alternatives you've considered
Some alternatives I've come across:
num_epochs = os.environ.get('NUM_EPOCHS', 100)
(ugly, doesn't work locally)
Additional context
Some examples of these alternatives in the wild: