tschm closed this issue 1 year ago
hi @tschm, that sounds like a good feature idea.
It sounds like it should work something like this:
pytest --nbmake
will ignore everything beyond that cell.
How does that sound?
Some q's for you:
I have hundreds of notebooks. Unfortunately, we can't enforce any structure; we give our researchers freedom here, and their development skills cover a wide range. The notebooks accompany ideas in quantitative trading. I would recommend studying a bit what this ipynb project is doing exactly. Annotating cells should be avoided, and sometimes imports can take place very late... By "beyond the point" I didn't really mean a geographic location in the notebook. I just want to avoid executing the notebooks (some of them run for 2h+). My goal is only to scan the notebooks and make sure all the imported packages are in the virtual environment.
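Roughly, the scan described above can be sketched like this (a minimal illustration using only the standard library; the helper names imported_modules and missing_modules are made up here, not part of any tool):

```python
import ast
import importlib.util

def imported_modules(nb: dict) -> set[str]:
    """Collect top-level module names imported in a notebook's code cells."""
    mods = set()
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        # notebook JSON stores cell source as a list of lines
        src = "".join(cell["source"])
        try:
            tree = ast.parse(src)
        except SyntaxError:
            continue  # e.g. cells containing %%magics or !shell commands
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                mods.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                mods.add(node.module.split(".")[0])
    return mods

def missing_modules(nb: dict) -> set[str]:
    """Imported modules that cannot be found in the current environment."""
    return {m for m in imported_modules(nb) if importlib.util.find_spec(m) is None}

# tiny in-memory notebook for illustration
nb = {"cells": [{"cell_type": "code",
                 "source": ["import json\n", "import not_a_real_pkg\n"]}]}
print(missing_modules(nb))  # {'not_a_real_pkg'}
```

This only inspects the source, so a 2h+ notebook is scanned in milliseconds, at the cost of missing imports hidden behind exec or conditional logic.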
I see, thanks for clarifying (never came across that package before).
So you have a basic implementation of this feature with your verify function, it seems.
Presumably, the main issue is the lack of pytest integration, so you can't use
pytest-xdist to parallelise runs.
Please let me know if I've misread your thinking.
Reckon we should have a flag that runs tests in import mode only? e.g.
pytest --nbmake --ipynb-import
# Only imports the notebook with importlib.import_module(f'ipynb.fs.defs.{file}')
Yes, you phrased it so much better than I ever could :-). Thank you. I have used nbmake before and it's a great tool, well maintained and documented. I assume this kind of functionality is a tiny step for you.
I have placed my little script in a container and integrated it into our standard ci/cd pipeline. But the route via pytest and nbmake would be my first choice...
There are a few odd bits in the current ipynb-based setup. I need an empty __init__.py file next to my notebooks to open the route towards making them a package. It would also run the class definitions. However, code like Base = declarative_base() would not be executed unless I change it to BASE = declarative_base(), since ipynb's definitions mode only keeps imports, function/class definitions, and ALL-CAPS assignments.
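For illustration, that definitions-only filter can be approximated with a short ast pass (my sketch of how I understand ipynb's rule, not its actual code; the function name defs_only is made up):

```python
import ast

def defs_only(source: str) -> str:
    """Keep only imports, function/class definitions, and ALL-CAPS assignments,
    approximating ipynb.fs.defs' filtering rule as I understand it."""
    kept = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.Import, ast.ImportFrom,
                             ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kept.append(node)
        elif isinstance(node, ast.Assign) and all(
            isinstance(t, ast.Name) and t.id.isupper() for t in node.targets
        ):
            kept.append(node)  # ALL-CAPS "constants" survive
    return "\n".join(ast.unparse(n) for n in kept)

src = "import os\nBase = object()\nBASE = object()\ndef f():\n    return 1\n"
print(defs_only(src))  # the Base assignment is dropped; BASE survives
```

This is exactly why the lowercase Base = declarative_base() never runs under ipynb while BASE = declarative_base() does.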
awesome -- I'll put together a minimal feature next week for you to try out.
BTW: Do you have production systems depending on importing these notebooks using the ipynb
library?
It may be worth checking out alternatives that are more recently developed e.g. https://github.com/deathbeds/importnb/issues because ipynb has not had a commit for 5 years+
Exactly, I need a fresher and more robust alternative, hence my interest in nbmake. Notebooks haven't really made it into production here (yet), but that doesn't mean they shouldn't be robust and tested. Of course the import test is only a first, crude hurdle. Happy to act as a guinea pig.
I have also noticed that ipynb wouldn't like %%time or !pip install ...
I've released v1.3.5 trying something slightly different to what we discussed.
You can now run notebooks and only report ModuleNotFoundErrors
pip install -U nbmake==1.3.5
Import errors can occur anywhere in a program so I'd rather we test imports more empirically than analytically.
Setting a cell timeout also lets us skip long running cells which are unlikely to hide imports at the end anyway.
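Conceptually, the error filtering works like this toy sketch (this is not nbmake's actual implementation, just the principle: run the cells and count only missing modules as failures):

```python
def find_import_errors(cells: list[str]) -> list[str]:
    """Execute each cell; record only ModuleNotFoundErrors, ignore other failures."""
    missing = []
    ns: dict = {}
    for src in cells:
        try:
            exec(src, ns)
        except ModuleNotFoundError as e:
            missing.append(e.name)
        except Exception:
            pass  # other failures don't count as import errors
    return missing

cells = ["import json", "1 / 0", "import not_a_real_pkg"]
print(find_import_errors(cells))  # ['not_a_real_pkg']
```

Because cells actually run, imports that are constructed dynamically or appear late in the notebook are still caught, which is what "testing imports empirically rather than analytically" means here.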
Try it out like this if possible, please let me know how it goes!
pytest --nbmake --nbmake-find-import-errors --nbmake-timeout=20
Next steps:
My first experiments are all very promising. Thank you so much for your work on this problem. I have already identified a few notebooks importing libs they don't have in their underlying environment. At the same time your solution is minimally invasive. No need to alter the metadata of cells.
great, thanks for collaborating on this @tschm -- feel free to open more issues as needed.
I want to iterate over tons of notebooks without executing them beyond the point of importing their dependencies (e.g. raise an error if the author is importing a package that is not in the environment).
Currently I have:
Please find the ipynb package here: https://pypi.org/project/ipynb/