Origami-Cloudless-AI / TinyMLaaS-2023-winter

Run Hello World of TensorFlow Lite for micro automatically in Docker
https://Origami-TinyML.github.io/tflm_hello_world
Apache License 2.0

Add unit test code coverage for *.ipynb files with nbdev #8

Open doyu opened 1 year ago

doyu commented 1 year ago

https://github.com/Origami-TinyML/tflm_hello_world/blob/main/README.md#sprint-1

Some discussion about how to introduce unit test coverage to show with nbdev, https://forums.fast.ai/t/nbdev-code-coverage-n-tests/73993

doyu commented 1 year ago

https://pete88b.github.io/decision_tree/test_coverage/

doyu commented 1 year ago

Borna's original:

I have recently been exploring the use of nbdev and Jupyter Notebooks. I attempted to add nbdev to my repository, but I am unsure if I have done so correctly.

My fork of the repository can be found here: https://github.com/lifeofborna/tflm_hello_world.

I am now wondering which specific .ipynb files are meant by "Add unit test code coverage for *.ipynb files." I would greatly appreciate any tips or suggestions on how to properly add nbdev to my repository, and how to proceed with adding code coverage for the .ipynb files. Thank you.

https://github.com/Origami-TinyML/tflm_hello_world/issues/35#issue-1556546028

doyu commented 1 year ago

You may want to use a Jupyter notebook in the hello world training section, for example.

lifeofborna commented 1 year ago

I tried to implement this :

https://pete88b.github.io/decision_tree/test_coverage/

I made the testcoverage.py script described in the article and put it in the root of the repository. I created a virtual environment where I pip-installed nbdev.

But when I try to run this testcoverage.py script, it crashes because it doesn't recognize some of the functions the author mentions in his article, such as "split_flags_and_code", even though I have the correct imports and they're recognized.

He says I should use the newest version of nbdev, which I'm quite sure I'm using (nbdev==2.3.9) since I freshly pip-installed it in the virtual environment. https://pypi.org/project/nbdev/

I'm not sure how to proceed from here. I still have time tomorrow to continue working on this, but sadly most of my time today went to setting up the nbdev repo (task 1), since my task depended on it.
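
For reference, a quick sanity check that the virtual environment resolves the nbdev I think it does:

```python
# Confirm which nbdev the environment actually imports.
import nbdev

print(nbdev.__version__)  # expect 2.x, e.g. 2.3.9
print(nbdev.__file__)     # the path should point inside the virtualenv
```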

doyu commented 1 year ago

I'm not sure how to proceed from here. I still have time tomorrow to continue working on this, but sadly most of my time today went to setting up the nbdev repo (task 1), since my task depended on it.

@lifeofborna, you could share what you've done with @nellatuulikki so that she won't need to do the same thing again. Please send a PR for @nellatuulikki's issue first; @nellatuulikki could then verify whether it works or not.

To solve your problem, you may want to do the following:

  1. Create a new repo which has only the minimal files needed to verify unit testing under nbdev (see the sketch at the end of this comment).
  2. Reproduce the problem in this minimal setup.
  3. Send a PR describing what's wrong there.
  4. Ask for help on the Discord with this issue link.

nbdev2 has been released, so pip install should install version 2+ nowadays.

@lifeofborna & @nellatuulikki could work on this together tomorrow.
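
For step 1, the minimal repo really only needs one notebook with a pair of cells roughly like this sketch (the add function is made up for illustration; under nbdev, the #|export cell goes into the package via nbdev_export, and the plain assert cell is the unit test that nbdev_test runs):

```python
# --- notebook cell 1: exported into the package by nbdev_export ---
#|export
def add(a, b):
    "Toy function, made up for illustration."
    return a + b

# --- notebook cell 2: not exported; nbdev_test runs it as a unit test ---
assert add(2, 3) == 5
```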

doyu commented 1 year ago

You could probably use this tutorial example, adding a unit test, to reproduce the problem you have.

https://github.com/fastai/nbdev-hello-world from https://nbdev.fast.ai/tutorials/tutorial.html

BTW, can you find "split_flags_and_code" in the latest source code? Is there any chance it was renamed? You could ask the author about this too if you cannot find out how it was renamed.

lifeofborna commented 1 year ago

Moved this from WIP, since we first need to initialize nbdev correctly in the repo.

nellatuulikki commented 1 year ago

@lifeofborna and I tried to implement unit testing following the https://pete88b.github.io/decision_tree/test_coverage/ instructions against https://github.com/fastai/nbdev-hello-world, but we still didn't manage to import split_flags_and_code from nbdev.

We looked into the source code and didn't find any usage of split_flags_and_code in nbdev.export. However, we looked into the older version of nbdev (https://github.com/fastai/nbdev1/blob/master/nbdev/export.py) and noticed that it has a function called split_flags_and_code. This export.py module has been completely rewritten in the new nbdev (current version of the export module: https://github.com/fastai/nbdev/blob/master/nbdev/export.py).

So the question is: should we keep trying to implement the unit test code coverage report based on pete88b's instructions and find out whether we could use other functions instead of split_flags_and_code (by creating an issue in nbdev), or should we think of another way to implement the coverage report?

doyu commented 1 year ago

Hi @nellatuulikki

split_flags_and_code seems to "split the source of a cell into 2 parts and return (flags, code)". This looks like it just splits 'code' from '# comment' flags???

Can we just copy this function into pete88b's code and make it work?

Under nbdev, a unit test is just code that runs in the Jupyter notebook. I'm not sure whether any other Python unit test coverage tool could be applied easily. Probably just trying to run the above would be the easiest? The missing function itself doesn't seem too big or complicated either?
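
From that docstring, a stand-in might look like this minimal sketch (an assumption based only on the quoted docstring; the real nbdev1 helper takes a cell object and handles nbdev's flag syntax more carefully):

```python
import re

def split_flags_and_code(source):
    """Guess at the nbdev1 helper, from its docstring alone: split a
    cell's source into (flags, code), treating the leading '#'-comment
    lines (e.g. '#hide', '#export') as flags."""
    lines = source.splitlines()
    n = 0
    while n < len(lines) and re.match(r"\s*#", lines[n]):
        n += 1
    return lines[:n], lines[n:]

# split_flags_and_code("#export\ndef f(): pass")
# -> (['#export'], ['def f(): pass'])
```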

doyu commented 1 year ago

And I asked :) https://forums.fast.ai/t/nbdev-code-coverage-n-tests/73993/26

nellatuulikki commented 1 year ago

Thanks for the quick answer. We can try to add the function from nbdev1 to pete88b's code.

doyu commented 1 year ago

I'll also join this debugging tomorrow ;)

nellatuulikki commented 1 year ago

@lifeofborna and I had a look at this https://github.com/pete88b/decision_tree/blob/master/test_nbs.py code, and it seems to require a lot of rewriting. We don't know how long that would take, or even whether it's still possible to use that code.

We are exploring alternative ways to implement unit test code coverage. We found this YouTube video (https://www.youtube.com/watch?v=Pf1ADyUKOrE) and were wondering whether we should try to implement testing following the instructions in it.

lifeofborna commented 1 year ago

And a follow-up question: what is the purpose of testing the Jupyter notebooks? Are we only checking that they run without crashing, or are we conducting more specific unit tests?

I found that a library called nbval would be the best-fitting way to conduct these tests: https://nbval.readthedocs.io/en/latest/. It is built on pytest, so we should be able to add coverage for it as well.
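
As a sketch of what that could look like (assumes pip install nbval pytest-cov; the package name and notebook directory are placeholders):

```python
# Run notebooks as tests with the nbval pytest plugin, collecting
# coverage with pytest-cov. Names below are placeholders.
import pytest

exit_code = pytest.main([
    "--nbval-lax",             # lax mode: only compare outputs of cells
                               # marked with '# NBVAL_CHECK_OUTPUT'
    "--cov=tflm_hello_world",  # hypothetical package to measure coverage on
    "nbs/",                    # directory containing the .ipynb files
])
```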

doyu commented 1 year ago

No, the point is not to test the Jupyter notebooks themselves. As you've seen in the nbdev tutorial, nbdev prompts developers to write code, (unit) tests, and docs at once. What's missing in nbdev is a coverage report for those (unit) tests. The core code is converted into a package by nbdev_export, but the test part isn't in the package. nbdev_test runs all notebooks, so the tests in them are executed as well, but with no coverage report. That's the missing piece: if we follow a modern SW development method like Scrum / Agile, unit test coverage is usually recommended as a KPI. We could skip this and proceed for now if the effort is too much :( Let's discuss what's the best way to proceed!

IIUC, what pete88b did is copy all the tests from the Jupyter notebooks and run them under pytest? https://github.com/pete88b/decision_tree/blob/master/test_nbs.py#L1 If so, the implementation may not be so difficult?
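
If that's right, the mechanics might be as simple as this sketch (it ignores state shared between cells and any nbdev directive other than #export, which the real scripts handle more carefully):

```python
# Sketch: copy the non-exported code cells of a notebook into a
# pytest file, so pytest (and coverage) can see and run them.
import json
from pathlib import Path

def notebook_to_pytest(nb_path, out_dir="generated_tests"):
    nb = json.loads(Path(nb_path).read_text(encoding="utf-8"))
    chunks = []
    for i, cell in enumerate(nb["cells"]):
        if cell["cell_type"] != "code":
            continue
        src = "".join(cell["source"])
        if src.lstrip().startswith(("#|export", "#export")):
            continue  # exported cells live in the package; skip them here
        body = "\n".join("    " + l for l in src.splitlines()) or "    pass"
        chunks.append(f"def test_cell_{i}():\n{body}\n")
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    test_file = out / f"test_{Path(nb_path).stem}.py"
    test_file.write_text("\n".join(chunks), encoding="utf-8")
    return test_file
```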

I didn't get the point of nbval yet.

doyu commented 1 year ago

Here's something similar to what I expected above: https://forums.fast.ai/t/nbdev-code-coverage-n-tests/73993/27

IIUC, nbprocess is the previous name of nbdev2, so the https://github.com/muellerzr/nbprocess-sandbox code is only 7 months old, although you'll still need to update it for nbdev(2).

doyu commented 1 year ago

It looks like, if you can put the removed preprocs somewhere into the latest proc.begin, the rest should work?

lifeofborna commented 1 year ago

Update*

We managed to get muellerzr's sandbox code to run on nbdev2. Currently, it creates a new folder into which it extracts the .ipynb files as Python test files. You can then run !pytest foldername and it will run the tests.

One additional question: do we wish to support unittest as well? Currently we have pytest working. You can see the unittest example in the link below, in the "Using unit test" section. @doyu

https://github.com/muellerzr/nbprocess-sandbox/blob/main/01_export.ipynb

We will next implement it in our main GitHub repository.

doyu commented 1 year ago

Congrats, @lifeofborna & @nellatuulikki! This was one of the big barriers to setting up our CI / CD.

One additional question: do we wish to support unittest as well? Currently we have pytest working. You can see the unittest example in the link below, in the "Using unit test" section. @doyu

Either one of them would be OK. I have no idea which is better, pytest vs unittest. The resulting coverage data would be fed into Codecov, just like the C++ coverage @ArttuLe did. Probably pytest is better? You can quickly google it :D
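
For the Codecov hand-off, whichever runner we pick, the key is an XML coverage report; a sketch (the package and test-folder names are placeholders):

```python
# Produce coverage.xml for Codecov (assumes pip install pytest-cov).
import pytest

pytest.main([
    "--cov=tflm_hello_world",  # placeholder: package whose coverage we measure
    "--cov-report=xml",        # writes coverage.xml, which Codecov ingests
    "generated_tests/",        # placeholder: tests extracted from notebooks
])
```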

We will next implement it in our main GitHub repository.

Cool!