fastai / nbdev

Create delightful software with Jupyter Notebooks
https://nbdev.fast.ai/
Apache License 2.0
4.93k stars 488 forks source link

CI fails when multiple NBs write to same module #435

Closed GilesStrong closed 3 years ago

GilesStrong commented 3 years ago

Hi, I think I've found a bug during CI due to the # AUTOGENERATED! DO NOT EDIT! File to edit:... warning in modules being updated multiple times (or in an inconsistent order) when multiple notebooks write to the same module. This then gets caught as a change by nbdev_diff_nbs during CI:

Run if [ -n "$(nbdev_diff_nbs)" ]; then echo -e "$(nbdev_diff_nbs)" "!!! Detected difference between the notebooks and the library"; false; fi
  if [ -n "$(nbdev_diff_nbs)" ]; then echo -e "$(nbdev_diff_nbs)" "!!! Detected difference between the notebooks and the library"; false; fi
  shell: /bin/bash -e {0}
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.6.13/x64
diff -ru /tmp/tmpcx95vd8w/classifier.py /tmp/tmpdcz9c5nv/classifier.py
--- /tmp/tmpcx95vd8w/classifier.py  2021-02-24 13:27:15.000000000 +0000
+++ /tmp/tmpdcz9c5nv/classifier.py  2021-02-24 13:29:12.000000000 +0000
@@ -1,4 +1,4 @@
-# AUTOGENERATED! DO NOT EDIT! File to edit: nbs/02_feature_selection.ipynb (unless otherwise specified).
+# AUTOGENERATED! DO NOT EDIT! File to edit: nbs/05_feature_interdependence.ipynb (unless otherwise specified).

 __all__ = ['optimise_classifier_rf', 'evaluate_classifier_preds', 'plot_rf_feat_importance', 'plot_partials',
            'filtered_feats', 'ignore_feats']
diff -ru /tmp/tmpcx95vd8w/data.py /tmp/tmpdcz9c5nv/data.py
--- /tmp/tmpcx95vd8w/data.py    2021-02-24 13:27:15.000000000 +0000
+++ /tmp/tmpdcz9c5nv/data.py    2021-02-24 13:29:13.000000000 +0000
@@ -1,4 +1,4 @@
-# AUTOGENERATED! DO NOT EDIT! File to edit: nbs/00_data.ipynb (unless otherwise specified).
+# AUTOGENERATED! DO NOT EDIT! File to edit: nbs/09_regression_data.ipynb (unless otherwise specified).

 __all__ = ['PATH', 'get_data', 'get_df', 'split_df', 'cont_feats', 'create_data', 'reg_targ_feats', 'reg_cont_feats',
            'preproc_reg_inputs', 'preproc_reg_targets', 'create_reg_data']
diff -ru /tmp/tmpcx95vd8w/inference.py /tmp/tmpdcz9c5nv/inference.py
--- /tmp/tmpcx95vd8w/inference.py   2021-02-24 13:27:15.000000000 +0000
+++ /tmp/tmpdcz9c5nv/inference.py   2021-02-24 13:29:13.000000000 +0000
@@ -1,4 +1,4 @@
-# AUTOGENERATED! DO NOT EDIT! File to edit: nbs/12_dnn_on_systs.ipynb (unless otherwise specified).
+# AUTOGENERATED! DO NOT EDIT! File to edit: nbs/11_rf_on_systs.ipynb (unless otherwise specified).

 __all__ = ['get_syst_dfs', 'plot_systs', 'run_inference']
  !!! Detected difference between the notebooks and the library
Error: Process completed with exit code 1.

Strangely, though, I'm not able to reproduce this locally, where running nbdev_diff_nbs returns no error. The starting values of the AUTOGENERATED warnings are the same as those written in my modules, so at some point they get changed.

Which notebook gets listed in the warning also seems to change between modules; sometimes it is the first notebook to write to the module, and other times it is the last. However when I locally build the library the order is always opposite to when they are recreated during CI (note how nbs/02_feature_selection.ipynb and nbs/00_data.ipynb get changed for nbs/05_feature_interdependence.ipynb and nbs/09_regression_data.ipynb, i.e. swapping "first to write" for "last to write", but nbs/12_dnn_on_systs.ipynb gets swapped for nbs/11_rf_on_systs.ipynb, i.e. "last to write" for "first to write").

I assume that nbdev_diff_nbs works by re-exporting the notebooks to a seperate area and then comparing the resulting files with the existing ones to check for differences, so could the export order change somehow? Locally I'm using MacOS 10.15, but CI is on Ubuntu-latest. Thanks in advance for any help.

hamelsmu commented 3 years ago

👋🏽 @GilesStrong can you share a minimal reproducible example that would be helpful. Can you provide a public repo with just two notebooks that reproduce this problem? Thank you

GilesStrong commented 3 years ago

Sure, will do!

GilesStrong commented 3 years ago

Ok, I've setup a minimal repo that reproduces the problem: https://github.com/GilesStrong/nbdev_test The two notebooks in nbs both write to the same module, but the file listed in the AUTOGENERATED warning changes during CI due to nbdev_diff_nbs, resulting in an error (see https://github.com/GilesStrong/nbdev_test/runs/1986723916?check_suite_focus=true). This happens even if I run nbdev_build_lib on Ubuntu-16 and then run the CI on Ubuntu-16, so I don't think it's architecture dependent. Could there be a difference in the exporting between nbdev_diff_nbs and nbdev_build_lib?

GilesStrong commented 3 years ago

Update on this, I think the problem stems from notebooks that write to the same module not having sequential name IDs:

I'll begin looking into how nbdev_diff_nbs runs.