CausalAIBook / MetricsMLNotebooks

Notebooks for Applied Causal Inference Powered by ML and AI
MIT License
74 stars 37 forks source link

Transform R notebooks to Rmd; Add checks for R notebooks; Small mod to checks in Python notebooks #10

Closed vsyrgkanis closed 2 months ago

vsyrgkanis commented 2 months ago

This PR makes the following changes:

  1. Adds a check-and-transform github action, which: 1) automatically transforms the .irnb files to .Rmd files and commits the .Rmd files, 2) runs a linting check using lintr package, to check that the code adheres to code style, 3) transforms the .Rmd files to .R scripts and runs them to check for failures and errors, 4) reports any errors in linting or execution. This check will happen every Sunday. Moreover, it will be triggered for every PR or push to master. When triggered for a PR or push to master, only the checks for the folders that contain an altered .irnb file will be run. The rest of the folders will be skipped (for faster testing). During the scheduled execution all the checks for all folders will be ran.
  2. Corrects all mistakes to the .irnb notebooks in all folders so that they pass all the linting checks and the execution checks. This includes variable name changes, spaces after commas, long lines, and other formatting issues.
  3. Removed any non .irnb R script into a deprecated folder
  4. Moved the BERT R notebook to an "in-progress" folder, as it contains cells that are not yet finished and hence will fail the tests.
  5. Makes a small modification to the python notebook tests, so that there is only one test script (and not one for the scheduled occurrence and one for the PR trigger) and such that only folders with changed notebooks are executed during a PR or a push event.
OliverSchacht commented 2 months ago

Hi @vsyrgkanis thanks for all of these changes. I am in the process of reviewing them. Currently I checked all notebooks AC, CM and PM1 and PM2.

I checked all generated Rmds in branch gen-Rmd. The transformation from irnb to Rmd seems to work very well. I was able to execute and render them. The results seem to be consistent with everything before.

Also the liniting improves the coding style nicely. I found one notebook where the linting had unwanted results:

The linting issues I could correct in the .irnb files so that next time the Rmds are generated they are fine too. However, they might be linted back next time the lint action runs.

Maybe a small remark, if you render the Rmd to html, the title is "An R Markdown document converted from ..." which is maybe not super pretty but also not really an issue.

I will check all upcoming notebooks as soon as they finished rendering on my machine. Is there anything else you would like me to look at?

Best, Oliver

OliverSchacht commented 2 months ago

Sorry for the reverted commit. Got confused since execution failed both locally and on colab and only then saw that keras < 3.0.0 is a dependency here.

I was able to render all the Notebooks now and they look fine and are consistent with the irnb.

vsyrgkanis commented 2 months ago

Sorry for the reverted commit. Got confused since execution failed both locally and on colab and only then saw that keras < 3.0.0 is a dependency here.

I was able to render all the Notebooks now and they look fine and are consistent with the irnb.

Yes there was an issue with our code with Keras3 and beyond, so had to add this dependency. Otherwise we need to change a lot in our code, if we want to support keras3.

vsyrgkanis commented 2 months ago

Also @OliverSchacht @MartinSpindler one protocol is that approvers, just add comments and request to a PR and not directly make changes and pushes!