pymc-labs / CausalPy

A Python package for causal inference in quasi-experimental settings
https://causalpy.readthedocs.io
Apache License 2.0
914 stars · 65 forks

Increase docstring coverage and add doctests #232

Closed jpreszler closed 1 year ago

jpreszler commented 1 year ago

This addresses issue #129:

  1. All methods in the API documentation have docstrings with working examples. Documentation was checked locally.
  2. Interrogate reports 97% coverage; the only things missing are file-level docstrings on some test files.
  3. The interrogate failure threshold was raised to 85%.
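As an illustration of the style (using a hypothetical helper, not part of CausalPy's API), a docstring with a working doctest example looks like this, along with a programmatic way to execute it:

```python
import doctest

def difference_in_means(treated, control):
    """Return the difference between the group means (hypothetical helper).

    Examples
    --------
    >>> difference_in_means([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
    2.0
    """
    return sum(treated) / len(treated) - sum(control) / len(control)

# Execute the docstring example, roughly as `pytest --doctest-modules` would.
runner = doctest.DocTestRunner(verbose=False)
for test in doctest.DocTestFinder().find(
    difference_in_means, module=False,
    globs={"difference_in_means": difference_in_means},
):
    runner.run(test)
print(runner.failures)  # 0 when the example's output matches
```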

Potential future work:

Main items for review:

codecov[bot] commented 1 year ago

Codecov Report

Merging #232 (c80d78e) into main (d7a12cb) will increase coverage by 0.02%. The diff coverage is 85.71%.

@@            Coverage Diff             @@
##             main     #232      +/-   ##
==========================================
+ Coverage   73.86%   73.89%   +0.02%     
==========================================
  Files          19       19              
  Lines        1148     1149       +1     
==========================================
+ Hits          848      849       +1     
  Misses        300      300              
Files Changed | Coverage Δ
--- | ---
causalpy/data/datasets.py | 92.30% <ø> (ø)
causalpy/data/simulate_data.py | 0.00% <ø> (ø)
causalpy/plot_utils.py | 60.00% <ø> (ø)
causalpy/pymc_models.py | 100.00% <ø> (ø)
causalpy/skl_experiments.py | 66.86% <ø> (ø)
causalpy/skl_models.py | 100.00% <ø> (ø)
causalpy/tests/conftest.py | 100.00% <ø> (ø)
causalpy/tests/test_data_loading.py | 100.00% <ø> (ø)
causalpy/tests/test_input_validation.py | 100.00% <ø> (ø)
causalpy/tests/test_integration_pymc_examples.py | 100.00% <ø> (ø)
... and 7 more
twiecki commented 1 year ago

Wow, this is amazing -- much appreciated @jpreszler!

jpreszler commented 1 year ago

I've cleaned up the example output and added doctests to the CI workflow; all examples pass locally. I've also added examples for the new IV model and experiment added in #213.

This is ready for a new review @drbenvincent when you have a chance.

drbenvincent commented 1 year ago

Hi @jpreszler. Thanks for the updates. I'll try to carve out some time (I now have an 11-day-old son!) to review properly. But in the meantime I triggered the remote checks and it looks like we've got a failure. The test output looks like it could be just a missing import of statsmodels.

jpreszler commented 1 year ago

@drbenvincent, it looks like statsmodels wasn't being installed in the remote environment, so I added it to the dependencies. That should fix the test failure.

Congratulations on the baby boy!

jpreszler commented 1 year ago

Looks like there's a little instability in the summary output that I'm looking into fixing.

jpreszler commented 1 year ago

I should have all the instability addressed; the doctests pass in the remote environment on my fork as well as locally.
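A common source of that kind of output instability is randomness in the examples; one generic way to stabilize a doctest (a sketch of the general pattern, not the actual fix made in this PR) is to seed the generator inside the example itself:

```python
import random

def draw_samples(n):
    """Draw n pseudo-random values.

    Seeding inside the example makes the printed output deterministic,
    so the doctest compares against a stable expected value.

    >>> random.seed(42)
    >>> [round(x, 2) for x in draw_samples(3)]
    [0.64, 0.03, 0.28]
    """
    return [random.random() for _ in range(n)]
```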

jpreszler commented 1 year ago

@drbenvincent Thanks for the helpful comments. I've made the improvements so this is ready for another look when you have time.

jpreszler commented 1 year ago

There's definitely some redundancy in the examples, and with doctest that redundancy adds a lot of time to the test run.

I've reduced the redundancy, moved all meaningful examples (like summary() calls) into a single example per class, and removed plot and a few other low-value examples.

jpreszler commented 1 year ago

Those issues were not just local to you. I've fixed them and looked for other problems, but didn't spot much besides the same issue in WeightedSumFitter as in LinearRegression.

The tests have also passed on my fork after some small adjustments.

drbenvincent commented 1 year ago

Quick question as I've never used doctest before...

As far as I can tell, doctests are run with pytest --doctest-modules causalpy/. At the moment I can only see these being triggered with the remote tests. Do you think it makes sense to add brief instructions to CONTRIBUTING.md, either under "Pull request checklist" or "Building the documentation locally", telling people to run the doctests locally (and how to do that) before submitting a PR?

jpreszler commented 1 year ago

Good call. The same command can run all doctests locally, but I also added a make doctest command to the makefile. I've also added some details to the contributing doc about running all doctests or individual ones. This might be too much for the PR checklist.

This is my first venture into doctests, but the system is pretty straightforward. One thing worth noting: the +NUMBER option used by a number of the tests is a pytest extension of doctest. If you run doctest directly (not through pytest), those tests will fail.
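For illustration, the closest portable alternative to pytest's +NUMBER is the stdlib ELLIPSIS directive, which works under both plain doctest and pytest (a standalone sketch, not CausalPy code):

```python
import doctest

def add(a, b):
    """Add two floats.

    ELLIPSIS is a stdlib directive, so this example passes under plain
    doctest as well as pytest; `# doctest: +NUMBER` is pytest-only.

    >>> add(0.1, 0.2)  # doctest: +ELLIPSIS
    0.3...
    """
    return a + b

# Run the docstring example with the stdlib doctest machinery.
runner = doctest.DocTestRunner(verbose=False)
for test in doctest.DocTestFinder().find(add, module=False, globs={"add": add}):
    runner.run(test)
print(runner.failures)  # 0: the truncated expectation matches via ELLIPSIS
```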

drbenvincent commented 1 year ago

This is great. When running the doctests I just noticed that it produces new files, ancova_data.csv and regression_discontinuity.csv. We should ideally either not produce those files, or have some clean-up.

jpreszler commented 1 year ago

That's from a few of the doctests for the simulated data. I've skipped the lines that write out the CSVs, but left the examples showing how to do so.
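The stdlib mechanism for this is the +SKIP directive: the line stays visible in the rendered docs but is never executed, so no file is written. A sketch with a hypothetical save_csv helper (not CausalPy's actual simulated-data code):

```python
import doctest

def simulate_ancova(n):
    """Generate a tiny simulated dataset as a list of rows.

    >>> rows = simulate_ancova(3)
    >>> len(rows)
    3

    Writing the data out is shown for reference but skipped, so running
    the doctest leaves no file behind:

    >>> save_csv(rows, "ancova_data.csv")  # doctest: +SKIP
    """
    return [{"group": i % 2, "pre": float(i), "post": float(2 * i)} for i in range(n)]

runner = doctest.DocTestRunner(verbose=False)
for test in doctest.DocTestFinder().find(
    simulate_ancova, module=False, globs={"simulate_ancova": simulate_ancova}
):
    runner.run(test)
print(runner.failures)  # 0: the skipped line never runs, so save_csv need not exist
```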