openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
725 stars 38 forks source link

[REVIEW]: causal-curve: A Python Causal Inference Package to Estimate Causal Dose-Response Curves #2523

Closed whedon closed 4 years ago

whedon commented 4 years ago

Submitting author: @ronikobrosly (Roni Kobrosly) Repository: https://github.com/ronikobrosly/causal-curve Version: v0.3.7 Editor: @oliviaguest Reviewer: @cmparlettpelleriti, @tomfaulkenberry, @alexjonesphd Archive: 10.5281/zenodo.3996524

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/b9387d0b29c387780698f16aa393237e"><img src="https://joss.theoj.org/papers/b9387d0b29c387780698f16aa393237e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/b9387d0b29c387780698f16aa393237e/status.svg)](https://joss.theoj.org/papers/b9387d0b29c387780698f16aa393237e)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@cmparlettpelleriti & @tomfaulkenberry & @alexjonesphd, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @oliviaguest know.

โœจ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest โœจ

Review checklist for @cmparlettpelleriti

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @tomfaulkenberry

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @alexjonesphd

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 4 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @cmparlettpelleriti, @tomfaulkenberry, @alexjonesphd it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews ๐Ÿ˜ฟ

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 4 years ago
Reference check summary:

OK DOIs

- 10.1177/0962280209340213 is OK
- 10.2202/1557-4679.1181 is OK
- 10.1037/a0020761 is OK

MISSING DOIs

- None

INVALID DOIs

- None
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

oliviaguest commented 4 years ago

Hi everybody! @ronikobrosly, @cmparlettpelleriti, @tomfaulkenberry, @alexjonesphd โ€” welcome to where the review itself is going to be.

Any questions please ask me. Any very code-related issues use the repo of the code and open an issue and link to it from here. All other issues, feedback, questions for me or the author, please use this issue. Anything you need, check above first as there is a lot of into up there, but also please feel free to ask me. ๐Ÿ˜ธ

Thank you all again and I hope this is a nice experience for us all! ๐ŸŒผ ๐ŸŒธ

ronikobrosly commented 4 years ago

Just wanted to say that I appreciate everyone's help with this, especially during these times when everyone's bandwidth is limited.

tomfaulkenberry commented 4 years ago

I have completed my review, and I ran into no issues. The software paper clearly describes the audience and intended use of the software, and the software installed perfectly fine on my machine (2019 Macbook Air running Catalina, Python version 3.8 installed via Homebrew). All tutorials in the documentation were clear and I was able to recreate the examples on my own).

Overall, I can conclude my review now with a positive endorsement...this is a very nice piece of software and is perfectly within the scope of JOSS.

Thanks for asking me to review.

Tom Faulkenberry Tarleton State University

oliviaguest commented 4 years ago

@cmparlettpelleriti & @alexjonesphd โ˜บ๏ธ Absolutely no pressure, but when you get a chance can you give us a rough ETA for when you will take a look at this?

alexjonesphd commented 4 years ago

I am on it right now! :D

alexjonesphd commented 4 years ago

Hi all,

I installed the package fine via pip on a clean conda environment (3.7; Mac OS Mojave), got all the dependencies, but was seeing some import errors surrounding XGBoost, possibly due to OpenMP runtime. I've raised an issue on the code repo -[https://github.com/ronikobrosly/causal-curve/issues/15].

I've no experience of working with homebrew but my own poking around leads me to believe OpenMP which XGBoost needs may require install via that, which is a tricky step to get the package running.

ronikobrosly commented 4 years ago

Hi @alexjonesphd , thanks for raising this. The XGBoost package does require OpenMP: https://xgboost.readthedocs.io/en/latest/build.html#building-on-osx

I'm of two minds on this. On one hand XGBoost is a really nice ML package that is pretty widely used, but on the other hand maybe better to stick to a ML approach here that doesn't have that requirement you raised. I guess there are a couple options for this:

1) I could leave XGBoost in for now and specifically mention the OpenMP requirement and any other requirements for XGBoost in the documentation. Right now I don't mention these XGBoost requirements. 2) I could replace XGBoost with another package's gradient boosting implementation. Maybe I could use sklearn's implementation.

What do you think?

alexjonesphd commented 4 years ago

Hey @ronikobrosly, no problem, sorry to raise the issue, the package looks great and was excited to play with it!

My own take would be as follows, though feel free to disagree:

ronikobrosly commented 4 years ago

It's no problem at all to raise this @alexjonesphd . I hear you. I'll go ahead and try swapping out XGBoost for sklearn's boosting algorithm.

Is this completely blocking your installation and the rest of your review? If it is, I'll try to get to this ASAP (but it might be next weekend).

If you're able to install this and complete the review, is it okay if I hold off on revisions until I get the other reviews back? Just so I can tackle other comments all at once.

alexjonesphd commented 4 years ago

Afraid so @ronikobrosly - I got the XGBoost error on line 1, e.g.

from causal_curve import GPS throws the error. I don't know how much the package relies on XGBoost, I can tweak the code to prevent it importing and review the materials if required. Otherwise, happy to wait, and maybe @cmparlettpelleriti has other points!

ronikobrosly commented 4 years ago

Ok in that case no worries @alexjonesphd , I'll swap out XGBoost soon and will keep you updated ๐Ÿ‘

ronikobrosly commented 4 years ago

@alexjonesphd , I just merged a PR that removes XGBoost as a dependency. We're now using sklearn's GradientBoostingClassifier and GradientBoostingRegressor. It's a tad slower but the results look basically the same. It should also be easier to install this now ๐Ÿ‘ ๐Ÿ‘

Feel free to pip install the latest version (now it's 0.3.4)

alexjonesphd commented 4 years ago

That was fast @ronikobrosly! The no-rest-until-code-is-fixed bug, I know it well! Just clean installed it and importing GPS worked fine. I'll go through the examples tomorrow (in the UK!). Thanks for fixing so quickly, though sorry to hear its a little slower!

alexjonesphd commented 4 years ago

Thanks again for editing the codebase so quickly. I've now had a chance to work with the package and it seems great. The API is nicely modelled on sklearn so the instantiating a class and calling fit was second nature. All the examples worked well for me, I was able to reproduce them easily.

Perhaps my only comment would be expansion of the software paper or documentation to make it a little more accessible to those who don't know causal inference and perhaps why they would want to use the package. I'll freely admit I don't know fully what is being computed here but the codebase works well and is easy to use. It may be that this is beyond the scope of the journal and that's fine, but a little introduction to causal inference and why the package can help with that might make it more accessible. Otherwise, this is great!

ronikobrosly commented 4 years ago

Thanks for the kind words @alexjonesphd ! Sure thing, I can add a bit more text in the paper and/or documentation about causal inference and why they would want to use this package. I'll add this in after I hear back from @cmparlettpelleriti .

ronikobrosly commented 4 years ago

Hi @oliviaguest ๐Ÿ‘‹ , I havent heard back from @cmparlettpelleriti . What would you recommend?

chelseaparlett commented 4 years ago

hello! Sorry for the delay, I'm slowly working my way through the checklist ๐Ÿ™‚

Get Outlook for iOShttps://aka.ms/o0ukef


From: Roni Kobrosly notifications@github.com Sent: Monday, August 10, 2020 5:38:19 AM To: openjournals/joss-reviews joss-reviews@noreply.github.com Cc: Parlett, Chelsea (Student) parlett@chapman.edu; Mention mention@noreply.github.com Subject: Re: [openjournals/joss-reviews] [REVIEW]: causal-curve: A Python Causal Inference Package to Estimate Causal Dose-Response Curves (#2523)

External Message

Hi @oliviaguesthttps://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Foliviaguest&data=02%7C01%7Cparlett%40chapman.edu%7C3fa799b3ac1840e08afc08d83d2a43fa%7C809929af2d2545bf9837089eb9cfbd01%7C1%7C0%7C637326599032965329&sdata=wJT5OI6UbUDsR%2FgfWfqxrP5OZch35fe8Xao5ywgCbh0%3D&reserved=0 ๐Ÿ‘‹ , I havent heard back from @cmparlettpelleritihttps://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fcmparlettpelleriti&data=02%7C01%7Cparlett%40chapman.edu%7C3fa799b3ac1840e08afc08d83d2a43fa%7C809929af2d2545bf9837089eb9cfbd01%7C1%7C0%7C637326599032965329&sdata=y9vEGcc8TbipzlMbkMqf6%2F9WwDtzsaHisQSNeNfSQrQ%3D&reserved=0 . What would you recommend?

โ€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fopenjournals%2Fjoss-reviews%2Fissues%2F2523%23issuecomment-671329175&data=02%7C01%7Cparlett%40chapman.edu%7C3fa799b3ac1840e08afc08d83d2a43fa%7C809929af2d2545bf9837089eb9cfbd01%7C1%7C0%7C637326599032975326&sdata=YOqiWnzmy8xMeZNPNi3LsFSU09T%2F%2Fcm%2FxR7ii7%2B7JV8%3D&reserved=0, or unsubscribehttps://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAHQAMWRZYVEEQ6U3SOK24L3R77S3XANCNFSM4PI262RA&data=02%7C01%7Cparlett%40chapman.edu%7C3fa799b3ac1840e08afc08d83d2a43fa%7C809929af2d2545bf9837089eb9cfbd01%7C1%7C0%7C637326599032975326&sdata=RWX6jijxwAYwL9H3kql2VntHNHAALWeGidRktyRTwe0%3D&reserved=0.

NOTE: This email originated from outside Chapmanโ€™s network. Do not click links or open attachments unless you recognize the sender and know content is safe.

oliviaguest commented 4 years ago

@ronikobrosly feel free to organise how you want to revise stuff any way you like. It can be either before or after @cmparlettpelleriti gets back to you. โœจ

ronikobrosly commented 4 years ago

Hi @alexjonesphd , I went ahead and revised the online documentation. There is now an introduction section and I tried to make it a gentle intro to experimentation, interpreting the causal curve, etc.

alexjonesphd commented 4 years ago

Thanks @ronikobrosly that looks great. Nice work! As before very happy to endorse and hope this gets some serious use by the Python community.

oliviaguest commented 4 years ago

@whedon re-invite @cmparlettpelleriti as reviewer

whedon commented 4 years ago

OK, the reviewer has been re-invited.

@cmparlettpelleriti please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations

chelseaparlett commented 4 years ago

Thanks for your patience waiting for my review, I thought this was a beautifully organized paper and repository. As someone who doesn't do a ton of causal modeling, I still found it relatively easy to follow. You got me excited to do more causal modeling!

Some minor fixes:

In the main example file: Under Distributions of Scaled Test Scores the x axis labels might be incorrect (they still say blood lead, but I think theyโ€™re distributions of test scores) I got an error when running the code out of the box. I've attached a screenshot here. It's likely my fault, but after installing using pip, downloading the data and running the entire notebook, I got the error so I imagine others could too.

In Mediation_example.rst it should be โ€œdeveloped by Imai et al.โ€

I really enjoyed your comment about mirroring sklearn syntax, I think that's smart and will reduce cognitive load for people who are coming to this from other ML applications. My one larger suggestion is optional, but as someone who doesn't deeply know all the concepts you're implementing, linking out to other resources that more completely explain the math of the models would be so useful. If there are any open source books, or great blogs you can include here, I think that would help new learners a lot.

This is super cool, I very much enjoyed reviewing it!

Screen Shot 2020-08-16 at 5 47 55 PM
ronikobrosly commented 4 years ago

Thanks so much for the review and the kind words @cmparlettpelleriti ๐Ÿ˜„ ! Ok I'll look into your comments ASAP and let you know here how I fixed them.

ronikobrosly commented 4 years ago

@cmparlettpelleriti , quick question for you: Would you mind telling me what version of scipy you're using? Maybe the easiest way to find out would be to type pip freeze in the terminal within your virtual environment. Or you could open your notebook and try running the following code:

import scipy
scipy.__version__
ronikobrosly commented 4 years ago

The reason I'm asking @cmparlettpelleriti is I think that older versions of scipy give that error. I created a fresh virtual environment, pip installed requirements (including the latest scipy), ran notebook, and didn't see the error. I'm running version 1.5.2.

I'd rather not pin the versions of the dependencies in the requirements.txt file but maybe what I could do is specify in the documentation that scipy >= 1.5.2 is required? What do you think?

I should say I could also be completely wrong here and maybe it's another package causing this. This is my first guess though.

chelseaparlett commented 4 years ago

@ronikobrosly great suggestion! I was running 1.4.1, I upgraded to 1.5.2 and now got this error. Perhaps a different package that I need to update (I'm not on my normal work computer so things may be out of date; full list copied below)? Sorry about this, it seems like this is more user error than a problem with your code.

I totally agree that you don't want to specify a specific version, I think your idea is wonderful but I'll leave the final decision up to you, I think you know better what your users will need.

absl-py==0.9.0 alabaster==0.7.12 appdirs==1.4.4 appnope==0.1.0 arabic-reshaper==2.0.15 astor==0.8.1 astunparse==1.6.3 attrs==19.3.0 Babel==2.8.0 backcall==0.1.0 black==19.10b0 bleach==3.1.1 causal-curve==0.3.3 certifi==2019.11.28 cffi==1.14.0 chardet==3.0.4 click==7.1.2 coverage==5.2.1 cryptography==2.9.2 cycler==0.10.0 decorator==4.4.1 defusedxml==0.6.0 descartes==1.1.0 docutils==0.16 dukpy==0.2.2 entrypoints==0.3 esprima==4.0.1 et-xmlfile==1.0.1 freetype-py==2.1.0.post1 future==0.18.2 gast==0.3.3 gevent==20.4.0 gitdb==4.0.4 GitPython==3.1.1 glfw==1.11.0 greenlet==0.4.15 grpcio==1.27.2 h5py==2.10.0 idna==2.9 imageio==2.8.0 imageio-ffmpeg==0.4.1 imagesize==1.2.0 ipykernel==5.1.4 ipython==7.12.0 ipython-genutils==0.2.0 javascripthon==0.11 jdcal==1.4.1 jedi==0.16.0 Jinja2==2.11.1 joblib==0.14.1 json-tricks==3.15.2 json5==0.9.1 jsonschema==3.2.0 jupyter-client==6.0.0 jupyter-core==4.6.3 jupyterlab==1.2.6 jupyterlab-server==1.0.6 Keras==2.3.1 Keras-Applications==1.0.8 Keras-Preprocessing==1.1.0 kiwisolver==1.1.0 macropy3==1.1.0b2 Markdown==3.2.1 MarkupSafe==1.1.1 matplotlib==3.1.3 mistune==0.8.4 mizani==0.6.0 more-itertools==8.4.0 moviepy==1.0.2 msgpack==1.0.0 msgpack-numpy==0.4.5 nbconvert==5.6.1 nbformat==5.0.4 notebook==6.0.3 numexpr==2.7.1 numpy==1.18.1 numpydoc==1.1.0 opencv-python==4.2.0.34 openpyxl==3.0.3 packaging==20.4 palettable==3.3.0 pandas==1.0.1 pandocfilters==1.4.2 parso==0.6.1 pathspec==0.8.0 patsy==0.5.1 pexpect==4.8.0 pickleshare==0.7.5 Pillow==7.1.2 plotnine==0.6.0 pluggy==0.13.1 proglog==0.1.9 progressbar2==3.51.4 prometheus-client==0.7.1 prompt-toolkit==3.0.3 protobuf==3.11.3 psutil==5.7.0 PsychoPy==2020.1.2 ptyprocess==0.6.0 py==1.9.0 pycparser==2.20 pygam==0.8.0 pyglet==1.5.4 Pygments==2.5.2 pyobjc-core==6.2 pyobjc-framework-Cocoa==6.2 pyobjc-framework-Quartz==6.2 PyOpenGL==3.1.5 pyOpenSSL==19.1.0 pyosf==1.0.5 pyparsing==2.4.6 PyQt5==5.14.2 PyQt5-sip==12.7.2 pyrsistent==0.15.7 pyserial==3.4 pytest==5.4.3 python-bidi==0.4.2 python-dateutil==2.8.1 python-gitlab==2.2.0 python-utils==2.4.0 pytz==2019.3 PyYAML==5.3 pyzmq==18.1.1 questplus==2019.4 regex==2020.7.14 requests==2.23.0 scikit-learn==0.22.1 scipy==1.5.2 seaborn==0.10.1 Send2Trash==1.5.0 six==1.14.0 smmap==3.0.2 snowballstemmer==2.0.0 sounddevice==0.3.15 SoundFile==0.10.3.post1 Sphinx==3.1.2 sphinxcontrib-applehelp==1.0.2 sphinxcontrib-devhelp==1.0.2 sphinxcontrib-htmlhelp==1.0.3 sphinxcontrib-jsmath==1.0.1 sphinxcontrib-qthelp==1.0.3 sphinxcontrib-serializinghtml==1.1.4 statsmodels==0.11.1 tables==3.6.1 tensorboard==1.12.2 tensorflow==1.12.0 termcolor==1.1.0 terminado==0.8.3 testpath==0.4.4 toml==0.10.1 tornado==6.0.3 tqdm==4.45.0 traitlets==4.3.3 typed-ast==1.4.1 urllib3==1.25.9 wcwidth==0.1.8 webencodings==0.5.1 Werkzeug==1.0.0 wxPython==4.1.0 xarray==0.15.1 xgboost==1.1.1 xlrd==1.2.0

Screen Shot 2020-08-18 at 10 39 05 AM
ronikobrosly commented 4 years ago

@cmparlettpelleriti whoops I forgot to mention this yesterday: I actually got this Expect x to not have duplicates too yesterday. This is because of a real bug in the code (this line needs to be fixed: https://github.com/ronikobrosly/causal-curve/blob/93f9219b41e3174a14b96a0ebb271da730b92c56/causal_curve/mediation.py#L520). I fixed it locally and it seems to work but I haven't pushed anything to the repo yet. I'll let you know when I finish up the rest of the changes you asked for (should be soon).

So basically, you uncovered a bug ๐Ÿ™ƒ thanks!

chelseaparlett commented 4 years ago

Thanks @ronikobrosly, glad to hear it was something you already knew about!! :)

ronikobrosly commented 4 years ago

Ok @cmparlettpelleriti , I just pushed those changes. If you upgrade your version of causal-curve to version 0.3.7, the notebook should work for you all the way through. Could you verify that for you me?

Also, I made the misc changes you provided:

chelseaparlett commented 4 years ago

@ronikobrosly Everything looks wonderful, and the notebook now works perfectly after updating to 0.3.7 !

ronikobrosly commented 4 years ago

Great! Thanks helping improve this @cmparlettpelleriti .

Hi @oliviaguest , I believe this means I've addressed the comments of these three reviewers. Is there anything I should do next?

oliviaguest commented 4 years ago

@whedon check references

whedon commented 4 years ago
Reference check summary:

OK DOIs

- 10.1177/0962280209340213 is OK
- 10.2202/1557-4679.1181 is OK
- 10.1037/a0020761 is OK

MISSING DOIs

- None

INVALID DOIs

- None
oliviaguest commented 4 years ago

@ronikobrosly yes, please deposit the code on zenodo or figshare and post the DOI here.

oliviaguest commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

ronikobrosly commented 4 years ago

Sure thing @oliviaguest , here you go: https://zenodo.org/badge/256017107.svg.

That's DOI 10.5281/zenodo.3996524

oliviaguest commented 4 years ago

@whedon set 10.5281/zenodo.3996524 as archive

whedon commented 4 years ago

OK. 10.5281/zenodo.3996524 is the archive.

oliviaguest commented 4 years ago

@ronikobrosly can you make the title of the repo on zenodo the same as the title here, please?

oliviaguest commented 4 years ago

@whedon set v0.3.2 as version

whedon commented 4 years ago

OK. v0.3.2 is the version.

oliviaguest commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

oliviaguest commented 4 years ago

@whedon accept

whedon commented 4 years ago
Attempting dry run of processing paper acceptance...