openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
720 stars 38 forks source link

[REVIEW]: haldensify: Highly adaptive lasso conditional density estimation in R #4522

Closed editorialbot closed 2 years ago

editorialbot commented 2 years ago

Submitting author: !--author-handle-->@nhejazi<!--end-author-handle-- (Nima Hejazi) Repository: https://github.com/nhejazi/haldensify Branch with paper.md (empty if default branch): Version: v0.2.5 Editor: !--editor-->@majensen<!--end-editor-- Reviewers: @turgeonmaxime, @adibender Archive: 10.5281/zenodo.7089147

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/e54c2bf1daca38bdcda505be275ecb65"><img src="https://joss.theoj.org/papers/e54c2bf1daca38bdcda505be275ecb65/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/e54c2bf1daca38bdcda505be275ecb65/status.svg)](https://joss.theoj.org/papers/e54c2bf1daca38bdcda505be275ecb65)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@turgeonmaxime & @adibender, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @majensen know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @turgeonmaxime

📝 Checklist for @adibender

editorialbot commented 2 years ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 2 years ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.09 s (831.3 files/s, 119663.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                            28            676             97           4012
R                               22            270           1051           1601
Markdown                         7            170              0            759
TeX                              1             44              0            482
CSS                              3             98             52            442
JavaScript                       5             65             37            277
Rmd                              2             90            379             99
YAML                             5             15              4             95
XML                              1              0              0             84
make                             1              9              0             20
SVG                              1              0              1             11
-------------------------------------------------------------------------------
SUM:                            76           1437           1621           7882
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 2 years ago

Wordcount for paper.md is 2123

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1007/978-3-319-65304-4_14 is OK
- 10.1111/j.1541-0420.2011.01685.x is OK
- 10.2202/1557-4679.1356 is OK
- 10.1111/rssb.12362 is OK
- 10.1007/978-3-319-65304-4_6 is OK
- 10.1515/ijb-2015-0097 is OK
- 10.5281/zenodo.3558313 is OK
- 10.21105/joss.02526 is OK
- 10.1109/dsaa.2016.93 is OK
- 10.1515/jci-2014-0022 is OK
- 10.1111/j.1541-0420.2005.00377.x is OK
- 10.1162/neco.2008.10-07-628 is OK
- 10.1016/s0167-9473(00)00046-3 is OK
- 10.1007/978-1-4471-2097-1_162 is OK
- 10.21105/joss.00512 is OK
- 10.2202/1544-6115.1036 is OK
- 10.1016/j.stamet.2005.02.003 is OK
- 10.1515/ijb-2015-0097 is OK
- 10.1111/biom.13375 is OK
- 10.1002/sim.5907 is OK
- 10.1017/cbo9780511810725.016 is OK
- 10.3150/bj/1093265631 is OK
- 10.1093/biomet/85.3.619 is OK
- 10.1007/bf00117832 is OK
- 10.5281/zenodo.4070042 is OK
- 10.21105/joss.02447 is OK
- 10.2202/1557-4679.1043 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

turgeonmaxime commented 2 years ago

Review checklist for @turgeonmaxime

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

adibender commented 2 years ago

Review checklist for @adibender

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

nhejazi commented 2 years ago

Thanks very much for agreeing to review this submission, @adibender and @turgeonmaxime. No rush at all, but don’t hesitate to let me know if I can clarify anything about the package or paper

turgeonmaxime commented 2 years ago

@editorialbot commands

editorialbot commented 2 years ago

Hello @turgeonmaxime, here are the things you can ask me to do:


# List all available commands
@editorialbot commands

# Get a list of all editors's GitHub handles
@editorialbot list editors

# Check the references of the paper for missing DOIs
@editorialbot check references

# Perform checks on the repository
@editorialbot check repository

# Adds a checklist for the reviewer using this command
@editorialbot generate my checklist

# Set a value for branch
@editorialbot set joss-paper as branch

# Generates the pdf paper
@editorialbot generate pdf

# Get a link to the complete list of reviewers
@editorialbot list reviewers
turgeonmaxime commented 2 years ago

@majensen I finished reviewing this submission, I was just wondering if there's anything else I need to do (I don't see any command that says "completed review" or something of the sort).

majensen commented 2 years ago

Thanks very much @turgeonmaxime - that's all we need at this point.

majensen commented 2 years ago

@adibender how is your review coming along?

majensen commented 2 years ago

@adibender are you still able to help us with this review? Thanks

adibender commented 2 years ago

@majensen Hi, sorry. Yes, will finish the review until 12th of august.

adibender commented 2 years ago

I've left an issue regarding the Example Usage vignette here: https://github.com/nhejazi/haldensify/issues/37 Otherwise good to go from my side

majensen commented 2 years ago

@adibender - can you indicate if @nhejazi 's addition at nhejazi/haldensify#38 meets your approval? If so, we can move towards acceptance. thanks!

nhejazi commented 2 years ago

Hi @adibender — just another ping here about the outstanding issue for this open review. If you could comment on whether you’d like to see further edits or otherwise mark the issue as resolved, that would help us move forward.

adibender commented 2 years ago

Hi, @majensen I've left some additional optional comments. Up to @nhejazi Good to go from my side.

nhejazi commented 2 years ago

Thanks @adibender, I've replied to your comments and closed the relevant issue. I'll merge the PR related to this review shortly. @majensen, once that PR is merged, just let me know about next steps.

nhejazi commented 2 years ago

Update that the PR (https://github.com/nhejazi/haldensify/pull/38) has been merged after successfully passing CI checks.

majensen commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

majensen commented 2 years ago

Thanks @nhejazi - I will have an editorial look at the paper itself, which may result in some comments/fixes. Will try to accomplish by next week. @adibender @turgeonmaxime your work is very much appreciated!

majensen commented 2 years ago

@nhejazi - I think the paper as such is very well written. What I am struggling with is, will a prospective user of the package be likely to read it? This may not be a big concern, given the popularity of the package. In the interest of making a better paper, though, I want to suggest making the internal paragraphs (i.e., not Statement of Need or Scope) 30-50% of their current length. This would force some choices about what previous work you want to review, and how much mathematical and statistical commentary you want to make.

Mathematical statistics is very far from my expertise. However, I can see that there is a flow in the presentation that starts with a problem to solve and leads logically to the solution you have created. Here is my precis of the paper (think of me as an NLP algorithm, not a statistician:)

Conditional density estimation is a fundamental task across a number of areas of statistical analysis practice. It is frequently involved in the handling of nuisance parameters. Typically, parametric estimation of conditional densities for these parameters leads to poor quality model input, affecting quality of downstream estimation of the interesting parameters. Therefore, it is desirable to have a non-parametric method of conditional density estimation that can be rationally obtained using the data at hand. However, there are few implemented options for analysts to use.

haldensify is an implementation of such an estimator, with desirable theoretical properties that make it highly suitable, e.g., for problems involving estimation of causal effect parameters in certain continuous treatment problems. In particular, effects of (time?) shifts in vaccine delivery on vaccine efficacy are amenable to estimation with our method.

The non-parametric method of conditional density estimation of van der Laan and colleagues is the foundation of our implementation. Briefly, the method converts the density problem into a repeated-measures, hazard probability problem. The present work’s key advance is implementing this method using the highly adaptive lasso algorithm to perform the regression step on the transformed problem. The HAL algorithm has statistical properties that also enable us to rationally implement inverse probability weighted estimators based on the density estimation and (in our broad example class of problems) estimate causal effects.

The haldensify package provides haldensify, the conditional density estimation algorithm. It also implements ipw_shift, which uses undersmoothing to select an appropriate generalized propensity score estimator (produced by haldensify) to obtain an inverse probability weighted estimator of causal effect with desirable properties. ipw_shift is another approach to the problem solved by txshift, which implements a so-called doubly robust estimator of the GPS.

This does a couple of things: it focuses on the implementation, leaves the details of the mathematical justifications and properties to references, and also puts your previous practical application at the beginning.

If this reduction is close to a correct interpretation, maybe it can help form a basis for a shorter paper. Let me know what you think. MAJ

nhejazi commented 2 years ago

Thanks for weighing in on this and for the summary provided @majensen — I think what you’ve come up with is on point. I agree also that the readability — and brevity — of the paper is very important, much more so than the inclusion of possibly extraneous details. I’ll go through the paper and make cuts where I can later this week.

nhejazi commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

nhejazi commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

nhejazi commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

nhejazi commented 2 years ago

Hi @majensen, an update to note that I've revised the paper draft to bring it more into line with your recommendations, which I agree with even more after having re-read the paper myself. My take is that the previous version of the paper included too much detail in the sections on methodology (on conditional density estimation and inverse probability weighted estimation); these same two sections have now been rewritten, removing much of the technical detail, and combined into a single section. Note that there now is some intentional overlap/redundancy that I hope reinforces concepts and the roles of particular user-facing functions within the haldensify package for the new reader. When you have a chance, please review and let me know what you think.

majensen commented 2 years ago

Thanks @nhejazi - I will review by end of the week

majensen commented 2 years ago

@nhejazi I really appreciate your work on this. I think it reads very well and highlights the advance this algorithm makes within the context of the theory, whereas before I felt it was somewhat buried. It should now lead interested reader on to downloading and trying the software!

The final task is for you to create an open archive of the code repository, using Zenodo, FigShare or similar open service. A short tutorial on this is in this comment: https://github.com/openjournals/joss-reviews/issues/1839#issuecomment-560048729

Please respond back to this thread with the DOI of your archive. I will then add some metadata here and use the ancient spell summoning the Associate Editor in Chief!

nhejazi commented 2 years ago

Thanks for confirming @majensen and glad to hear the updated paper reads better (I agree that the presentation of the key points is much improved in this draft).

Regarding the repository archive, the package/repository already had already been archived on Zenodo with a perpetual DOI (for all versions) of https://doi.org/10.5281/zenodo.3698329, so I've just created a new release of the package (https://github.com/nhejazi/haldensify/releases/tag/v0.2.5) and archived that updated version at https://doi.org/10.5281/zenodo.7089147. I think this should take care of the requirements for archiving the repository with a DOI.

majensen commented 2 years ago

@editorialbot set doi.org/10.5281/zenodo.7089147 as archive

editorialbot commented 2 years ago

Done! Archive is now doi.org/10.5281/zenodo.7089147

majensen commented 2 years ago

@editorialbot set v0.2.5 as version

editorialbot commented 2 years ago

Done! version is now v0.2.5

majensen commented 2 years ago

@editorialbot recommend-accept

editorialbot commented 2 years ago
Attempting dry run of processing paper acceptance...
editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1007/978-3-319-65304-4_14 is OK
- 10.1111/j.1541-0420.2011.01685.x is OK
- 10.2202/1557-4679.1356 is OK
- 10.1111/rssb.12362 is OK
- 10.1007/978-3-319-65304-4_6 is OK
- 10.1515/ijb-2015-0097 is OK
- 10.5281/zenodo.3558313 is OK
- 10.21105/joss.02526 is OK
- 10.1109/dsaa.2016.93 is OK
- 10.1515/jci-2014-0022 is OK
- 10.1111/j.1541-0420.2005.00377.x is OK
- 10.1162/neco.2008.10-07-628 is OK
- 10.1016/s0167-9473(00)00046-3 is OK
- 10.1007/978-1-4471-2097-1_162 is OK
- 10.21105/joss.00512 is OK
- 10.2202/1544-6115.1036 is OK
- 10.1016/j.stamet.2005.02.003 is OK
- 10.1515/ijb-2015-0097 is OK
- 10.1111/biom.13375 is OK
- 10.1002/sim.5907 is OK
- 10.1017/cbo9780511810725.016 is OK
- 10.3150/bj/1093265631 is OK
- 10.1093/biomet/85.3.619 is OK
- 10.1007/bf00117832 is OK
- 10.5281/zenodo.4070042 is OK
- 10.21105/joss.02447 is OK
- 10.2202/1557-4679.1043 is OK

MISSING DOIs

- 10.1111/biom.13719 may be a valid DOI for title: Nonparametric inverse probability weighted estimators based on the highly adaptive lasso
- 10.1515/ijb-2019-0092 may be a valid DOI for title: Efficient estimation of pathwise differentiable target parameters with the undersmoothed highly adaptive lasso

INVALID DOIs

- None
editorialbot commented 2 years ago

:warning: Error prepararing paper acceptance. The generated XML metadata file is invalid.

ID ref-vdl2017generally already defined
majensen commented 2 years ago

@nhejazi Can you review the above output from the paper build - I believe there are two references you can add ("Missing DOIs"), and it looks like there is a duplicate reference that could be deleted. Thanks!

nhejazi commented 2 years ago

@majensen --- thanks for catching these necessary edits. I've removed the duplicated bibliography entry and added the two missing DOIs (these manuscripts both completed their respective publication processes sometime after this software paper was submitted, so the DOIs were not available back then).

majensen commented 2 years ago

@editorialbot recommend-accept

editorialbot commented 2 years ago
Attempting dry run of processing paper acceptance...
editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1007/978-3-319-65304-4_14 is OK
- 10.1111/j.1541-0420.2011.01685.x is OK
- 10.2202/1557-4679.1356 is OK
- 10.1111/rssb.12362 is OK
- 10.1007/978-3-319-65304-4_6 is OK
- 10.1515/ijb-2015-0097 is OK
- 10.5281/zenodo.3558313 is OK
- 10.21105/joss.02526 is OK
- 10.1109/dsaa.2016.93 is OK
- 10.1515/jci-2014-0022 is OK
- 10.1111/biom.13719 is OK
- 10.1111/j.1541-0420.2005.00377.x is OK
- 10.1162/neco.2008.10-07-628 is OK
- 10.1016/s0167-9473(00)00046-3 is OK
- 10.1007/978-1-4471-2097-1_162 is OK
- 10.21105/joss.00512 is OK
- 10.2202/1544-6115.1036 is OK
- 10.1016/j.stamet.2005.02.003 is OK
- 10.1515/ijb-2019-0092 is OK
- 10.1111/biom.13375 is OK
- 10.1002/sim.5907 is OK
- 10.1017/cbo9780511810725.016 is OK
- 10.3150/bj/1093265631 is OK
- 10.1093/biomet/85.3.619 is OK
- 10.1007/bf00117832 is OK
- 10.5281/zenodo.4070042 is OK
- 10.21105/joss.02447 is OK
- 10.2202/1557-4679.1043 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 2 years ago

:wave: @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof :point_right::page_facing_up: Download article

If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/3547, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept