[REVIEW]: ChiRP: Chinese Restaurant Process Mixtures for Regression and Clustering

whedon commented 5 years ago

Submitting author: @stablemarkets (Arman Oganisian) Repository: https://github.com/stablemarkets/ChiRP Version: 1.0.0 Editor: @pjotrp Reviewer: @agisga Archive: 10.5281/zenodo.2591600

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/3b83a0a3f1220f97657a1075b78e480a"><img src="http://joss.theoj.org/papers/3b83a0a3f1220f97657a1075b78e480a/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/3b83a0a3f1220f97657a1075b78e480a/status.svg)](http://joss.theoj.org/papers/3b83a0a3f1220f97657a1075b78e480a)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@agisga, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

Make sure you're logged in to your GitHub account
Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @pjotrp know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @agisga

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: Does the release version given match the GitHub release (1.0.0)?
[x] Authorship: Has the submitting author (@stablemarkets) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

whedon commented 5 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @agisga it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

pjotrp commented 5 years ago

@stablemarkets, we are starting the review in this issue tracker. To expedite the review process, would you mind going through the above list of check boxes and making sure they can be ticked (you can't tick them yourself)? Also check the PDF output carefully. Ping us here when you are done.

stablemarkets commented 5 years ago

Hi @pjotrp. Thank you for working on this.

I've provided responses to several items in the check list below:

General checks

[ ] Repository: Source code is located in the R/ subdirectory.
[ ] License: MIT license is here: LICENSE.md
[ ] Version: See DESCRIPTION file where we list "Version: 1.0.0".
[ ] Authorship: I am the sole author of the code base.

Functionality

[ ] Installation: Installation is documented in the README.md file of the repo. Travis CI is used to confirm builds on Linux and OSX.
[ ] Functionality: This can be partially evaluated by replicating some of the examples in the package's companion site.
[ ] Performance: I don't believe I've made any performance claims.

Documentation

[ ] A statement of need: I feel this is addressed in the paper.md file, but I will leave it to reviewers/editors to judge.
[ ] Installation instructions: Dependency packages are listed in DESCRIPTION. Upon installing ChiRP, R automatically checks for these dependencies and installs them if necessary.
[ ] Example usage: Several usage examples are given on the package's companion site
[ ] Functionality documentation: See "Documentation and Examples" section of the repo README.md for instructions on how to access API help files within R.
[ ] Automated tests: The testthat R package is used for automated testing. This is integrated with Travis CI so that tests are run upon every commit. Coveralls is used to track the coverage of these automated tests. Coverage is currently 92%.
[ ] Community guidelines: Issue reporting and contact info are given in the "Reporting Issues" and "Contact" sections of the repo README.md

pjotrp commented 5 years ago

Thank you. @agisga feel free to start the review. Guidelines are at the top of this page.

agisga commented 5 years ago

Hi!

I have reviewed this submission. Generally it is pretty good. My review is given below.

Summary:

This submission is an R package that implements several Bayesian models from the family of Chinese Restaurant Process (CRP) mixtures. The methods implemented in ChiRP can be used for regression, binary classification, clustering, and related inference tasks. These are very common data analysis tasks that arise in many scientific disciplines, although the author seems to be motivated specifically by biomedical applications. The advantages of the ChiRP models over alternative methods include their nonparametric nature, the ability to return interval estimates, and the ability to obtain posterior distributions for predictions as well as for cluster assignments on training and test data. In particular, unlike many other clustering algorithms, the number of clusters is not specified a priori and is determined automatically from the data (although an initial number of clusters needs to be given to kick off the MCMC sampler). The package website, including the model description and examples, is really informative and great (in addition, the R Shiny tutorial on Dirichlet Processes is just awesome). The R code is well documented and includes automated tests too. Code quality seems to be good (based on a very quick look through each of the source code files). I'm not an expert on the methodology, but after running and tweaking the provided examples, the package code seems to be doing what it's supposed to do. Some minor issues are listed below.
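For readers unfamiliar with the CRP, the way the prior lets the number of clusters grow with the data can be sketched with a short simulation. This is a generic illustration of the standard CRP seating rule, not code from ChiRP; the function name `simulate_crp` and the concentration parameter `alpha` are assumptions introduced here.

```r
# Hypothetical helper (not part of ChiRP): draw cluster labels from a CRP prior.
# Customer i joins existing table k with probability n_k / (i - 1 + alpha)
# and opens a new table with probability alpha / (i - 1 + alpha).
simulate_crp <- function(n, alpha) {
  stopifnot(n >= 1, alpha > 0)
  z <- integer(n)
  z[1] <- 1L                                 # first customer sits at table 1
  for (i in seq_len(n)[-1]) {
    counts <- tabulate(z[seq_len(i - 1)])    # current table sizes n_k
    probs <- c(counts, alpha) / (i - 1 + alpha)
    z[i] <- sample.int(length(probs), size = 1, prob = probs)
  }
  z
}

set.seed(1)
z <- simulate_crp(n = 100, alpha = 1)
length(unique(z))  # number of occupied tables (clusters) in this draw
```

Larger values of `alpha` make new tables more likely, so the number of clusters is not fixed in advance but depends on both `alpha` and the number of observations.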

Potential Issues:

One unchecked check box: "Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?" ~~> No, the majority of cited papers don't have a DOI specified.
There are Acknowledgements, mentioning several individuals, at https://stablemarkets.github.io/ChiRPsite/index.html but not in the paper.
I think that many people who have never heard of the "Chinese Restaurant Process (CRP)" may find this R package very useful, especially for clustering. Therefore, it would be good to cite a canonical reference for CRP mixtures at the very beginning of the paper for readers unfamiliar with this methodology.
There are community guidelines for seeking support and reporting issues, but there is no "contribution" information, which is point 1) of the "Community Guidelines" check box.
I get three warnings from testthat when running the automated tests. Not really an issue, but it would probably be better not to have any warnings. (test_ndp.r:23: warning: Test Init K Errors, test_pdp.r:25: warning: Test Init K Errors, and test_zdp.r:28: warning: Test Init K Errors).
Paper: "ARCHIVE" link broken in the PDF (not sure if that's an issue).
Typos:
- File name ./R/helper_funtions_pdp.R ~~> might want to change to ./R/helper_functions_pdp.R ("functions" spelled with a "c")
- There are a couple of minor spelling mistakes on the project webpage, which can be found with a simple spell check.
- Some of the math notation is not displayed correctly in the bullet point list following "Package implementation details:" at https://stablemarkets.github.io/ChiRPsite/modeldesc.html

None of these are serious issues.

pjotrp commented 5 years ago

Thank you @agisga! @stablemarkets if you can address these points quickly we can publish!

stablemarkets commented 5 years ago

Hi @pjotrp and @agisga. Thank you for your thorough review! Thanks especially for reviewing the companion web site - extremely helpful.

The package repository has been updated with edits that address your feedback (see below for details). I think the paper.md file will need to be recompiled for the edits to be visible?

I added DOIs to all but two references in paper.bib. I could not find DOIs for two of the papers. One is an arXiv paper (currently unpublished), so I think it has no DOI yet. For both papers, URLs are provided. Hope this is okay.
Thanks! I've added an acknowledgements section to the paper mentioning these individuals.
I've added two citations to seminal CRP papers - Ferguson 1973, Blackwell 1973 - to the first paragraph when introducing CRPs.
Contribution instructions have been added to both the ChiRP repository and the companion site. Users can contribute by proposing modifications to the base code or by adding usage examples to the companion site.
testthat warnings: Thanks for catching this! I looked into it. The warning is triggered when someone specifies an init_k with a length greater than 1. In error_checks(), I ran a single check that returns an error if init_k is not numeric, or if init_k is not a scalar (e.g. length 2), or if init_k<0. If init_k is of length 2, an error is indeed returned, so the test passes. However, evaluating init_k<0 yields a length 2 logical vector, while the other two conditions in the check yield length 1 logicals. So R issues a warning saying it will only use the first element of init_k<0 to evaluate the entire check. I fixed this with a nested check: first make sure init_k is numeric and of length 1; then, only if that passes, check that init_k>0. If it's not, an error message is thrown at the user. This passes the automated test without an additional warning, since init_k<0 is evaluated only when we are sure init_k is a scalar.
I'm not sure, but the ARCHIVE link is probably not working since I haven't archived the paper yet. According to the review process, it seems like we only archive after the review is completed?
Thanks for catching these typos: I've renamed the file to helper_functions_pdp.r, fixed the notation rendering on the site, and spell-checked each of the three pages on the companion site.
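The nested check described in the testthat item above can be sketched as follows. This is a minimal illustration, not the package's actual error_checks() code: the function name `check_init_k` and the error messages are hypothetical, and the sketch assumes init_k must be strictly positive.

```r
# Hypothetical stand-in for the init_k validation; not ChiRP's actual code.
check_init_k <- function(init_k) {
  # Step 1: type and length. Both conditions are single TRUE/FALSE values,
  # so the combined condition is always of length 1.
  if (!is.numeric(init_k) || length(init_k) != 1) {
    stop("init_k must be a single numeric value.")
  }
  # Step 2: sign check, reached only once init_k is known to be a scalar,
  # so the comparison never produces a vector of logicals.
  if (init_k <= 0) {
    stop("init_k must be positive.")
  }
  invisible(TRUE)
}

check_init_k(5)  # passes silently

# A length-2 input now stops at step 1 with a plain error and no warning.
msg <- tryCatch(check_init_k(c(2, 3)), error = function(e) conditionMessage(e))
```

Splitting the check in two keeps each `if` condition scalar, which also matters on newer R versions, where a condition of length greater than one is an error rather than a warning.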

pjotrp commented 5 years ago

Excellent. @agisga can you run through those points and confirm you are happy?

agisga commented 5 years ago

@pjotrp

Everything looks good to me!

I don't think there is a new article proof. So, can't check that. But I have looked at all new commits in the ChiRP repo, and also confirmed that the automated tests now run without warnings.

I don't know how strict JOSS is about having DOIs for every article reference, but in my opinion it's fine to have those two DOIs missing, especially because URLs are provided. Thus, I have checked the last checkbox. I leave the final decision up to you.

pjotrp commented 5 years ago

Thank you for the thorough review @agisga. @stablemarkets: to finalize your submission and accept your paper in JOSS, we need two things. First, can you confirm that every reference in your bibliography lists a DOI (if one exists)?

Second, we need you to deposit a copy of your software repository (including any revisions made during the JOSS review process) with a data-archiving service.

To do so:

Create a GitHub release of the current version of your software repository
Deposit that release with Zenodo, figshare, or a similar DOI issuer.
Post a comment here with the DOI for the release.

pjotrp commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

stablemarkets commented 5 years ago

Hi @pjotrp,

I confirm all references in my bibliography have a DOI where available. I've deposited my repository (including revisions discussed here) to Zenodo. Here is the DOI badge:

pjotrp commented 5 years ago

@whedon set 10.5281/zenodo.2591600 as archive

whedon commented 5 years ago

OK. 10.5281/zenodo.2591600 is the archive.

pjotrp commented 5 years ago

@whedon accept

whedon commented 5 years ago

Attempting dry run of processing paper acceptance...

whedon commented 5 years ago

PDF failed to compile for issue #1287 with the following error:

/app/vendor/ruby-2.4.4/lib/ruby/2.4.0/find.rb:43:in `block in find': No such file or directory - tmp/1287 (Errno::ENOENT)
from /app/vendor/ruby-2.4.4/lib/ruby/2.4.0/find.rb:43:in `collect!'
from /app/vendor/ruby-2.4.4/lib/ruby/2.4.0/find.rb:43:in `find'
from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-01ece1d1d135/lib/whedon/processor.rb:57:in `find_paper_paths'
from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-01ece1d1d135/bin/whedon:73:in `compile'
from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-01ece1d1d135/bin/whedon:116:in `<top (required)>'
from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in `load'
from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in `<main>'

whedon commented 5 years ago


OK DOIs

- 10.1007/978-3-319-18968-0 is OK
- 10.1080/10618600.2000.10474879 is OK
- 10.1016/j.jmp.2011.08.004 is OK
- 10.2307/2334940 is OK
- 10.1080/10618600.2012.735624 is OK
- 10.1111/1467-9868.00265 is OK
- 10.1214/aos/1176342360 is OK
- 10.1214/aos/1176342372 is OK

MISSING DOIs

- https://doi.org/10.1145/1015330.1015439 may be missing for title: Dirichlet process mixtures of generalized linear models
- https://doi.org/10.2172/1212177 may be missing for title: A Bayesian Nonparametric Model for Zero-Inflated Outcomes: Prediction, Clustering, and Causal Estimation

INVALID DOIs

- None

pjotrp commented 5 years ago

@arfon can you check why the PDF is failing?

arfon commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

arfon commented 5 years ago

@whedon check references

whedon commented 5 years ago

Attempting to check references...

whedon commented 5 years ago


OK DOIs

- 10.1007/978-3-319-18968-0 is OK
- 10.1080/10618600.2000.10474879 is OK
- 10.1016/j.jmp.2011.08.004 is OK
- 10.2307/2334940 is OK
- 10.1080/10618600.2012.735624 is OK
- 10.1111/1467-9868.00265 is OK
- 10.1214/aos/1176342360 is OK
- 10.1214/aos/1176342372 is OK

MISSING DOIs

- None

INVALID DOIs

- None

pjotrp commented 5 years ago

@whedon accept

whedon commented 5 years ago

Attempting dry run of processing paper acceptance...

whedon commented 5 years ago


OK DOIs

- 10.1007/978-3-319-18968-0 is OK
- 10.1080/10618600.2000.10474879 is OK
- 10.1016/j.jmp.2011.08.004 is OK
- 10.2307/2334940 is OK
- 10.1080/10618600.2012.735624 is OK
- 10.1111/1467-9868.00265 is OK
- 10.1214/aos/1176342360 is OK
- 10.1214/aos/1176342372 is OK

MISSING DOIs

- None

INVALID DOIs

- None

whedon commented 5 years ago

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/575

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/575, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true

pjotrp commented 5 years ago

ping eic @openjournals/joss-eics

arfon commented 5 years ago

@whedon accept deposit=true

whedon commented 5 years ago

Doing it live! Attempting automated processing of paper acceptance...

whedon commented 5 years ago

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/576
Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01287
If everything looks good, then close this review issue.
Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? notify your editorial technical team...

arfon commented 5 years ago

@agisga - many thanks for your review and to @pjotrp for editing this submission ✨

@stablemarkets - your paper is now accepted into JOSS :zap::rocket::boom:

whedon commented 5 years ago

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01287/status.svg)](https://doi.org/10.21105/joss.01287)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01287">
  <img src="http://joss.theoj.org/papers/10.21105/joss.01287/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.01287/status.svg
   :target: https://doi.org/10.21105/joss.01287

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:

Volunteering to review for us sometime in the future. You can add your name to the reviewer list here: http://joss.theoj.org/reviewer-signup.html
Making a small donation to support our running costs here: https://numfocus.salsalabs.org/donate-to-joss
