whedon commented 5 years ago

Submitting author: @kmichael08 (Michał Kuźba)
Repository: https://github.com/ModelOriented/pyCeterisParibus
Version: v0.5.2
Editor: @katyhuff
Reviewer: @janfreyberg, @justinshenk
Archive: 10.5281/zenodo.2667756

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/aad9a21c61c01adebe11bc5bc1ceca92"><img src="http://joss.theoj.org/papers/aad9a21c61c01adebe11bc5bc1ceca92/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/aad9a21c61c01adebe11bc5bc1ceca92/status.svg)](http://joss.theoj.org/papers/aad9a21c61c01adebe11bc5bc1ceca92)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@janfreyberg & @justinshenk, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

Make sure you're logged in to your GitHub account
Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @katyhuff know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @janfreyberg

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: v0.5.2
[x] Authorship: Has the submitting author (@kmichael08) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @justinshenk

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: v0.5.2
[x] Authorship: Has the submitting author (@kmichael08) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

whedon commented 5 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @janfreyberg, it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews

You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

JustinShenk commented 5 years ago

Overview: pyCeterisParibus is a library for explaining machine learning models with ceteris paribus profiles. These are useful for adding to visual story telling and supporting model interpretability. The idea is great, the implementation is clean, and I may use it in some of my projects. Some minor improvements are suggested below.
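For readers new to the idea, a ceteris paribus profile can be sketched in a few lines: take one observation, vary a single feature over a grid while holding every other feature fixed, and record the model's prediction at each grid point. The snippet below is only an illustration of that idea and does not use the pyCeterisParibus API; the function name `ceteris_paribus_profile`, the dictionary-based observation, and the toy scoring model are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Observation = Dict[str, float]


def ceteris_paribus_profile(
    predict: Callable[[Observation], float],
    observation: Observation,
    feature: str,
    grid: Sequence[float],
) -> List[Tuple[float, float]]:
    """Return (feature value, prediction) pairs with all other features held fixed."""
    profile = []
    for value in grid:
        modified = dict(observation)  # copy, so the original observation is untouched
        modified[feature] = value     # change only the feature of interest
        profile.append((value, predict(modified)))
    return profile


def toy_model(passenger: Observation) -> float:
    """Hypothetical scoring model, standing in for any fitted predictor."""
    return 100 - passenger["age"] - 20 * (passenger["pclass"] - 1)


passenger = {"age": 30, "pclass": 1}
age_profile = ceteris_paribus_profile(toy_model, passenger, "age", [10, 50])
```

Plotting the resulting pairs gives the response curve for that one observation, which is the kind of picture the package is designed to produce.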

Installation: Installed without issues via pip and local copy of the source code.

Functionality: I was able to run the example on my Mac, but was not able to load the plot, due to an issue with how file paths are handled on Macs. I have opened a pull request at https://github.com/ModelOriented/pyCeterisParibus/pull/24 fixing this issue on my machine. After this is accepted or otherwise addressed I will consider it completed. I opened an issue (https://github.com/ModelOriented/pyCeterisParibus/issues/23) regarding the scrollbars obscuring the data. This could be fixed by adding additional padding to the bottom of the frame.
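The actual change in the pull request is not reproduced here. As a general illustration of the kind of problem involved, one common way to make a locally saved plot file open reliably across operating systems is to build an absolute path and convert it to a `file://` URL rather than concatenating strings. The helper name `plot_file_url` and the example directory and file names are hypothetical.

```python
import os
from pathlib import Path


def plot_file_url(directory: str, filename: str) -> str:
    """Build a file:// URL for a saved plot (hypothetical helper for illustration)."""
    # abspath normalises the path and makes it absolute, which as_uri() requires
    absolute = Path(os.path.abspath(os.path.join(directory, filename)))
    return absolute.as_uri()


url = plot_file_url("plots", "index.html")
```

A URL built this way can be passed to `webbrowser.open`, which tends to behave more consistently across platforms than a bare relative path.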

Performance: No measure of performance is given, but the model loaded fast on the Titanic dataset.

Documentation: The explanation for how the model works could be improved. For example, in the paper the authors write "For this purpose, methods for sampling and selecting neighbouring observations are implemented along with the Gower's distance [@gower] function. A more detailed description might be found in the package documentation." I was not able to find a description of Gower's distance in the linked Read the Docs pages. Adding details of how the model works would be helpful for people who are not familiar with Gower's distance or how it applies to machine learning models.
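As background for readers unfamiliar with it, Gower's distance averages per-feature dissimilarities: for a numeric feature it is the absolute difference divided by that feature's range in the dataset, and for a categorical feature it is 0 when the values match and 1 otherwise. The sketch below illustrates the unweighted form of that definition and is not the package's implementation; the function name, the dictionary-based rows, and the `ranges` argument are assumptions made for this example.

```python
from typing import Dict, Mapping, Union

Value = Union[int, float, str]


def gower_distance(
    a: Mapping[str, Value],
    b: Mapping[str, Value],
    ranges: Dict[str, float],
) -> float:
    """Unweighted Gower distance between two observations.

    `ranges` maps each numeric feature to (max - min) over the dataset;
    features absent from `ranges` are treated as categorical.
    """
    total = 0.0
    for name in a:
        if name in ranges:
            span = ranges[name]
            # Range-scaled absolute difference; a constant feature contributes 0
            total += abs(a[name] - b[name]) / span if span > 0 else 0.0
        else:
            # Simple mismatch indicator for categorical features
            total += 0.0 if a[name] == b[name] else 1.0
    return total / len(a)


first = {"age": 20, "fare": 10, "sex": "male"}
second = {"age": 60, "fare": 10, "sex": "female"}
distance = gower_distance(first, second, ranges={"age": 80, "fare": 500})
```

Because every per-feature term lies between 0 and 1, the result does too, which makes numeric and categorical features comparable when selecting neighbouring observations.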

Software Paper: The software paper has a few minor typos or questionable stylistic choices for an academic paper:

"He died on [the] Titanic"
"1. [rather first or 1st] class"
BUT. Not really a typo, but stylistically questionable to have a period after "BUT". Would be more fitting of an academic journal without the word.
Local [I]nterpretable [M]odel-agnostic [E]xplanations (LIME)
Capitalization of "machine learning"
"(e.g.[,] a bank)"
"black-boxes" [no space needed]
"leads to the next industrial revolution" - unverified claim

Example Usage: The notebooks and example scripts ran without problems.

References: Every reference mentioned in the paper is documented as a BibTeX entry.

kmichael08 commented 5 years ago

@justinshenk Many thanks for all these valuable remarks! I merged your pull request. I also applied your comments on the paper and added a description of Gower's distance to the documentation. I'll solve the scrollbar problem (ModelOriented/pyCeterisParibus#23) as soon as possible.

kmichael08 commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

JustinShenk commented 5 years ago

@kmichael08 Thanks for the quick response and edits.

@katyhuff My review (https://github.com/openjournals/joss-reviews/issues/1389#issuecomment-484024162) is now complete.

janfreyberg commented 5 years ago

Installation

Installing the package in a fresh docker alpine image leads to dependencies being installed that I don't think need to be, for example sphinx, m2r, codecov, etc.

There are a few ways around this so I haven't made a PR but I think you can do the following:

Split requirements into actual requirements (what's needed to run the package), documentation requirements, and test requirements (e.g. pytest). You can add these as additional requirements using the extras_require key in setup.py, or simply install them from txt files wherever you need them.
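A minimal sketch of what such a split could look like with `extras_require` follows. The dependency lists are placeholders: the core packages shown are assumptions, not the project's real requirements, and the grouping of the tools mentioned in this thread into `docs`, `test`, and `examples` is just one possible arrangement.

```python
# Hypothetical core dependencies: only what is needed to import and run the package.
INSTALL_REQUIRES = ["numpy", "pandas", "matplotlib"]

# Optional groups, installed only on request.
EXTRAS_REQUIRE = {
    "docs": ["sphinx", "m2r"],
    "test": ["pytest", "codecov"],
    "examples": ["scikit-learn", "xgboost"],
}
# Convenience group that pulls in every optional dependency.
EXTRAS_REQUIRE["dev"] = sorted(
    {package for group in EXTRAS_REQUIRE.values() for package in group}
)

SETUP_KWARGS = dict(
    name="pyCeterisParibus",
    install_requires=INSTALL_REQUIRES,
    extras_require=EXTRAS_REQUIRE,
)

# In a real setup.py the file would end with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

With a layout like this, `pip install .` from a checkout installs only the core dependencies, while `pip install ".[test]"` or `pip install ".[dev]"` adds the optional groups.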

Additionally, as far as I can tell tensorflow is never imported and so should be removed from the requirements.

I would even go so far as to say XGBoost and sklearn should not be in the requirements, even though you use them in the paper and documentation, because they are not essential to the functioning of the package. Instead, you could make a note that people should install them to run the examples.

Otherwise, installation works great.

Functionality / Performance

This all worked great for me.

Documentation

I think the docs can be improved:

the index page should be clearer; I would remove the automated sphinx content and add more of a "landing" page that contains an introduction
I would consider adding the jupyter notebooks to the docs using a tool like sphinx-jupyter

But that's just a recommendation.

Paper

The paper is very good. Only point: the R package CeterisParibus is not included in the references.

katyhuff commented 5 years ago

Thanks for the speedy reviews, @janfreyberg @justinshenk . And, thanks for responding quickly to the suggestions @kmichael08 .

@kmichael08 , there are a few items in @janfreyberg's review that will need to be handled before we should move forward with acceptance:

[ ] Package installation instructions should be cleaned up to remove tensorflow if it's not used anywhere.
[ ] Though I'm not sure the best citation for it, I do agree with @janfreyberg that the paper would benefit from a citation to the CeterisParibus R package.

The rest of the comments from @janfreyberg would certainly clean things up, but aren't explicitly needed for our JOSS requirements, so I'll just recommend that you consider the recommendation from @janfreyberg: "Split requirements into actual requirements (what's needed to run the package), documentation requirements, and test requirements (e.g. pytest). You can add these as additional requirements using the extras_require key in setup.py, or simply install them from txt files wherever you need them.... I would even go so far as to say XGBoost and sklearn should not be in the requirements, even though you use it in the paper and documentation, because it's not essential to the functioning of the package. Instead, you could make a note that people should install them to run the examples."

I have looked over the package and have found it installs pretty easily. Once you've seen this message @kmichael08 and implemented the two above changes, please ping me and we'll move on with next steps.

kmichael08 commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

kmichael08 commented 5 years ago

Thanks a lot @janfreyberg and @katyhuff!

I added the citation to the R package
You're right about the requirements. I kept only the essential dependencies in requirements.txt and moved the others to requirements-dev.txt, if that's OK. As for tensorflow, it is required although not directly imported: there is a test using keras, which needs a backend deep learning library (tensorflow here).

So, as long as these two changes sound OK to you, we can move on. I'll definitely enhance the docs soon and use sphinx-jupyter. Thanks for that!

katyhuff commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

katyhuff commented 5 years ago

@whedon check references

whedon commented 5 years ago

Attempting to check references...

whedon commented 5 years ago


OK DOIs

- 10.1145/2939672.2939778 is OK
- 10.1080/10618600.2014.907095 is OK
- 10.5281/zenodo.1198885 is OK
- 10.2307/2528823 is OK

MISSING DOIs

- None

INVALID DOIs

- None

katyhuff commented 5 years ago

@kmichael08 I'm going through some of the final checks (first up, the bibliography):

[ ] Should the DOI 10.1214/aos/1013203451 be added to: Friedman, J. H. (2000). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29, 1189–1232.? It may be pointing to a different version of the paper than the one you're trying to cite, but it seems like the right one to me.
[ ] The link in the Rutz, J (2018) citation is too long and overflows the margins. That's not really your fault -- Our latex template already has hypersetup[breaklinks..] so it should be breaking the URL. I don't know why the URL isn't breaking. @arfon -- any ideas on this URL breaking issue? Perhaps the latex template should use breaklinks=true rather than just breaklinks?
bib: https://github.com/ModelOriented/pyCeterisParibus/blob/1f75844d31ec7c5b1442401f1f596b90c657f430/paper/paper.bib#L16
joss latex template: https://github.com/openjournals/whedon/blob/033eedaceb8080759917955949bbd7db36fcfda6/resources/latex.template#L10

arfon commented 5 years ago

Weird. If I change the bib file field to:

howpublished = {\url{https://www.openrightsgroup.org/blog/2018/machine-learning-and-the-right-to-explanation-in-gdpr}},

Then it seems to compile OK. Changing the flag to breaklinks=true doesn't seem to fix anything.

kmichael08 commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

kmichael08 commented 5 years ago

@katyhuff I updated the citation to the version of the paper you mentioned (it's OK) and added the DOI. As for the URL breaking, I switched to the workaround that @arfon applied above. Let me know if that's OK.

katyhuff commented 5 years ago

Thanks @arfon for the workaround and thank you to @justinshenk @janfreyberg for your excellent reviews.

Thank you @kmichael08 for a strong submission and for engaging actively in the review process! I have looked over the paper, double checked all the DOI links, and have conducted a high level review of the code itself. Everything looks ship-shape to me.

@kmichael08 At this point, please double check the paper yourself, update your code version if you want to (e.g. change v0.5 to some minor release representing today's version), review any lingering details in your code/readme/etc., and then make an archive of the reviewed software in Zenodo/figshare/other service. Please be sure that the DOI metadata (title, authors, etc.) matches this JOSS submission. Once that's complete, please update this thread with the DOI of the archive, and I'll move forward with accepting the submission! Until then, now is your moment for final touchups!

kmichael08 commented 5 years ago

@katyhuff I updated the repository to v0.5.2 and archived it in Zenodo. DOI: 10.5281/zenodo.2667756

kyleniemeyer commented 5 years ago

@whedon set 10.5281/zenodo.2667756 as archive

whedon commented 5 years ago

OK. 10.5281/zenodo.2667756 is the archive.

kyleniemeyer commented 5 years ago

@whedon accept

whedon commented 5 years ago

Attempting dry run of processing paper acceptance...

whedon commented 5 years ago

PDF failed to compile for issue #1389 with the following error:

% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 13 0 13 0 0 158 0 --:--:-- --:--:-- --:--:-- 160
pandoc: 10.21105.joss.01389.crossref.xml: openFile: does not exist (No such file or directory)
Looks like we failed to compile the Crossref XML

whedon commented 5 years ago


OK DOIs

- 10.1145/2939672.2939778 is OK
- 10.1080/10618600.2014.907095 is OK
- 10.1214/aos/1013203451 is OK
- 10.5281/zenodo.1198885 is OK
- 10.2307/2528823 is OK

MISSING DOIs

- None

INVALID DOIs

- None

arfon commented 5 years ago

@whedon accept

whedon commented 5 years ago

Attempting dry run of processing paper acceptance...

whedon commented 5 years ago


OK DOIs

- 10.1145/2939672.2939778 is OK
- 10.1080/10618600.2014.907095 is OK
- 10.1214/aos/1013203451 is OK
- 10.5281/zenodo.1198885 is OK
- 10.2307/2528823 is OK

MISSING DOIs

- None

INVALID DOIs

- None

whedon commented 5 years ago

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/662

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/662, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true

arfon commented 5 years ago

@whedon accept deposit=true

whedon commented 5 years ago

Doing it live! Attempting automated processing of paper acceptance...

whedon commented 5 years ago

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/663
Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01389
If everything looks good, then close this review issue.
Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? notify your editorial technical team...

katyhuff commented 5 years ago

@kyleniemeyer @arfon Thanks for jumping forward with the submission. That said, I didn't get a chance to execute the whedon set version command in time to beat you to that accept function!

Usually, that's part of my task list at this stage -- do we need to fix and re-accept? That is, the submission was v0.5, but, at my request, the author updated the version when creating the archive release, to reflect the version that includes joss-related changes. The new version, to be incorporated in the JOSS publication, is v0.5.2, so I would usually have run whedon set version before whedon accept. Can you confirm whether this is going to be an issue?

arfon commented 5 years ago

> Usually, that's part of my task list at this stage -- do we need to fix and re-accept? That is, the submission was v0.5, but, at my request, the author updated the version when creating the archive release, to reflect the version that includes joss-related changes. The new version, to be incorporated in the JOSS publication, is v0.5.2, so I would usually have run whedon set version before whedon accept. Can you confirm whether this is going to be an issue?

Sorry my/our bad - looks like we got ahead of ourselves here. The version isn't actually captured in the paper so please go ahead and update that here.

arfon commented 5 years ago

> Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01389

Also, please note, Crossref is still having some issues so this DOI doesn't resolve yet.

katyhuff commented 5 years ago

@whedon set v0.5.2 as version

whedon commented 5 years ago

OK. v0.5.2 is the version.

katyhuff commented 5 years ago

So (@arfon @kyleniemeyer ) do we just run accept again?

arfon commented 5 years ago

> So (@arfon @kyleniemeyer ) do we just run accept again?

There's no need to because the version isn't captured anywhere other than here. The archive DOI is correct right? (This is linked to in the paper)

katyhuff commented 5 years ago

openjournals / joss-reviews

[REVIEW]: pyCeterisParibus: explaining Machine Learning models with Ceteris Paribus Profiles in Python #1389
