openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
721 stars 38 forks source link

[REVIEW]: PyEscape #2072

Closed whedon closed 4 years ago

whedon commented 4 years ago

Submitting author: @AoifeHughes (Aoife Hughes) Repository: https://github.com/AoifeHughes/NarrowEscapeSimulator Version: 1.0 Editor: @drvinceknight Reviewer: @pdebuyl, @markgalassi Archive: 10.5281/zenodo.3725946

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/c47ec67686a14361072ed703a58bac15"><img src="https://joss.theoj.org/papers/c47ec67686a14361072ed703a58bac15/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/c47ec67686a14361072ed703a58bac15/status.svg)](https://joss.theoj.org/papers/c47ec67686a14361072ed703a58bac15)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@pdebuyl & @markgalassi, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @drvinceknight know.

Please try and complete your review in the next two weeks

Review checklist for @pdebuyl

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @markgalassi

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 4 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @pdebuyl, @markgalassi it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 4 years ago
Reference check summary:

OK DOIs

- 10.1007/s10955-004-5712-8 is OK
- 10.1073/pnas.0706599104 is OK

MISSING DOIs

- None

INVALID DOIs

- None
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

pdebuyl commented 4 years ago

@drvinceknight I can't tick the boxes in the review checklist. From memory, I was able to edit the checklist directly in my earlier work with JOSS. Is there anything that I should do to get the authorization?

drvinceknight commented 4 years ago

@drvinceknight I can't tick the boxes in the review checklist. From memory, I was able to edit the checklist directly in my earlier work with JOSS. Is there anything that I should do to get the authorization?

That's strange, I've just checked and I can tick them so I'd assume you can because I believe you have all the necessary authorisation. Could you double check and also perhaps try on a different browser?

pdebuyl commented 4 years ago

My bad @drvinceknight I needed to renew my invitation to the reviewers group. I thought that it would not be needed for reviewers having already served.

drvinceknight commented 4 years ago

No problem, glad it's sorted :+1:

pdebuyl commented 4 years ago

Hi @AoifeHughes ,

I had a first look at the program. I believe that documentation needs to be improved. I have written a first review below. I suggest that you take a look at existing scientific codes on GitHub, preferably with a JOSS paper, to have a good impression of what is expected. The purpose of the review is to eventually tick all the boxes, so I can provide more information if needed.

Functionality

Installation: Does installation proceed as outlined in the documentation?

pip install does not work because of permissions. Instead of using sudo pip install (which users could easily find by googling), I suggest to propose pip install --user instead.

Functionality: Have the functional claims of the software been confirmed?

I ran the Example notebook with success. I would like to have a least a few benchmark cases.

Documentation

A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

The problem is well stated. The target audience is not.

Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.

A requirements.txt file is missing. This is the most standard way to state dependencies for Python projects.

Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

There is one example notebook. I suggest to improve the example by adding a textbook-type example, with known numerical output.

Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

No. There are no docstring and no module-wide documentation page. The purpose of the routines is not documented explicitly, only the example, the name of the functions, and the source code provide that information.

Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

There is limited testing. The test checks against an upper limit. An output time of zero would pass the test. There is no theoretic estimate provided for the test. Playing with the test case repeatedly, I had a 1% failure rate.

Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

There is a contribution section in the readme. It only applies to new features and not to bug reports, other issues, or support requesets.

AoifeHughes commented 4 years ago

@pdebuyl Thank you for the detailed response, and for taking the time to review. I will address these issues and report back within the next week.

AoifeHughes commented 4 years ago

I have just pushed a series of commits which I feel address most of the concerns:

Functionality

Documentation

Other

I am happy to take on board any other issues that reviewers may have and will address them promptly also. Thank you again for your valuable input.

AoifeHughes commented 4 years ago

I would also like to add, we are currently preparing to submit a paper which will cite this software, when it is published we will be able to provide additional real-world solutions. Until then, we are reluctant to add additional examples to prevent self-plagiarism or to remove novelty from our current research.

drvinceknight commented 4 years ago

@markgalassi apologies for the nudge, do you know when you might have a moment to carry out your review?

pdebuyl commented 4 years ago

@AoifeHughes the doi for the JOSS paper should already be known. @drvinceknight is it ok to cite a JOSS paper before its acceptance?

AoifeHughes commented 4 years ago

I don't think it would be appropriate to cite before acceptance, the DOI currently is inactive it seems.

drvinceknight commented 4 years ago

Yes the DOI is not yet an actual representation of the software as it might be modified still. I'm still waiting to hear back from @markgalassi but if I don't soon I will look for another reviewer.

markgalassi commented 4 years ago

I just got back from travel and dove in to the paper. I am afraid that it is clearly not a candidate for anything yet. The simplest of things (following the procedure in README.md) fails: pores = fibonacci_spheres(p, v)

gives:

NameError: name 'p' is not defined

Clearly some variable p should be set, but for a minimal example which is the first thing shown this is not given!

Please have the author double-check that the README.md works and then we can get back to work on evaluating this software.

markgalassi commented 4 years ago

Let me also add a comment on the idea of notebooks as documentation: I know they are much in vogue, but there are people (like me) who insist on reproducible procedures for running python code as batch. This means that I am not in the habit of loading notebooks, and if I am presented with documentation as a notebook I want a command line invocation which will load all that. So pointing to a directory is not enough: please give a command to load that notebook up from a standard GNU/Linux system.

markgalassi commented 4 years ago

(sorry: clicked wrong button and "close"ed the issue; I think that this reopens it)

AoifeHughes commented 4 years ago

Thanks for your comments.

I have added the required lines to the read me.

Re Notebooks: For the case of being documentation and providing an example of how to use the software, then the integrated github viewer should be enough. This is presented as a method without results (beyond solving a known problem), so I wouldn't expect running it to be a complete necessity. Saying that however, most users will want to run notebooks differently and depending on your current setup you should be able to run jupyter-notebook <notebook name> from your command-line.

Though, if you're wanting that command-line experience then running

narrow_escape -D 400 -v 1 -a 0.1 -p 1 -N 1000 -dt 2e-9

# "narrow_escape --help" can give arguments info

from bash (providing you installed the library) will give the same results, without plots as in the example notebook.

pdebuyl commented 4 years ago

Thank you for the update. I ticked many checks for the review already. Some items remain to improve, especially given the focus of JOSS on best practices (including testing).

  1. The comment "base install of python (not recommended)" is superfluous.

  2. The tests are not well defined. For instance, the test for the sphere placement routine does not check the distance of the pore but that it falls below a much larger upper bound.

  3. The tests use floating-point equality, which is not a reliable way to perform numerical tests. The most convenient solutions are either NumPy's testing routines (that include "closeness" tests) or pytest's approx feature.

  4. The escape tests now does not include a bound. The earlier version was better but an estimate for this time must be given. Only "0" would not pass the test now.

  5. scipy is not used in the code, it should be removed from the install_requires.

  6. There are unused imports and unused variables in the test files, they should be removed.

  7. The duplicate "%notebook" directive does not work as intented in the notebook example (the directive can only be used once per notebook).

  8. The optional argument max_steps does not limit the range of searches that exit the container. Picking a dt that is too large results in infinite loops.

  9. If max_steps is reached, the escape routine returns max_steps*dt instead of an obviously erroneous values (0 or np.nan).

I filed pull requests for 7 and part of 6.

markgalassi commented 4 years ago

Further testing: thanks @AoifeHughes for addressing the issues I reported previously. Smaller ones now:

  1. Since you document how to run before you document installation, you should slip in a "after installing (according to the instructions below)" here and there.
  2. Because of the python community's sad failure in updating the world to python3, you might want to mention that on many systems they might have to run "pip3 install ."
  3. The second test python program fails with a:
    $ python3 t2.py 
    Traceback (most recent call last):
    File "t2.py", line 9, in <module>
    pores = fibonacci_spheres(p, v)
    NameError: name 'p' is not defined

    So p and dt are not defined. This is the usual problem with the notebook people trying to write reproducible and deliverable software and documentation :-) So the author should definitely test all those examples.

Otherwise my earlier problems seem to have been addressed and I am continuing with the checklist.

AoifeHughes commented 4 years ago

Again, thank you for the comments and suggestions. Particularly towards testing and the errors in the examples, I was running them sequentially and so variables were carried over.

I've made updates to the readme to remove the unnecessary text also. I'm reluctant to mention pip3 commands, simply because I state twice that this is a python 3 library and adding pip3 would cause errors for some setups. For example:

zsh> which pip
/Users/AoifeHughes/anaconda3/bin/pip
zsh> which pip3
/usr/local/bin/pip3
zsh> conda activate myenv; which pip
/Users/AoifeHughes/anaconda/envs/bin/pip

I've also updated several tests to use the numpy floating point closeness functions. Removed scipy from the setup.py. Cleaned up some unused imports in testing Added a test for max_steps variable being hit Reaching max_steps now returns a zero value (I am working on writing some warnings to be raised also) Some tests still need to be addressed (in progress)

Pull requests for minor fixes have been accepted.

markgalassi commented 4 years ago

I have checked most of the boxes in the paper checklist. Here is a small suggested diff (below) so you don't assume that users are steeped in markdown. I would suggest that the "statement of need" in the readme.md file could do with a bit more, and that the paper put a math citation at the moment of stating "The mathematical models provided are simple and robust @iWouldCiteMySourceHere". I also have not yet checked the "state of the field" checkbox. In the paper you discuss the Schuss and Holcman papers, but no reference to existing software. You say yours is "novel" but you do not say that there is no "narrow escape" softare.

And another nit: when I did a "git diff" after running your notebook procedure I saw a diff on a date in the notebook. Remember: notebooks are not reproducible and they are not entirely human source, so it is flawed to add them to a version control repo. I would not hold up the paper on this, but I would recommend that you put reproducibility and best VC practices front-and-center: commit a .py file, then provide a simple one-liner to load that into a notebook for notebook types.

Other those simple suggestions I think you meet the criteria and I would quickly finish this off.

And if I might add a personal note: your project is interesting! It made me want to read up on narrow escape and take an interest in the topic.

diff --git a/paper.md b/paper.md
index c78409e..bd9404c 100644
--- a/paper.md
+++ b/paper.md
@@ -1,3 +1,9 @@
+---
+# process with:
+# pandoc --bibliography paper.bib -o paper.pdf paper.md
+# pandoc --standalone --bibliography paper.bib -o paper.html paper.md
+---
+
 ---
 title: "PyEscape: A narrow escape problem simulator package for python"
 tags:
AoifeHughes commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

AoifeHughes commented 4 years ago

I do see your points on issues with notebooks - I will move away from these in future (currently investigating your suggestion to make them non issues with tracking).

And thank you for your interest, we were surprised when we found how little narrow escape problems have been applied to problems outside of physics. We hope this library will help with that by making it more accessible. As per your suggestion I've added a recent paper giving a mathematical solution for a similar problem, but which lacks a programmatically useable one.

arfon commented 4 years ago

Dear authors and reviewers

We wanted to notify you that in light of the current COVID-19 pandemic, JOSS has decided to suspend submission of new manuscripts and to handle existing manuscripts (such as this one) on a "best efforts basis". We understand that you may need to attend to more pressing issues than completing a review or updating a repository in response to a review. If this is the case, a quick note indicating that you need to put a "pause" on your involvement with a review would be appreciated but is not required.

Thanks in advance for your understanding.

Arfon Smith, Editor in Chief, on behalf of the JOSS editorial team.

markgalassi commented 4 years ago

I just reviewed again and see that "A Statement of Need" is now in the readme.md file (it is awkwardly named, but it does the job). In the paper you also have more of a reference to existing software efforts. I have checked off the missing boxes.

markgalassi commented 4 years ago

@whedon commands

whedon commented 4 years ago

Here are some things you can ask me to do:

# List Whedon's capabilities
@whedon commands

# List of editor GitHub usernames
@whedon list editors

# List of reviewers together with programming language preferences and domain expertise
@whedon list reviewers

EDITORIAL TASKS

# Compile the paper
@whedon generate pdf

# Compile the paper from alternative branch
@whedon generate pdf from branch custom-branch-name

# Ask Whedon to check the references for missing DOIs
@whedon check references

# Ask Whedon to check repository statistics for the submitted software
@whedon check repository
markgalassi commented 4 years ago

I think I might be done and I'm happy to recommend the software+paper as it is now. I also find that my fellow reviewer was a bit more subtle in some of their comments so I would encourage the author to incorporate those comments, but I would not require it for readiness. Editor: do I have to take any specific steps beyond stating this here?

drvinceknight commented 4 years ago

Editor: do I have to take any specific steps beyond stating this here?

No @markgalassi that's it! Thank you for your work, it's appreciated.

pdebuyl commented 4 years ago

Hi @AoifeHughes (sorry for the delay. I still work but organization is different. Belgium ordered confinement, all nonessential displacements are forbidden) , there remain issues among the ones I raised.

  1. The tests are not well defined. For instance, the test for the sphere placement routine does not check the distance of the pore but that it falls below a much larger upper bound.

In the test below, the sphere is of volume 1 (thus radius of about 0.62) but the test in_sphere is for a radius of 1, so the test is too tolerant.

  def test_fibo_spheres():
    assert in_sphere(fibonacci_spheres(1, 1)[0], 1)

A variation such as in_sphere(fibonacci_spheres(1, 1)[0], R+eps) where R is "(3/(4*pi))**(1/3)" and eps a suitably small value (order accuracy of floating point variables) would be much more useful. A similar remark applies to test_detection.py

  1. The tests use floating-point equality, which is not a reliable way to perform numerical tests. The most convenient solutions are either NumPy's testing routines (that include "closeness" tests) or pytest's approx feature.

Here the change is good, except that requiring "decimal=2" for test_cube_vol_to_r and test_sphere_vol_to_r feels overly inaccurate.

  1. The escape tests now does not include a bound. The earlier version was better but an estimate for this time must be given. Only "0" would not pass the test now.

This has not been addressed. assert t will only fail for t=0 and this is worse than the previous generous upper bound.

  1. The optional argument max_steps does not limit the range of searches that exit the container. Picking a dt that is too large results in infinite loops.

This has not been addressed. In more detail: the increase of max_steps occurs in the outer loop:

https://github.com/AoifeHughes/NarrowEscapeSimulator/blob/master/narrow_escape/escape_plan.py#L41

The while loop just below will go on indefinitely if one does not find a suitable new_pos, for instance if dt is too large.

AoifeHughes commented 4 years ago

Thank you @pdebuyl, I understand that for many these are difficult times, I hope yourself and other reviewers and editors are under no pressure with regards to JOSS / this submission as it is not in any way time-dependant and your own wellbeings are much more vital.

I'll address your feedback in order:

  1. I've just added 2 new tests, and modified the others that you've pointed out. I now test for tighter bounds to ensure that functions are working correctly.

  2. Upped the numpy testing to 5 decimal places

  3. Regarding timeouts and escape times. I am unsure of how to best handle this. Due to the nature of this being stochastic I'm wary of putting concrete bounds. I've attempted to address this concern by implementing another test which runs for N simulations and ensures that the given result is within a reasonable expected range. This test also has a timeout attached on it, should this test fail then it is a reasonable indication that something is misbehaving.

  4. Thank you for pointing out the inner while look problem, I now see. I've implemented a correction in the latest commit. As a side note, I recognise this has increased issues with code repetition. An issue I was hoping to iron out, and will refactor in the future.

4b. I've added a test with a large dt that breaks similarly to max_steps when required.

pdebuyl commented 4 years ago

Hi @AoifeHughes thank you for the update. I understand that stochastic simulations are difficult to test!

I am satisfied will the reply to all my comments. I have a few minor remaining concerns, none preventing acceptance but I prefer to mention them.

  1. I filed a PR to add the import of Axes3D from mpl_toolkits.mplot3d to enable the 3d plot in the notebook.
  2. NumPy should be cited in the paper. "We used NumPy [citation] for the random number generation, for testing, and for storing the numerical data in arrays." With the citation recommended at https://scipy.org/citing.html#numpy pinging @drvinceknight for the policy about this.
  3. The repository name is NarrowEscape, the pull request here and the paper title are PyEscape, the actual library is narrow_escape. I suggest a bit of cleaning up there. If you plan to put the code on PyPI at some point, this will confuse the users.

Apart from the citation issue (the only unticked box), I approve the paper for publication and leave the other issues to your choice.

pdebuyl commented 4 years ago

PS: thank you for the coronavirus note, same encouragements to all here!

pdebuyl commented 4 years ago

+nitpick: in the example notebook, the timestep is 2e-9 whereas calculate_opt_dt returns 4.2e-6, this is rather inconsistent. If 4.2e-6 is too small, calculate_opt_dt should return a smaller value.

AoifeHughes commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

AoifeHughes commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

AoifeHughes commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

AoifeHughes commented 4 years ago

Apologies for making Whedon work extra hard, I've made a couple of final changes for now:

  1. I've refactored the naming of the library to suit the paper title better, as per the suggestion, it was overdue on my part
  2. I added a section to the paper citing Numpy, Matplotlib and Jupyter as they are key external libraries used
  3. Calculations for optimal dt has been improved by 2 orders of magnitude, code will run slower but be more accurate, ultimately.

Again, thank you all for the feedback and help with this.

pdebuyl commented 4 years ago

Thanks for the update. It's all ok for me but you probably want to undelete the code though. git add PyEscape and git commit should do :-)

@drvinceknight I recommend for publication.

drvinceknight commented 4 years ago

Thanks @pdebuyl and @markgalassi for your review.

I've got a few last checks to do now @AoifeHughes :+1:

drvinceknight commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

drvinceknight commented 4 years ago

@whedon check references

whedon commented 4 years ago
Reference check summary:

OK DOIs

- 10.1109/MCSE.2007.55 is OK
- 10.1109/MCSE.2011.37 is OK
- 10.1007/s10955-004-5712-8 is OK
- 10.1073/pnas.0706599104 is OK
- 10.1098/rsif.2008.0014 is OK
- 10.1016/j.jcpx.2019.100047 is OK

MISSING DOIs

- None

INVALID DOIs

- None