openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
712 stars 38 forks source link

[REVIEW]: graphenv: a Python library for reinforcement learning on graph search spaces #4621

Closed editorialbot closed 2 years ago

editorialbot commented 2 years ago

Submitting author: !--author-handle-->@pstjohn<!--end-author-handle-- (Peter C. St. John) Repository: https://github.com/NREL/graph-env/ Branch with paper.md (empty if default branch): Version: v0.1.3 Editor: !--editor-->@osorensen<!--end-editor-- Reviewers: @iammix, @vwxyzjn, @Viech, @idoby Archive: 10.5281/zenodo.7030161

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/1d1427f9c72a4a7fb9ac2a53506e0733"><img src="https://joss.theoj.org/papers/1d1427f9c72a4a7fb9ac2a53506e0733/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/1d1427f9c72a4a7fb9ac2a53506e0733/status.svg)](https://joss.theoj.org/papers/1d1427f9c72a4a7fb9ac2a53506e0733)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@iammix & @vwxyzjn & @Viech & @idoby, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @osorensen know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @iammix

📝 Checklist for @vwxyzjn

📝 Checklist for @idoby

📝 Checklist for @Viech

editorialbot commented 2 years ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 2 years ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.07 s (818.7 files/s, 128621.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          28            967           1577           2962
Markdown                         6            108              0            319
Jupyter Notebook                 7              0           2307            279
YAML                             6             36            115            216
TeX                              1             12              0            134
reStructuredText                 7             57            131             62
JSON                             1              0              0             58
DOS Batch                        1              8              1             26
Bourne Shell                     1              1              7             16
make                             2              5              7             15
-------------------------------------------------------------------------------
SUM:                            60           1194           4145           4087
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 2 years ago

Wordcount for paper.md is 1026

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.patter.2021.100361 is OK
- 10.1039/d1sc02770k is OK
- 10.1038/s41597-020-00588-x is OK
- 10.1038/s41467-020-16201-z is OK
- 10.1007/978-3-030-50426-7_33 is OK
- 10.1038/s41598-019-47148-x is OK

MISSING DOIs

- None

INVALID DOIs

- 10.1038/s42256-022-00506-3 is INVALID
editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

idoby commented 2 years ago

Review checklist for @idoby

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

iammix commented 2 years ago

Review checklist for @iammix

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

idoby commented 2 years ago

This seems like a very interesting package! I will have the time to dive in more in-depth soon, but in the meanwhile, can the authors fix the citation on line 17 of the paper to be consistent with the other citations, and resolve the invalid citation issue?

Also, please clarify the contributions of the two extra authors who are not listed as collaborators on the GitHub repo.

Thanks!

pstjohn commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

editorialbot commented 2 years ago

Hello @idoby, here are the things you can ask me to do:


# List all available commands
@editorialbot commands

# Get a list of all editors's GitHub handles
@editorialbot list editors

# Check the references of the paper for missing DOIs
@editorialbot check references

# Perform checks on the repository
@editorialbot check repository

# Adds a checklist for the reviewer using this command
@editorialbot generate my checklist

# Set a value for branch
@editorialbot set joss-paper as branch

# Generates the pdf paper
@editorialbot generate pdf

# Get a link to the complete list of reviewers
@editorialbot list reviewers
editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.patter.2021.100361 is OK
- 10.1039/d1sc02770k is OK
- 10.1038/s41597-020-00588-x is OK
- 10.1038/s41467-020-16201-z is OK
- 10.1007/978-3-030-50426-7_33 is OK
- 10.1038/s41598-019-47148-x is OK

MISSING DOIs

- None

INVALID DOIs

- 10.1038/s42256-022-00506-3 is INVALID
pstjohn commented 2 years ago

Apologies if these contribution edits should have been made in the manuscript?

That invalid DOI, 10.1038/s42256-022-00506-3, should be live starting 04 August 2022 at 11:00 (US Eastern Time)

idoby commented 2 years ago

Apologies if these contribution edits should have been made in the manuscript?

I'm pretty new at reviewing for JOSS, but the documentation does not state this information has to be specified in the manuscript. So I think just your comment should be fine.

That invalid DOI, 10.1038/s42256-022-00506-3, should be live starting 04 August 2022 at 11:00 (US Eastern Time)

OK, LGTM then. Appreciate your work on the library and thank you for contributing it. :)

idoby commented 2 years ago

@pstjohn Looking at it now, it seems like the merge commit for Dmitry's PR prevented him being listed as a contributor by GitHub. I'm not sure why, since it usually works fine.

osorensen commented 2 years ago

Apologies if these contribution edits should have been made in the manuscript?

I'm pretty new at reviewing for JOSS, but the documentation does not state this information has to be specified in the manuscript. So I think just your comment should be fine.

That invalid DOI, 10.1038/s42256-022-00506-3, should be live starting 04 August 2022 at 11:00 (US Eastern Time)

OK, LGTM then. Appreciate your work on the library and thank you for contributing it. :)

Wanted to chime in as editor that I agree with you. Comments in this thread are sufficient.

vwxyzjn commented 2 years ago

Review checklist for @vwxyzjn

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

pstjohn commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.patter.2021.100361 is OK
- 10.1039/d1sc02770k is OK
- 10.1038/s41597-020-00588-x is OK
- 10.1038/s41467-020-16201-z is OK
- 10.1038/s42256-022-00506-3 is OK
- 10.1007/978-3-030-50426-7_33 is OK
- 10.1038/s41598-019-47148-x is OK

MISSING DOIs

- None

INVALID DOIs

- None
vwxyzjn commented 2 years ago

The Paper and repo look good! Opened a couple of issues:

https://github.com/NREL/graph-env/issues/33 https://github.com/NREL/graph-env/issues/34

Also, please consider filling out the github repo's description and keywords. See the screenshot below as an example.

Screen Shot 2022-08-08 at 12 30 39 PM
Viech commented 2 years ago

Review checklist for @Viech

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Viech commented 2 years ago

I have completed my review with some remarks below and will tick my remaining boxes when these points were addressed or discussed. Overall, I find that this is a small but nicely maintained and documented package whose scope could be described a bit better to a non-specialist audience.

General checks

Substantial scholarly effort

Functionally this is a very small library: Excluding the examples subfolder, there are significantly less than 1000 lines of code in the graphenv folder that appears to contain the library as such, so I should flag this to @osorensen. This code gives the impression of a joint wrapper around two existing libraries. On the other hand, a substantial effort is apparent in the presentation and integration of this code in the form of tests, examples, and deploymet scripts. I can also see that the functionality offered is useful and not accessible to users not familiar with the backend libraries used.

Functionality

Installation

Documentation

A statement of need

Installation instructions

Manuscript

I find that the manuscript is well-written and concise but the distinction between graph search problems and other problem representations could be made more clear as this seems to be the selling point for the software.

Author list

Summary

Statement of need

osorensen commented 2 years ago

Thanks for your review @Viech!

Regarding scholarly scope, the guidelines state as some factors to consider, which I will consider point-by-point below, while trying to "think out loud".

  • Age of software (is this a well-established software project) / length of commit history.

Looking at the repository, I find that the software seems rather young, with the first commit made on February 11 2022.

  • Number of commits.

This is rather large, 241 in total.

  • Number of authors.

There have been four contributors, which suggests that this is a joint effort.

  • Total lines of code (LOC). Submissions under 1000 LOC will usually be flagged, those under 300 LOC will be desk rejected.

This one has already been commented on in @Viech's review.

  • Whether the software has already been cited in academic papers.

Could you please comment on this @pstjohn? Is the software used in academic paper, or are you aware of papers in preparation which will use the package.

  • Whether the software is sufficiently useful that it is likely to be cited by your peer group.

This is an important point. Making a joint wrapper around two existing libraries can in itself be very useful for users not familiar with those libraries.

In sum, I understand your concern about scholarly scope @Viech, and see that this is a difficult case. I would also be interested to hear what @iammix thinks of this, as the "Substantial scholarly effort" button has not been ticked off in your review checklist.

Also @Viech, could you think of any suggested extensions of the package that would make it more clearly within scope for JOSS?

You're of course also welcome to comment on this @pstjohn.

idoby commented 2 years ago

@osorensen Just adding a comment here since I did check the respective checkbox. Following the discussion here I went back and checked the commit history in greater depth and found that a large number of commits are just one-liners or seem to be purely about cosmetics/experimenting with settings.

So I would suggest the authors

Personally, I believe git history to be more than just a storage space for old versions of the code, it also serves to document the design and development process for a project, document decisions made, etc.

(also, apologies, this is my first review and I didn't realize reviews could extend beyond the checklist)

osorensen commented 2 years ago

Thanks @idoby! And for the record, you're also free to update your checklist, so that items are unchecked until certain issues have been fixed.

Viech commented 2 years ago

Also @Viech, could you think of any suggested extensions of the package that would make it more clearly within scope for JOSS?

I'm more into "classical" optimization than machine learning, so I would not know about suited core features to be added. However one could certainly add more interface features: Default implementations of the while not done main loop with options for detailed output/logging of what is going on and control over stopping criteria, ability to stop, save, load, and continue a search, plotting of (parts of) the search graph, statistics concerning this graph, or additional ways to define a problem (e.g. by assigning callback functions to properties of some GraphSearchProblem instance so that the library can be used in an interactive console) are some potential features that come to my mind. I'm also just thinking out loud here; some of these ideas might not apply to graphenv. The general direction of these suggestions is that, ideally,

  1. users of graphenv should be able to get basic functionality out of it without importing gym or ray.rllib themselves and/or
  2. they should be offered advanced "quality of life" features if they are already comfortable using these backend libraries directly.

With either of these directions pursued, the package would be clearly within scope as I understand it.

osorensen commented 2 years ago

Thanks @Viech. Then I suggest that the authors (@pstjohn) work along the lines proposed in your comment above, in addition to addressing the points raised in the review.

iammix commented 2 years ago

Hello, sorry for my late response. I will finish my review by the end of the week.

pstjohn commented 2 years ago

Hi all,

Thanks for the excellent comments (so far) on this effort! We've made a few changes to the repo and manuscript (see issues NREL/graph-env#33, NREL/graph-env#34, NREL/graph-env#36, NREL/graph-env#37, NREL/graph-env#39, and NREL/graph-env#40 for additional discussion), and here I'll respond generally to some of the bigger picture comments on the size, age, and git history that have been raised.

This library was originally conceived as a refactor to our NREL/rlmolecule library, which used significantly more code to accomplish a similar task. We felt the amount of effort to integrate a tree search gym environment with a custom policy model in RLlib was nontrivial and might be of interest to the growing field of researchers looking at applying RL to new optimization problems. This in part explains the age of the library, our original work on these types of problems began in 2019, but we only recently refactored the code in this more general library. Our recent publication cites that rlmolecule repository, although a lot of that work was ported here.

We subsequently spent a long time attempting to reduce the size of the library, trying to optimize speed and code maintainability. For instance, we removed the 177 line space_util.py functions after finding ways to leverage the built-in repeated action space within rllib.

This also explains in part the large number of "small" commits; debugging RLlib can be challenging (as most of the work occurs in forked worker processes) and as a result we took a rather circuitous route to our current implementation. We're documenting current "gotcha's" in our online documentation, but could add a "Lessons Learned" section to our documentation moving forward to keep track of knowledge gained through failed experiments.

On re-writing the git history, I'm not confident enough in my git abilities to force-push main on a repo with PyPI integration and automated docs tied directly to SCM version tags :cold_sweat:. But moving forward, we will certainly package major development milestones into pull requests and merge those as a single commit.

We appreciate everyone's involvement so far! Please let us know if any of our responses could use additional detail. We put a response together around project scope in NREL/graph-env#37, and would be happy to continue those discussions there or in this thread.

(signed all authors)

Viech commented 2 years ago

I've reviewed the changes to the manuscript and documentation and they look good to me!

With respect to scope I want to stress that I can see this package being used and cited in research and that the code and repository quality look pretty good; so I've ticked that box now even though not much new substance was added to the core code at this point. I also understand that using compact constructs can improve code quality while reducing size and that the efficient use of a technical backend library can be a feature in itself.

If I may chime in concerning a history rewrite: I think this should not be relevant for publication. I see the value that a nice history has for understanding the ideas behind pieces of code but rewriting git history is typically considered bad style as it breaks forks, branches, and leaves important tags detached from the new history. I would suggest this in extreme cases but not here.

pstjohn commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

pstjohn commented 2 years ago

@osorensen, just wanted to check in on this. Let me know if there's any other edits we could make while we wait for the last review to come in.

Thanks!

osorensen commented 2 years ago

Thanks for following up on this @pstjohn. I think we'll wait some more days for @iammix to complete his review. Until then I don't think there's anything for you to do.

iammix commented 2 years ago

Thanks @pstjohn for your submission. First of all, you have done remarkable work on it. Nowadays, it is generally accepted to combine different tools to create a wrapper of them. In my opinion, this will be the future of research. You have a well-organized documentation page, where you have provided the background of the examples. "Lesson Learned" is a great add-on to the documentation and I am looking forward to that.

As far as the code is concerned, I noticed that it is easy for someone to be able to extend the code and create their custom methods(new features etc). Moreover, you impemented the relatively new feature of python(typing) where you declare the type of arguments and returns of your methods.

I've done some minor changes(fix typos, return types at the declaration of the methods, PEP8 python Style Guide format ). Check my PR https://github.com/NREL/graph-env/pull/47

pstjohn commented 2 years ago

Thanks @iammix for the detailed review and fixes!! I've merged https://github.com/NREL/graph-env/pull/47, I'll pin a new release (likely 0.1.0) if/when the paper is accepted.

@osorensen, let us know what the next steps would be

Thanks again everyone for the helpful comments

osorensen commented 2 years ago

@pstjohn At this point could you:

In the meantime I'll read through the manuscript again and let you know if I have any suggested changes.

osorensen commented 2 years ago

@editorialbot generate pdf

osorensen commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/j.patter.2021.100361 is OK
- 10.1039/d1sc02770k is OK
- 10.1038/s41597-020-00588-x is OK
- 10.1038/s41467-020-16201-z is OK
- 10.1038/s42256-022-00506-3 is OK
- 10.1007/978-3-030-50426-7_33 is OK
- 10.1038/s41598-019-47148-x is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

osorensen commented 2 years ago

@pstjohn, as you see I've open three small issues in the source repository. When you have addressed these, and the four points points about release and archiving mentioned in my comment yesterday, I'm ready to go forward with acceptance.

pstjohn commented 2 years ago

Took a couple tries to get the .zenodo.json metadata in the right format, but I think I've made the requested paper edits and these associated tags:

pstjohn commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.48550/arXiv.1806.01261 is OK
- 10.11578/dc.20201221.3 is OK
- 10.48550/arXiv.1606.01540 is OK
- 10.48550/arXiv.1712.09381 is OK
- 10.1016/j.patter.2021.100361 is OK
- 10.1039/d1sc02770k is OK
- 10.1038/s41597-020-00588-x is OK
- 10.1038/s41467-020-16201-z is OK
- 10.1038/s42256-022-00506-3 is OK
- 10.1007/978-3-030-50426-7_33 is OK
- 10.48550/arXiv.2011.06069 is OK
- 10.1038/s41598-019-47148-x is OK

MISSING DOIs

- None

INVALID DOIs

- None
pstjohn commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

osorensen commented 2 years ago

@editorialbot set 10.5281/zenodo.7030161 as archive

editorialbot commented 2 years ago

Done! Archive is now 10.5281/zenodo.7030161

osorensen commented 2 years ago

@editorialbot set v0.1.3 as version