Closed whedon closed 5 years ago
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @terrytangyuan, it looks like you're currently assigned as the reviewer for this paper :tada:.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
Attempting PDF compilation. Reticulating splines etc...
@nproellochs Could you clarify the differences between your package and reinforcelearn? Also, have you done any benchmarking or used the package in certain applications yet as I doubt that this package is suitable and performant enough for practical use cases given it's implemented purely in R?
@terrytangyuan The key differences between our package and the reinforcelearn package are as follows: 1) Different from the reinforcelearn package, our package performs batch reinforcement learning (it allows to learn the optimal behavior of an agent based on sample sequences consisting of states, actions and rewards). Batch reinforcement learning drastically reduces the computation time in most settings. 2) Our package is shipped with the built-in capability to sample experience from a function that defines the dynamics of the environment. 3) Our package incorporates a variety of learning algorithms to improve the performance of (batch) reinforcement learning.
We also tested the suitability of the package in multiple application scenarios. For example, our paper "Negation scope detection in sentiment analysis: Decision support for news-driven trading" makes use of the package to learn negations based on reinforcement learning (the paper is available from https://www.sciencedirect.com/science/article/pii/S0167923616300823).
@nproellochs By "performance", you meant the statistical performance, right? What about computational performance and benchmarks? RL is mostly popular in large scale applications but your package doesn't seem to support that.
@terrytangyuan In typical application scenarios, batch reinforcement learning also speeds up computational performance as it mitigates the ‘exploration overhead’ problem in pure online learning. In combination with experience replay, it speeds up convergence by collecting and replaying observed state transitions repeatedly to the agent as if they were new observations collected while interacting with the system. An overview of the method can be found in our paper (https://arxiv.org/pdf/1810.00240.pdf). Nonetheless, due to the fact that the package is written purely in R, I agree that its applicability to very large scale problems (such as applications in computer vision) is limited.
:wave: @terrytangyuan @kbenoit - happy new year! How are you getting along with your reviews?
@nproellochs Thanks for the clarification. I understand your point but I don't see how useful and practical this software can be if it cannot be applied to the most popular RL scenarios (mostly due to computational performance overhead). I am uncertain how valuable this is to publish in JOSS. @arfon How do we go from here? Any thoughts @kbenoit?
Hi - Sorry for the delay. I'm currently on holidays but back to work on Jan 23 and happy to complete my review then.
In the meantime: On the performance issue, this is really a question for the package and its authors to demonstrate. If there is a vignette or demonstration in the JOSS paper showing that despite being all R, that it can be useful for some applications, then I consider this a satisfactory answer to the question. Alternatively, the case could be made that it's a demonstration or learning package for RL for R users. Only if it's impractical for nearly any application because of performance issues would it be inappropriate as an R package or for JOSS.
@nproellochs Thanks for the clarification. I understand your point but I don't see how useful and practical this software can be if it cannot be applied to the most popular RL scenarios (mostly due to computational performance overhead). I am uncertain how valuable this is to publish in JOSS. @arfon How do we go from here? Any thoughts @kbenoit?
@terrytangyuan - this is a fair point. We have a sister journal JOSE which may be a better fit for this package if @kbenoit is of the same opinion as @terrytangyuan here.
Hi - Sorry for the delay. I'm currently on holidays but back to work on Jan 23 and happy to complete my review then.
That would be great. Thanks!
Friendly reminder to pick up this review sometime soon please @kbenoit
Sorry for the delay guys, here are my comments. All of my suggestions are below but on the JOSE option above, I think it has to do below with the purpose of the package and the need it addresses, which I think needs additional clarification before we can answer this definitively. Do I think the package should be rejected from JOSS because it cannot solve large-scale computer vision problems? No, not unless that is what it claims to solve. So for me, a key unanswered question is exactly what need it claims to meet. See more below.
Not exactly, as the GitHub repo has not tagged releases. Suggest you use this very useful facility. https://help.github.com/articles/creating-releases/. I can verify that the current master branch on the GitHub repo matches the submitted and the CRAN version.
I find this uncertain, since in the vignette, there is no actual execution of the code, and no interpretation or discussion of the results or the format in which they are delivered. How I am to understand the output of policy(model)
, which is a named character
vector such as
> head(policy(model))
.XXBB..XB XXBB.B.X. .XBB..BXX BXX...B.. ..XB..... XBXBXB...
"c1" "c5" "c5" "c4" "c5" "c9"
None exist in the repo or in the README.md
, but these would be easy to add.
None of the three checklist items are met in the current version of the paper. See below for my comments on the statement of need and the purpose of the package.
It's not entirely clear what the core purpose of the package is, given the performance limitations acknowledged above. Either this is a practical solution for applying reinforcement learning to problems that people need to solve, in which case there needs to be a discussion o of its performance limitations and some comparison to other possible solutions, in the paper and in my opinion, in the package vignette and/or repository README. The current statement of purpose (in the paper) suggests that the package exists to provide an R package for model-free learning, implying that its purpose is to provide a practical tool for applying this model to a broad class of problems. If so, then some discussion of performance is needed.
Or, this is a purely R-based implementation of reinforcement learning that offers a way to test or take apart this class of machine learning model for primarily educational purposes. If this is the case, then JOSE is more appropriate.
Either way, but especially in the second scenario, more complete documenation is needed, since any user is going to need to:
The man pages cite Sutton and Barto (1998). Why not implement one of the examples from that text using the package, as a vignette? Especially if this is primarily a learning tool, this would be really useful, but even if designed as a problem-solving tool, it will greatly help users connect the tool to a textbook that more fully explains parameter choices and provides further examples.
Some options are never explained at all, such as iter
, which defaults to 1. When would higher values be useful?
Not every manual entry has examples, and often has very little discussion. Why not flesh this out, especially if the purpose is educational?
Admittedly, I am a bit obsessed with naming, but I find the function names unintuitive and inconsistent, either non-descriptive (e.g. policy
) or inconsistent: compounded nouns (e.g. epsilonGreedyActionSelection
) or verb-nouns (e.g. lookupLearningRule
). Not clear either which are data objects and which are functions.
The R code is nicely written and very readable - so if the purpose is educational then it will serve that well, for those looking at source code. Although: please add a space after if
.
On the a full check, I am seeing:
N checking top-level files Non-standard files/directories found at top level: ‘JOSS’
(add to .Rbuildignore
)
Warning: roxygen2 requires Encoding: UTF-8
(add this to DESCRIPTION
)
:wave: @nproellochs - please let me know when you've incorporated @kbenoit's feedback.
@nproellochs - could you give us an update on your progress?
Sorry to be slow on this, review by early next week at the latest. Been snowed under with something else but it's done now.
@arfon Sorry for the delay. I will give an update on this ASAP. Thank you @kbenoit & @terrytangyuan for the review and the valuable feedback.
@arfon Sorry for the delay. I will give an update on this ASAP. Thank you @kbenoit & @terrytangyuan for the review and the valuable feedback.
:wave: @nproellochs - friendly reminder to get back to this soon.
@arfon My feedback and opinion still remain unchanged for the reasons I mentioned earlier. I'd recommend @nproellochs submitting this article to other journals, e.g. JOSE mentioned above.
@arfon My feedback and opinion still remain unchanged for the reasons I mentioned earlier. I'd recommend @nproellochs submitting this article to other journals, e.g. JOSE mentioned above.
Understood. Thanks @terrytangyuan.
👋 @arfon - this review seems to have stalled...
@arfon I am planning to incorporate the feedback next weekend. Is there a deadline for the revision?
@arfon I am planning to incorporate the feedback next weekend. Is there a deadline for the revision?
No, the timing is up to you.
Please make sure though in your revisions you directly address the feedback from @kbenoit. Specifically:
Or, this is a purely R-based implementation of reinforcement learning that offers a way to test or take apart this class of machine learning model for primarily educational purposes. If this is the case, then JOSE is more appropriate.
In your response here I think you need to make a strong case for this being a JOSS submission rather than a JOSE one, especially given both reviewers have the same concern here.
Thank you again @kbenoit & @terrytangyuan for reviewing the package. As part of the revision, I have addressed the checklist issues as follows:
Version: Does the release version given match the GitHub release (v1.0.2)?
I added a tagged GitHub release matching the CRAN version (v1.0.4).
Functionality: Have the functional claims of the software been confirmed?
I have significantly improved the package documentation. In addition, the vignette now contains two working examples (adapted from Sutton (1998)) with code executions and detailed explanations of the data format and model outcomes.
A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
I added a statement of need section to the paper. In addition, the paper (and the package vignette) now contains a dedicated section addressing the performance of the package. Specifically, I added examples of things the package can do and the things the package cannot do (where users may consider implementations written in "faster" programming languages)
Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
As aforementioned, I implemented two working examples in the package vignette. Specifically, I demonstrate the use of the package by drawing upon common examples from the literature (e.g. finding optimal game strategies).
Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
See points above. In addition, I added further documentation and additional examples to the manual entries.
Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support
I added community guidelines in the README.
Software paper
I added a statement of need highlighting our contributions to the existing landscape of R packages. In addition, the paper now includes the package authors' affiliations and DOIs.
Minor issues
I added further documentation and additional examples to the manual entries. In addition, I improved the consistency of function naming and added spaces after if
.
R CMD CHECK issues
Fixed.
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
PDF failed to compile for issue #1087 with the following error:
/app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-379fcd917139/lib/whedon.rb:115:in check_fields': Paper YAML header is missing expected fields: authors, affiliations (RuntimeError) from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-379fcd917139/lib/whedon.rb:80:in
initialize'
from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-379fcd917139/lib/whedon/processor.rb:36:in new' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-379fcd917139/lib/whedon/processor.rb:36:in
set_paper'
from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-379fcd917139/bin/whedon:55:in prepare' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:in
run'
from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in invoke_command' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:in
dispatch'
from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in start' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-379fcd917139/bin/whedon:116:in
<top (required)>'
from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in load' from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Thanks for the update @nproellochs. @terrytangyuan @kbenoit - could you take a look at the updated paper ☝️and let me know if this satisfies your concerns about the usefulness of this software given both of your concerns about performance?
As a reminder, our submission requirements state that:
- The software should be a significant contribution to the available open source software that either enables some new research challenges to be addressed or makes addressing research challenges significantly better (e.g., faster, easier, simpler)
From my perspective, if this is the first reinforcement learning package in R, then that could be a 'substantial contribution' almost by definition, however, if the performance is so poor that its not useful for all but a very small number of applications, then this may be a moot point.
👋 @terrytangyuan, @kbenoit — We are waiting for your assessment on this submission. Thanks!
Please see my previous comment.
I see that both reviewers have concerns about the purpose and need of this software… https://github.com/openjournals/joss-reviews/issues/1087#issuecomment-453893878 https://github.com/openjournals/joss-reviews/issues/1087#issuecomment-461296889
... and @arfon's assessment is:
if this is the first reinforcement learning package in R, then that could be a 'substantial contribution' almost by definition, however, if the performance is so poor that its not useful for all but a very small number of applications, then this may be a moot point.
The author says they have expanded in the paper the sections on Statement of Need, and Notes on Performance.
Does this new text in the paper satisfy the reviewers?
Then see my very first comment https://github.com/openjournals/joss-reviews/issues/1087#issuecomment-443436417, it is not the first reinforcement learning package in R.
@terrytangyuan Apart from the abovementioned differences regarding package features, I want to point out that the reinforcelearn R package was released approximately 11 months after the ReinforcementLearning package. Therefore, in my opinion, it is fair to claim that the ReinforcementLearning package is the first package that performs model-free RL in R.
That is not my main concern... @arfon @labarba Please un-assign me from this and assign new reviewers if needed. I don't have additional time dedicated to this review anymore. My apologies for any inconvenience this may cause.
Thank you very much for your feedback up to this point @terrytangyuan
@terrytangyuan @kbenoit - many thanks for all of your input to date and your excellent reviews. While there isn't clear agreement here between us, I'm satisfied with the responses of the author and am making the editorial decision to accept this submission into JOSS.
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@nproellochs - At this point could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive? For the Zenodo/figshare archive, please make sure that:
I can then move forward with accepting the submission.
Thank you @arfon and both reviewers! The archive is now available on Zenodo 10.5281/zenodo.3244170.
@whedon set 10.5281/zenodo.3244170 as archive
OK. 10.5281/zenodo.3244170 is the archive.
@whedon accept
Attempting dry run of processing paper acceptance...
OK DOIs
- 10.1111/ecog.00888 is OK
- 10.1038/nature14236 is OK
- 10.1109/TNN.1998.712192 is OK
MISSING DOIs
- https://doi.org/10.1109/21.364859 may be missing for title: A Sensor-Based Navigation for a Mobile Robot using Fuzzy Logic and Reinforcement Learning
- https://doi.org/10.1006/game.1997.0619 may be missing for title: Coordination, “Magic,” and Reinforcement Learning in a Market Entry Game
- https://doi.org/10.1007/978-3-642-27645-3_2 may be missing for title: Batch Reinforcement Learning
- https://doi.org/10.1007/978-1-4615-3618-5_5 may be missing for title: Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
- https://doi.org/10.1007/978-1-4757-6451-2_4 may be missing for title: Reinforcement Learning in the Multi-Robot Domain
- https://doi.org/10.1145/1143844.1143929 may be missing for title: Reinforcement Learning for Optimized Trade Execution
- https://doi.org/10.2139/ssrn.2602003 may be missing for title: Negation Scope Detection in Sentiment Analysis: Decision Support for News-Driven Trading
- https://doi.org/10.3115/1289189.1289246 may be missing for title: Automatic Learning of Dialogue Strategy using Dialogue Simulation and Reinforcement Learning
INVALID DOIs
- None
Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/749
If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/749, then you can now move forward with accepting the submission by compiling again with the flag deposit=true
e.g.
@whedon accept deposit=true
Submitting author: @nproellochs (Nicolas Proellochs) Repository: https://github.com/nproellochs/ReinforcementLearning Version: v1.0.2 Editor: @arfon Reviewer: @terrytangyuan, @kbenoit Archive: 10.5281/zenodo.3244170
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@terrytangyuan & @kbenoit, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @arfon know.
✨ Please try and complete your review in the next two weeks ✨
Review checklist for @terrytangyuan
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
paper.md
file include a list of authors with their affiliations?Review checklist for @kbenoit
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
paper.md
file include a list of authors with their affiliations?