openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
725 stars · 38 forks

[REVIEW]: AutoRA: Automated Research Assistant for Closed-Loop Empirical Research #6839

Open · editorialbot opened this issue 5 months ago

editorialbot commented 5 months ago

Submitting author: @musslick (Sebastian Musslick)
Repository: https://github.com/AutoResearch/autora-paper
Branch with paper.md (empty if default branch): main
Version: v4.0.0
Editor: @jbytecode
Reviewers: @seandamiandevine, @szorowi1
Archive: Pending

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e"><img src="https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e/status.svg)](https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@seandamiandevine & @szorowi1, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jbytecode know.

Please start on your review when you are able, and be sure to complete your review within the next six weeks at the very latest.

Checklists

📝 Checklist for @seandamiandevine

📝 Checklist for @szorowi1

editorialbot commented 5 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 5 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.31222/osf.io/ysv2u is OK
- 10.1016/j.jbef.2017.12.004 is OK
- 10.48550/arXiv.1912.04871 is OK
- 10.48550/arXiv.2006.11287 is OK
- 10.1126/sciadv.aav6971 is OK
- 10.31234/osf.io/c2ytb is OK

MISSING DOIs

- No DOI given, and none found for title: Bayesian machine scientist for model discovery in ...
- No DOI given, and none found for title: An evaluation of experimental sampling strategies ...
- No DOI given, and none found for title: Scikit-learn: Machine learning in python
- No DOI given, and none found for title: A Unified Framework for Deep Symbolic Regression

INVALID DOIs

- None
editorialbot commented 5 months ago

Software report:

github.com/AlDanial/cloc v 1.90  T=0.01 s (547.6 files/s, 36692.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Markdown                         2             33              0            100
TeX                              1             14              0             90
YAML                             1              1              5             25
-------------------------------------------------------------------------------
SUM:                             4             48              5            215
-------------------------------------------------------------------------------

Commit count by author:

    11  Sebastian Musslick
     3  musslick
     2  Younes Strittmatter
editorialbot commented 5 months ago

Paper file info:

📄 Wordcount for paper.md is 1549

✅ The paper includes a Statement of need section

editorialbot commented 5 months ago

License info:

✅ License found: MIT License (Valid open source OSI approved license)

editorialbot commented 5 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jbytecode commented 5 months ago

@seandamiandevine, @szorowi1 - Dear reviewers, you can start by creating your task lists. Each list contains several review tasks.

Whenever you complete a task, you can check off the corresponding checkbox. Since the JOSS review process is interactive, you can always interact with the author, the other reviewers, and the editor throughout. You can open issues and pull requests in the target repo. Please mention the URL of this page there so we can keep track of what is going on outside this thread.

Please create your tasklist by typing

@editorialbot generate my checklist

Thank you in advance.

jbytecode commented 5 months ago

@editorialbot remind @szorowi1 in two weeks

editorialbot commented 5 months ago

Reminder set for @szorowi1 in two weeks

seandamiandevine commented 5 months ago

Review checklist for @seandamiandevine

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

szorowi1 commented 5 months ago

Review checklist for @szorowi1

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

editorialbot commented 5 months ago

:wave: @szorowi1, please update us on how your review is going (this is an automated reminder).

szorowi1 commented 5 months ago

Hi @jbytecode, hope you've been well! I'm working my way through the review. I was wondering if I could request some guidance for establishing functionality. The AutoRA library is quite extensive, distributed across 30+ python packages (though some are quite small, composed of only a few functions/classes). What would you consider to be sufficient for demonstrating functionality (e.g., working through the tutorials/examples in the docs, applying the software to a novel personal use case, etc.)? Thank you!

jbytecode commented 5 months ago

@musslick - Could you please provide guidelines and help our reviewer on the issue mentioned above?

@szorowi1 - Any critiques/suggestions/corrections/thoughts are welcome. Following the checklist items is generally enough.

musslick commented 5 months ago

@jbytecode Sure thing!

@szorowi1 (also tagging @seandamiandevine) Thanks for asking about this. We discussed with the development team what might be a good functionality test for AutoRA. We reached consensus that the Tutorials and the two examples Equation Discovery and Online Closed-Loop Discovery would capture most of the core functionality of AutoRA. So evaluating those might be most appropriate for a functionality test. Note that all of the tutorials (except the online closed-loop discovery) should be executable via Google Colab. Please let us know if you run into any issues or have any other questions---and thanks for all your work!

seandamiandevine commented 5 months ago

Thanks @musslick for the direction. It was very helpful in guiding functionality tests.

Checklist-related comments

General comments

jbytecode commented 4 months ago

@musslick - Could you please update your status and let us know how your work is going? Have there been any improvements in light of our reviewers' suggestions?

szorowi1 commented 4 months ago

Apologies to all for the delay, it's been a hectic few weeks!

Let me start by saying congrats to @musslick and co-authors/collaborators! This is a really impressive framework and it's obvious how much careful attention, thought, and effort went into developing it. Kudos!

I've now had a chance to work through the documentation, tutorials, and examples. The installation went fine, the code works as expected, and the Docs/API are robust. To echo @seandamiandevine, I also ran into a number of errors when running through the Equation Discovery tutorial in Colab having to do with data shape mismatches. When running the Experimentalist.ipynb tutorial notebook in Colab, I also ran into the following error early on:

ModuleNotFoundError                       Traceback (most recent call last)
[<ipython-input-2-508cdcdb2e51>](https://localhost:8080/#) in <cell line: 4>()
      2 from sklearn.linear_model import LinearRegression
      3 from autora.variable import DV, IV, ValueType, VariableCollection
----> 4 from autora.experimentalist.sampler.falsification import falsification_sample, falsification_score_sample

ModuleNotFoundError: No module named 'autora.experimentalist.sampler'

I agree with @seandamiandevine that it would be good to make sure the example notebooks run all the way through for new users.

Some general thoughts, none of which should necessarily preclude acceptance or publication:

musslick commented 4 months ago

Dear @seandamiandevine and @szorowi1,

Thank you both so much for investing the time and effort into this review and for providing such thorough and constructive feedback. We really appreciate that!

I discussed your feedback with the team, and we agree that there is not sufficient (or complete) information about how to utilize the closed-loop functionality of AutoRA for real-world experiments. Adding corresponding examples to the documentation would be beneficial, especially for researchers interested in behavioral experiments.

We propose to do the following:

  1. Fix the errors in the Equation Discovery tutorial and the Experimentalist.ipynb notebook.
  2. Include the following two end-to-end examples for closed-loop experimentation with AutoRA (using both Prolific and Firebase):

    2.1 Mathematical model discovery for a psychophysics experiment

    2.2 Computational (reinforcement learning) model discovery for a one-armed bandit experiment

    Once we have implemented and internally vetted those tutorials, we would love to get your feedback on them. That said, we would also understand if you've had enough of AutoRA already and/or don't have the time ;)

    As a quick fix, we have already expanded the Closed-Loop Online Experiment Example to include a description of how to combine AutoRA with Prolific (to address @seandamiandevine's initial point).

    In addition, to follow up on @szorowi1's general thoughts, we aim to include two additional examples for closed-loop experimentation (also using Prolific and Firebase). We may not be able to implement them over the course of the review process, but we wanted to hear your thoughts on whether these could be a useful target for our next development milestone:

    2.3 Drift diffusion model comparison for a random-dot kinematogram (RDK) experiment using Bayesian optimal experimental design (specifically, minimizing posterior uncertainty)

    2.4 Experiment parameter tuning for a task-switching experiment (to illustrate how AutoRA can be used for automated design optimization, e.g., to enhance a desired behavioral effect, such as task-switch costs)

Finally, to address @szorowi1's question: we think that AutoRA could be used for design optimization (we could illustrate this in Example 2.4). However, it is not (yet) capable of adapting the experiment on the fly, i.e., within a single experiment session. Rather, it can help optimize the design after collecting data from a set of experiments and then propose a new set of experiments.
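For readers unfamiliar with the pattern described above, the propose → run → refit cycle can be sketched in plain Python. Note this is a toy illustration only: the names `theorist`, `experimentalist`, and `ground_truth` are hypothetical stand-ins and do not reflect AutoRA's actual API.

```python
def ground_truth(x):
    """Stand-in for a real experiment (here, a known linear law y = 2x + 1)."""
    return 2.0 * x + 1.0

def theorist(conditions, observations):
    """Fit a line y = a*x + b by least squares (toy 'model discovery')."""
    n = len(conditions)
    mx = sum(conditions) / n
    my = sum(observations) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(conditions, observations))
         / sum((x - mx) ** 2 for x in conditions))
    return a, my - a * mx

def experimentalist(conditions):
    """Propose the candidate condition farthest from all sampled ones."""
    candidates = [i / 10 for i in range(0, 101)]  # grid over [0, 10]
    return max(candidates, key=lambda c: min(abs(c - x) for x in conditions))

# Closed loop: seed data, then alternate propose -> run -> refit.
conditions = [0.0, 10.0]
observations = [ground_truth(x) for x in conditions]
for _ in range(3):
    new_x = experimentalist(conditions)       # propose next experiment
    conditions.append(new_x)
    observations.append(ground_truth(new_x))  # "collect" new data
    a, b = theorist(conditions, observations) # refit the model

print(round(a, 3), round(b, 3))  # recovered slope ≈ 2.0, intercept ≈ 1.0
```

As noted in the comment above, each pass through the loop happens *between* experiment sessions, matching the between-session (rather than within-session) adaptation described in the text.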

Please let us know what you think!

szorowi1 commented 4 months ago

I think the plan above sounds great! I would definitely be happy to review 2.1/2.2 when they are ready. I also agree that 2.3/2.4 are fantastic tutorial examples but will require more time to develop. So, @musslick, will you let us know when 2.1/2.2 are ready and we can go from there?

musslick commented 4 months ago

That sounds great, @szorowi1 ! Thank you for being willing to take a second look. We will ping you both once 2.1 and 2.2 are ready for your review!

jbytecode commented 3 months ago

@musslick - May I request an update please? Thank you in advance.

musslick commented 3 months ago

@jbytecode Thanks for checking in. We created two new tutorials in response to the reviewers' feedback but are still in the process of incorporating them into the docs and running some final validation checks. We should have them up in two weeks!

jbytecode commented 2 months ago

@musslick - May I request an update please? Sorry if I am bothering. Thank you in advance.

musslick commented 1 month ago

> @musslick - May I request an update please? Sorry if I am bothering. Thank you in advance.

Thanks for checking in. We are aiming to have the new release up by end of next week, and will ping you!

jbytecode commented 1 month ago

@musslick - Thank you for the status update. Good luck with your edits.

musslick commented 1 month ago

Dear @jbytecode @szorowi1 @seandamiandevine

Thank you so much for your patience with this revision. Your feedback has been incredibly valuable, helping us uncover multiple inconsistencies in the AutoRA documentation and concomitant opportunities for improvement. This led us to address some deeper issues in the core code base, which is why the revision process took a bit longer than anticipated.

Following your suggestions, we ended up re-structuring the tutorials into Basic Tutorials and Use Case Tutorials. The latter demonstrate practical applications of AutoRA in real-world scenarios. The most relevant changes include:

We invite you to have a look at these revised sections.

To ease your review: running the revised tutorials shouldn't require any local installation. You should be able to execute both Basic Tutorials in Google Colab. In addition, you should be able to run both Use Case Tutorials within GitHub Codespaces (we provide guidelines in the respective tutorials).

Thank you again for your valuable time and effort. We look forward to your feedback!

jbytecode commented 3 weeks ago

@musslick - Thank you for the update.

@seandamiandevine, @szorowi1 - Dear reviewers, could you please update your reviews? Thank you in advance.

seandamiandevine commented 4 days ago

@jbytecode @musslick

Thank you for the revisions! After rerunning the tutorial, everything runs well for me. I also double-checked the local install and that also works for me.

The new Use Case examples are excellent and clearly address my initial concerns. They also serve as a good introduction to Firebase for researchers who are new to server-side development (if it can be called that).

Overall, I find that my comments have been addressed and I'm happy to recommend publication in the current form. @jbytecode, please let me know if I need to do anything else at this stage.

Thanks again for thinking of me as a reviewer and congratulations to @musslick and the team for their contribution!

jbytecode commented 3 days ago

Dear reviewers @seandamiandevine, @szorowi1

You still have unchecked review items in your checklists. Could you please finalize your reviews by checking them off?

Thank you in advance.

@seandamiandevine - Thank you for your review and the recommendation of acceptance.

jbytecode commented 3 days ago

@editorialbot check references

editorialbot commented 3 days ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

✅ OK DOIs

- 10.31222/osf.io/ysv2u is OK
- 10.1016/j.jbef.2017.12.004 is OK
- 10.48550/arXiv.1912.04871 is OK
- 10.48550/arXiv.2410.20268 is OK
- 10.48550/arXiv.2006.11287 is OK
- 10.1126/sciadv.aav6971 is OK
- 10.31234/osf.io/c2ytb is OK

🟡 SKIP DOIs

- No DOI given, and none found for title: Bayesian machine scientist for model discovery in ...
- No DOI given, and none found for title: An evaluation of experimental sampling strategies ...
- No DOI given, and none found for title: Scikit-learn: Machine learning in python
- No DOI given, and none found for title: Computational discovery of human reinforcement lea...
- No DOI given, and none found for title: A Unified Framework for Deep Symbolic Regression
- No DOI given, and none found for title: Automating the Practice of Science–Opportunities, ...

❌ MISSING DOIs

- None

❌ INVALID DOIs

- None
jbytecode commented 3 days ago

@editorialbot generate pdf

editorialbot commented 3 days ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

musslick commented 13 hours ago

Dear @seandamiandevine, thank you so much for all the effort you put into this review and the previous one. I understand how much work it can be, especially given how much our code base and documentation have evolved since the first review. We're thrilled to hear that everything worked in the end, and we are especially grateful for your suggestion about the use case examples!