openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[REVIEW]: AutoRA: Automated Research Assistant for Closed-Loop Empirical Research #6839

Open editorialbot opened 1 month ago

editorialbot commented 1 month ago

Submitting author: @musslick (Sebastian Musslick)
Repository: https://github.com/AutoResearch/autora-paper
Branch with paper.md (empty if default branch): main
Version: v4.0.0
Editor: @jbytecode
Reviewers: @seandamiandevine, @szorowi1
Archive: Pending

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e"><img src="https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e/status.svg)](https://joss.theoj.org/papers/be6d470033fbe5bd705a49858eb4e21e)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@seandamiandevine & @szorowi1, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jbytecode know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

πŸ“ Checklist for @seandamiandevine

πŸ“ Checklist for @szorowi1

editorialbot commented 1 month ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 1 month ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.31222/osf.io/ysv2u is OK
- 10.1016/j.jbef.2017.12.004 is OK
- 10.48550/arXiv.1912.04871 is OK
- 10.48550/arXiv.2006.11287 is OK
- 10.1126/sciadv.aav6971 is OK
- 10.31234/osf.io/c2ytb is OK

MISSING DOIs

- No DOI given, and none found for title: Bayesian machine scientist for model discovery in ...
- No DOI given, and none found for title: An evaluation of experimental sampling strategies ...
- No DOI given, and none found for title: Scikit-learn: Machine learning in python
- No DOI given, and none found for title: A Unified Framework for Deep Symbolic Regression

INVALID DOIs

- None
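For authors double-checking the MISSING entries above, here is a minimal stdlib-only sketch of the kind of syntax check a candidate DOI can be run through before querying a resolver. The regex is the commonly cited Crossref-style pattern and the helper name is ours; this is an illustration, not editorialbot's actual implementation:

```python
import re

# Crossref-style DOI pattern: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_to_url(doi):
    """Return the doi.org resolver URL for a syntactically valid DOI, else None."""
    if not DOI_PATTERN.match(doi):
        return None
    return f"https://doi.org/{doi}"

print(doi_to_url("10.31222/osf.io/ysv2u"))
print(doi_to_url("not-a-doi"))
```

A URL returned by this check still needs to be resolved (e.g., with an HTTP request) to confirm the DOI is registered, which is why the bot flags missing DOIs as suggestions needing verification.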
editorialbot commented 1 month ago

Software report:

github.com/AlDanial/cloc v 1.90  T=0.01 s (547.6 files/s, 36692.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Markdown                         2             33              0            100
TeX                              1             14              0             90
YAML                             1              1              5             25
-------------------------------------------------------------------------------
SUM:                             4             48              5            215
-------------------------------------------------------------------------------

Commit count by author:

    11  Sebastian Musslick
     3  musslick
     2  Younes Strittmatter
editorialbot commented 1 month ago

Paper file info:

📄 Wordcount for paper.md is 1549

✅ The paper includes a Statement of need section
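As a rough sketch of how such a wordcount might be approximated for a paper.md, the snippet below strips a leading YAML front-matter block and counts whitespace-separated tokens. Skipping the front matter is an assumption on our part; editorialbot's actual method may differ:

```python
def md_wordcount(text):
    """Approximate wordcount of a Markdown file, ignoring YAML front matter."""
    lines = text.splitlines()
    # Drop a leading front-matter block delimited by '---' lines, if present.
    if lines and lines[0].strip() == "---":
        try:
            end = lines[1:].index("---") + 1
            lines = lines[end + 1:]
        except ValueError:
            pass  # unterminated front matter: count everything
    return len(" ".join(lines).split())

sample = "---\ntitle: AutoRA\n---\nOne two three."
print(md_wordcount(sample))
```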

editorialbot commented 1 month ago

License info:

✅ License found: MIT License (Valid open source OSI approved license)

editorialbot commented 1 month ago

👉 📄 Download article proof · 📄 View article proof on GitHub 👈

jbytecode commented 1 month ago

@seandamiandevine, @szorowi1 - Dear reviewers, you can start by creating your task lists. Each list contains several tasks.

Whenever you complete a task, check the corresponding checkbox. Since the JOSS review process is interactive, you can interact with the author, the other reviewers, and the editor at any time. You can open issues and pull requests in the target repository; please mention the URL of this page there so we can keep track of activity happening outside this thread.

Please create your tasklist by typing

@editorialbot generate my checklist

Thank you in advance.

jbytecode commented 1 month ago

@editorialbot remind @szorowi1 in two weeks

editorialbot commented 1 month ago

Reminder set for @szorowi1 in two weeks

seandamiandevine commented 4 weeks ago

Review checklist for @seandamiandevine

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

szorowi1 commented 3 weeks ago

Review checklist for @szorowi1

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

editorialbot commented 2 weeks ago

👋 @szorowi1, please update us on how your review is going (this is an automated reminder).

szorowi1 commented 2 weeks ago

Hi @jbytecode, hope you've been well! I'm working my way through the review. I was wondering if I could request some guidance on establishing functionality. The AutoRA library is quite extensive, distributed across 30+ Python packages (though some are quite small, composed of only a few functions/classes). What would you consider sufficient for demonstrating functionality (e.g., working through the tutorials/examples in the docs, applying the software to a novel personal use case, etc.)? Thank you!

jbytecode commented 2 weeks ago

@musslick - Could you please provide guidelines and help our reviewer on the issue mentioned above?

@szorowi1 - Any criticism/suggestions/corrections/thoughts are welcome. Following the checklist items is generally enough.

musslick commented 2 weeks ago

@jbytecode Sure thing!

@szorowi1 (also tagging @seandamiandevine) Thanks for checking on this. We discussed with the development team what might be a good functionality test for AutoRA and reached consensus that the Tutorials and the two examples, Equation Discovery and Online Closed-Loop Discovery, capture most of AutoRA's core functionality. Evaluating those would be most appropriate for a functionality test. Note that all of the tutorials (except the online closed-loop discovery) should be executable via Google Colab. Please let us know if you run into any issues or have any other questions---and thanks for all your work!

seandamiandevine commented 1 week ago

Thanks @musslick for the direction. It was very helpful in guiding functionality tests.

Checklist-related comments

General comments

jbytecode commented 2 days ago

@musslick - Could you please update us on your status and let us know how your work is going? Have there been any improvements in light of our reviewers' suggestions?

szorowi1 commented 2 days ago

Apologies to all for the delay, it's been a hectic few weeks!

Let me start by saying congrats to @musslick and co-authors/collaborators! This is a really impressive framework and it's obvious how much careful attention, thought, and effort went into developing it. Kudos!

I've now had a chance to work through the documentation, tutorials, and examples. The installation went fine, the code works as expected, and the Docs/API are robust. To echo @seandamiandevine, I also ran into a number of errors when running through the Equation Discovery tutorial in Colab, having to do with data shape mismatches. When running the Experimentalist.ipynb tutorial notebook in Colab, I also ran into the following error early on:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-508cdcdb2e51> in <cell line: 4>()
      2 from sklearn.linear_model import LinearRegression
      3 from autora.variable import DV, IV, ValueType, VariableCollection
----> 4 from autora.experimentalist.sampler.falsification import falsification_sample, falsification_score_sample

ModuleNotFoundError: No module named 'autora.experimentalist.sampler'
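An error like this typically means the subpackage layout changed between releases, so the notebook imports no longer match the installed distribution. A stdlib-only way to list which submodules an installed package actually exposes is sketched below; it is demonstrated on the stdlib `json` package, since `autora` may not be present in every environment (swap in e.g. `"autora.experimentalist"` to diagnose the import above):

```python
import importlib
import pkgutil

def list_submodules(package_name):
    """Return the immediate submodule names of an installed package."""
    pkg = importlib.import_module(package_name)
    # pkgutil.iter_modules walks the package's __path__ without importing anything.
    return sorted(m.name for m in pkgutil.iter_modules(pkg.__path__))

print(list_submodules("json"))
```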

I agree with @seandamiandevine it would be good to make sure the Example notebooks run all the way through for new users.

Some general thoughts, none of which should necessarily preclude acceptance or publication:

musslick commented 9 hours ago

Dear @seandamiandevine and @szorowi1,

Thank you both so much for investing the time and effort into this review and for providing such thorough and constructive feedback. We really appreciate that!

I discussed your feedback with the team, and we agree that the documentation does not yet provide sufficient (and complete) information about how to use the closed-loop functionality of AutoRA for real-world experiments. Adding such examples to the documentation would be beneficial, especially for researchers interested in behavioral experiments.

We propose to do the following:

  1. Fix the errors in the Equation Discovery tutorial and the Experimentalist.ipynb notebook.
  2. Include the following two end-to-end examples for closed-loop experimentation with AutoRA (using both Prolific and Firebase):

    2.1 Mathematical model discovery for a psychophysics experiment

    2.2 Computational (reinforcement learning) model discovery for a one-armed bandit experiment

Once we have implemented and internally vetted those tutorials, we would love to get your feedback on them. That said, we would also understand if you've had enough of AutoRA already and/or don't have the time ;)

As a quick fix, we have already expanded the Closed-Loop Online Experiment Example to include a description of how to combine AutoRA with Prolific (to address @seandamiandevine's initial point).

In addition, to follow up on the general thoughts from @szorowi1, we aim to include two additional examples for closed-loop experimentation (also using Prolific and Firebase). We may not be able to implement them over the course of the review process, but we wanted to hear your thoughts on whether these could be a useful target for our next development milestone:

    2.3 Drift diffusion model comparison for a random-dot kinematogram (RDK) experiment using Bayesian optimal experimental design (specifically, minimizing posterior uncertainty)

    2.4 Experiment parameter tuning for a task-switching experiment (to illustrate how AutoRA can be used for automated design optimization, e.g., to enhance a desired behavioral effect such as task-switch costs)

Finally, to address @szorowi1's question: We think that AutoRA could be used for design optimization (we could illustrate this in Example 2.4). However, it's not (yet) capable of adapting the experiment on the fly, i.e., within a single experiment session. Rather, it can help optimize the design after collecting data from a set of experiments, and then propose a new set of experiments.
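The between-sessions loop described here can be sketched generically. Note this is illustrative code, not AutoRA's actual API: the simulated ground-truth model, the closed-form linear fit, and the "farthest-from-sampled" proposal strategy are all our assumptions, standing in for real data collection, a theorist, and an experimentalist:

```python
import random

def run_experiment(conditions):
    # Stand-in for data collection: noisy ground truth y = 2*x + 1.
    return [(x, 2 * x + 1 + random.gauss(0, 0.1)) for x in conditions]

def fit_model(data):
    # Closed-form least-squares fit for slope and intercept.
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    slope = sxy / sxx
    return slope, my - slope * mx

def propose_conditions(seen, pool, k=3):
    # "Novelty" strategy: pick pool conditions farthest from those already run.
    return sorted(pool, key=lambda c: -min(abs(c - s) for s in seen))[:k]

random.seed(0)
conditions = [0.0, 1.0]
data = []
for _ in range(3):  # three between-session cycles: collect, fit, propose
    data += run_experiment(conditions)
    slope, intercept = fit_model(data)
    conditions = propose_conditions([x for x, _ in data],
                                    pool=[i / 2 for i in range(11)])
print(round(slope, 1), round(intercept, 1))
```

Each cycle corresponds to one experiment session: the model is refit on all data so far, and a new set of conditions is proposed for the next session rather than mid-session.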

Please let us know what you think!