openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[REVIEW]: FuseMedML: a framework for accelerated discovery in machine learning based biomedicine #4943

Closed editorialbot closed 1 year ago

editorialbot commented 1 year ago

Submitting author: <!--author-handle-->@alex-golts<!--end-author-handle--> (Alex Golts) Repository: https://github.com/BiomedSciAI/fuse-med-ml Branch with paper.md (empty if default branch): joss_paper Version: 0.2.9 Editor: <!--editor-->@jmschrei<!--end-editor--> Reviewers: @anupamajha1, @suragnair Archive: 10.5281/zenodo.7346694

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/4d0c5f077e6bc1ddc4c75e75774ce976"><img src="https://joss.theoj.org/papers/4d0c5f077e6bc1ddc4c75e75774ce976/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/4d0c5f077e6bc1ddc4c75e75774ce976/status.svg)](https://joss.theoj.org/papers/4d0c5f077e6bc1ddc4c75e75774ce976)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@anupamajha1 & @suragnair, your review will be checklist-based. Each of you will have a separate checklist that you should update when carrying out your review. First of all, you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. If you have any questions or concerns, please let @jmschrei know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

πŸ“ Checklist for @suragnair

πŸ“ Checklist for @anupamajha1

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 1 year ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.50 s (468.7 files/s, 177355.2 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JSON                             2              0              0          58591
Python                         193           4330           6493          15333
Markdown                        24            492              0           1457
Jupyter Notebook                 3              0           1134            502
TeX                              1             19              0            145
YAML                             6             11             10            126
Bourne Shell                     4             33             22            118
INI                              1             14              0             92
TOML                             1              1              0              2
-------------------------------------------------------------------------------
SUM:                           235           4900           7659          76366
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 1 year ago

Wordcount for paper.md is 1078

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1117/12.2609385 is OK
- 10.1117/12.2613169 is OK
- 10.3390/cancers14163848 is OK
- 10.1007/978-3-030-87592-3_4 is OK
- 10.1007/978-3-030-87589-3_29 is OK
- 10.1007/978-3-030-98385-7_14 is OK
- 10.1148/radiol.220027 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jmschrei commented 1 year ago

Howdy @anupamajha1 and @suragnair!

Thanks for agreeing to review this submission.

The process for conducting a review is outlined above. Please run the command shown above to have @editorialbot generate your checklist, which provides a step-by-step process for conducting your review. Check the boxes as you go to keep track, and make comments in this thread or open issues in the repository itself to point out problems you encounter. Keep in mind that our aim is to improve the submission until it is of high enough quality to be accepted, rather than to provide a yes/no decision, so a conversation with the authors is encouraged over a single review post at the end of the process.

Here are the review guidelines: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html And here is a checklist, similar to above: https://joss.readthedocs.io/en/latest/review_checklist.html

Please let me know if you encounter any issues or need any help during the review process, and thanks for contributing your time to JOSS and the open source community!

suragnair commented 1 year ago

Review checklist for @suragnair

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

anupamajha1 commented 1 year ago

Review checklist for @anupamajha1

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

jmschrei commented 1 year ago

Hi @anupamajha1 and @suragnair, how are the reviews coming?

anupamajha1 commented 1 year ago

Issues opened regarding:

  1. missing references
  2. missing state of the field section in the paper
  3. adding an example to the fuse.dl section
alex-golts commented 1 year ago

I addressed all three issues opened by @anupamajha1. Please let me know if they can be closed or if there are any further comments.

anupamajha1 commented 1 year ago

@alex-golts: Thanks for addressing my comments. Here are some suggestions before I can finish my review checklist:

  1. I installed FuseMedML and tried running it on the multiple examples you provided. The download instructions for datasets other than MNIST need to be improved. Currently, the READMEs for many examples point to the website hosting the data. You should include download steps in all the example data READMEs.
  2. The writing quality of the paper can be improved.
suragnair commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

alex-golts commented 1 year ago

@anupamajha1 Thanks for these suggestions!

  1. We added a bash script to the KNIGHT example to download the data and updated the README accordingly. The STOIC21 example's code already contains a class that downloads the data. For the remaining examples, obtaining the data involves some form of manual registration, so we couldn't easily automate that.

  2. We revised the writing; I hope it reads better now. You can see the changes in this PR. If you believe any areas of the writing could still be improved, please let us know which ones specifically.

suragnair commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

anupamajha1 commented 1 year ago

@alex-golts: thanks for the updates! @jmschrei: I have finished the review checklist. Please let me know if anything else is needed.

suragnair commented 1 year ago

Some more suggestions for improving the paper. Firstly, it would be helpful to have a figure focused on the "design philosophy" that outlines NDict and the input/output of fuse models (it could be borrowed from the video in the README).

It would help to make the following improvements:

suragnair commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

alex-golts commented 1 year ago

Thanks for the suggestions @suragnair , I have addressed them as follows:

  • Added a figure with a diagram explaining the design philosophy
  • Added arrows, as you suggested, in Fig. 1 (now Fig. 2)
  • The "1" argument represents the number of input channels. We currently keep it for backward compatibility. I added a comment to the code in the figure explaining what it is.
  • Modified the caption of Fig. 3 (now Fig. 4) to explain EvaluatorDefault

Please let me know if you have any more questions or suggestions

@suragnair just an update about the third bullet: we will support not having to specify the number of input channels (see https://github.com/BiomedSciAI/fuse-med-ml/pull/244), so I changed the figure accordingly. I removed the comment, and the figure no longer has the "1" argument you asked about.

suragnair commented 1 year ago

This is slightly open-ended and subjective. The Hello World notebook says "By the end of the session we hope you'll be familiar with basic Fuse's workflow and acknowledge its potential." However, after going through the paper, docs and notebooks, I am still not able to fully grasp the power of FuseMedML over typical workflows.

I fully understand that FuseMedML implements a lot of handy functions, especially for evaluation, and I can see how that by itself is useful for ML projects. However, I'm not fully convinced that the NDict concept enhances reproducibility by itself. One could argue that well-documented individual functions/classes would probably be just as useful.

Would be interested to hear your thoughts on this. It would also be very helpful to add some more comments to the Hello World notebook, especially specifying how FuseMedML concepts enhance reproducibility over how someone would do it without FuseMedML.

alex-golts commented 1 year ago

The "hello world" notebook is meant to be simple, and in simple cases like a standard MNIST classification pipeline it can indeed be more challenging to demonstrate the particular usefulness of the fuse "philosophy" versus standard approaches. But to give one example that may emphasize some of the benefits, even in just this simple example:

In the instantiation of BatchSamplerDefault, we pass a key name (string) that specifies the label by which to balance samples. This assumes the notion of a sample_dict that represents a data sample and will contain this key. Once users across different projects agree on this methodology and use sample_dicts to store their data, it becomes easier to reuse code. For example, the code for creating a balanced sampler in another project would be exactly the same (with perhaps only the key name being different). In a standard approach, different projects would store their data in different ways, so they might need a separate implementation of this custom sampler per project, or they would need to agree on another convention to make the code fully reusable. The BatchSamplerDefault is indeed reusable across our examples; you can see it used in the same way, for example, here and here.

I agree that it's possible to think of conventions other than NDict, of course, and as long as those are used consistently, other types of well-documented classes or functions could be equally convenient. So in that sense, NDict is a choice we made and find convenient; the usefulness is in having many generic building blocks that all comply with this methodology. If users adopt it for their project from the start, we believe they will find it convenient to reuse code from existing projects that follow the same methodology.

We will add comments throughout the "hello world" notebook to make this point clearer. We also have a couple more ideas for making the notebook better at showcasing how fuse facilitates code reuse; we're still discussing them and plan to make improvements going forward.
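To make the convention described above concrete, here is a minimal, hypothetical sketch of the "sample as a nested dict, addressed by a string key" idea. Note this is not the actual FuseMedML NDict or BatchSamplerDefault code; all names here (`get_by_dotted_key`, `label_counts`, the `data.label` key) are illustrative placeholders:

```python
# Hypothetical illustration of the sample_dict convention -- NOT the
# actual FuseMedML implementation. Each sample is a nested dict, and
# generic components locate values via a dotted string key.

def get_by_dotted_key(sample, dotted_key):
    """Resolve a dotted key like 'data.label' inside a nested dict."""
    node = sample
    for part in dotted_key.split("."):
        node = node[part]
    return node

def label_counts(samples, label_key):
    """A generic, project-agnostic helper: code that balances or inspects
    labels only needs the key name, not the project's data layout."""
    counts = {}
    for sample in samples:
        label = get_by_dotted_key(sample, label_key)
        counts[label] = counts.get(label, 0) + 1
    return counts

samples = [
    {"data": {"image": [0.1, 0.2], "label": 0}},
    {"data": {"image": [0.3, 0.4], "label": 1}},
    {"data": {"image": [0.5, 0.6], "label": 1}},
]

print(label_counts(samples, "data.label"))  # prints {0: 1, 1: 2}
```

The point is that `label_counts` (standing in for a balanced batch sampler) is reusable across projects unchanged; only the key string would differ from project to project.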

suragnair commented 1 year ago

Thanks for the explanation and I agree that it would be great to make the benefits more obvious to potential users. As the above isn't a strict requirement in the checklist, I have signed off on all remaining items. @jmschrei my review is complete.

jmschrei commented 1 year ago

@editorialbot check references

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1117/12.2609385 is OK
- 10.1117/12.2613169 is OK
- 10.3390/cancers14163848 is OK
- 10.1007/978-3-030-87592-3_4 is OK
- 10.1007/978-3-030-87589-3_29 is OK
- 10.1007/978-3-030-98385-7_14 is OK
- 10.1148/radiol.220027 is OK
- 10.5281/zenodo.3828935 is OK

MISSING DOIs

- None

INVALID DOIs

- https://doi.org/10.48550/arXiv.2211.02701 is INVALID because of 'https://doi.org/' prefix
jmschrei commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jmschrei commented 1 year ago

@alex-golts can you fix that one invalid DOI? Everything else looks good to me. Can you also provide me with a Zenodo DOI for the paper of the form 10.xxxx/zenodo.xxxxxxx and a version number for the software? Thanks!

alex-golts commented 1 year ago

@editorialbot check references

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1117/12.2609385 is OK
- 10.1117/12.2613169 is OK
- 10.3390/cancers14163848 is OK
- 10.1007/978-3-030-87592-3_4 is OK
- 10.1007/978-3-030-87589-3_29 is OK
- 10.1007/978-3-030-98385-7_14 is OK
- 10.1148/radiol.220027 is OK
- 10.5281/zenodo.3828935 is OK
- 10.48550/arXiv.2211.02701 is OK

MISSING DOIs

- None

INVALID DOIs

- None
alex-golts commented 1 year ago

@jmschrei I have fixed the invalid DOI.

Can you please clarify what exactly you mean by "Zenodo DOI for the paper"? According to the submission instructions, we're expected to deposit the repository to a service like Zenodo. Did you mean that? In that case, we already have it here. Or did you mean to deposit just the paper? If so, should it be the PDF, or the raw "paper.md" and "paper.bib" files?

jmschrei commented 1 year ago

Yes, that's what I was looking for. Thanks. Is the version you'd like tagged with the submission v0.2.9?

alex-golts commented 1 year ago

> Yes, that's what I was looking for. Thanks. Is the version you'd like tagged with the submission v0.2.9?

Yes, v0.2.9. Sorry, forgot to answer you on that one :-)

jmschrei commented 1 year ago

@editorialbot set v0.2.9 as version

editorialbot commented 1 year ago

Done! version is now v0.2.9

jmschrei commented 1 year ago

@editorialbot set 10.5281/zenodo.7346694 as archive

editorialbot commented 1 year ago

Done! Archive is now 10.5281/zenodo.7346694

jmschrei commented 1 year ago

@editorialbot recommend-accept

editorialbot commented 1 year ago
Attempting dry run of processing paper acceptance...
editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1117/12.2609385 is OK
- 10.1117/12.2613169 is OK
- 10.3390/cancers14163848 is OK
- 10.1007/978-3-030-87592-3_4 is OK
- 10.1007/978-3-030-87589-3_29 is OK
- 10.1007/978-3-030-98385-7_14 is OK
- 10.1148/radiol.220027 is OK
- 10.5281/zenodo.3828935 is OK
- 10.48550/arXiv.2211.02701 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 1 year ago

:wave: @openjournals/bcm-eics, this paper is ready to be accepted and published.

Check final proof :point_right::page_facing_up: Download article

If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/3887, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept

Kevin-Mattheus-Moerman commented 1 year ago

@editorialbot set 0.2.9 as version

editorialbot commented 1 year ago

Done! version is now 0.2.9

Kevin-Mattheus-Moerman commented 1 year ago

@alex-golts and co-authors, I am the AEiC on this track and am here to help process the final steps toward acceptance in JOSS. Before we can proceed, below are some minor issues that require your attention:

On the paper (You can call @editorialbot generate pdf to update the draft):

A data pipeline may consist of a static_pipline and a dynamic_pipeline.

Check if static_pipline should read static_pipeline.

On the archive:

Kevin-Mattheus-Moerman commented 1 year ago

@jmschrei thanks for your help as editor here. Note the archive issues I am pointing out ☝️; in the future, these are things you can check before recommending acceptance. But no worries, we'll get it sorted out now. Thanks again for the help! πŸŽ‰

alex-golts commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left: