openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[REVIEW]: Eaglescope: an interactive visualization and cohort selection tool for biomedical data exploration. #6837

Open editorialbot opened 3 months ago

editorialbot commented 3 months ago

Submitting author: @birm (Ryan Birmingham)
Repository: https://github.com/sharmalab/eaglescope
Branch with paper.md (empty if default branch):
Version: 1.0
Editor: @csoneson
Reviewers: @flekschas, @sebastian-raubach
Archive: Pending

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae"><img src="https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae/status.svg)](https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@flekschas & @sebastian-raubach, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @csoneson know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @sebastian-raubach

📝 Checklist for @flekschas

editorialbot commented 3 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 3 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1126/scisignal.2004088 is OK
- 10.1109/VAHC.2017.8387496 is OK
- 10.1038/s41598-020-60981-9 is OK
- 10.1007/s10278-013-9622-7 is OK
- 10.1200/cci.20.00001 is OK

MISSING DOIs

- No DOI given, and none found for title: Bokeh: an interactive visualization library for mo...

INVALID DOIs

- None

editorialbot commented 3 months ago

Software report:

github.com/AlDanial/cloc v 1.90  T=1.05 s (1707.1 files/s, 238183.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JavaScript                     101           6938          22483         133168
SVG                           1567              0             24          24921
JSON                            15              3              0          23135
CSS                             44           4888            551          21120
Sass                            19            519             34           4755
LESS                            18            504             55           4636
HTML                             8             69             53            673
CSV                              1              0              0            302
Markdown                         5             33              0            151
YAML                             5             15              4             79
TeX                              1              5              0             50
Dockerfile                       1              3              0             15
Bourne Shell                     1              0              0              3
-------------------------------------------------------------------------------
SUM:                          1786          12977          23204         213008
-------------------------------------------------------------------------------

Commit count by author:

   143  Ryan Birmingham
    36  Birm
    19  Yahia Zakaria
    17  Nan Li
     9  Jasox NaN
     7  Mohamed Nasser
     3  nanli-emory
     1  ArthurMor4is
     1  Pranav
     1  dependabot[bot]

editorialbot commented 3 months ago

Paper file info:

📄 Wordcount for paper.md is 522

✅ The paper includes a Statement of need section

editorialbot commented 3 months ago

License info:

✅ License found: BSD 3-Clause "New" or "Revised" License (Valid open source OSI approved license)

editorialbot commented 3 months ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

csoneson commented 3 months ago

👋🏼 @birm, @flekschas, @sebastian-raubach - this is the review thread for the submission. All of our communications will happen here from now on.

As a reviewer, the first step is to create a checklist for your review by entering

@editorialbot generate my checklist

as the top of a new comment in this thread. These checklists contain the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. The first comment in this thread also contains links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues directly in the software repository. If you do so, please mention this thread so that a link is created (and I can keep an eye on what is happening). Please also feel free to comment and ask questions in this thread. It is often easier to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use EditorialBot (our bot) to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@csoneson) if you have any questions or concerns. Thanks!

sebastian-raubach commented 3 months ago

Review checklist for @sebastian-raubach

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

sebastian-raubach commented 3 months ago

@birm Would you be able to clarify the author contribution for me please? Looking at the contribution graphs (https://github.com/sharmalab/eaglescope/graphs/contributors) I can see major code contributions from yourself, Nan Li (with two separate accounts?) and Yahia Zakaria. I can, however, not identify Tony Pan's contribution. Additionally, there are smaller code contributions from other individuals who aren't included in the author list.

If you could shine some light on those two items, that'd be great.

@csoneson As this is my first JOSS review, could you let me know if questions of this nature are best posted on this thread or (as the initial comment suggests) on the software's main repository?

Cheers.

csoneson commented 3 months ago

Hi @sebastian-raubach - I'd say that this type of question, which is not strictly related to the functionality or implementation of the software, but rather to the JOSS submission, is better posed in this thread. But there is no strict boundary, and as long as you mention the issues posted in the software repository here, we can keep track of them.

Also, in case it's useful re: your question above, here is a link to the JOSS authorship policy.

sebastian-raubach commented 3 months ago

Added an issue asking about documentation of the tool https://github.com/sharmalab/eaglescope/issues/118

birm commented 3 months ago

> @birm Would you be able to clarify the author contribution for me please? Looking at the contribution graphs (https://github.com/sharmalab/eaglescope/graphs/contributors) I can see major code contributions from yourself, Nan Li (with two separate accounts?) and Yahia Zakaria. I can, however, not identify Tony Pan's contribution. Additionally, there are smaller code contributions from other individuals who aren't included in the author list.
>
> If you could shine some light on those two items, that'd be great.
>
> @csoneson As this is my first JOSS review, could you let me know if questions of this nature are best posted on this thread or (as the initial comment suggests) on the software's main repository?
>
> Cheers.

I don't know why Nan used two different accounts, but that's accurate. Tony Pan is involved with the Eaglescope project in an advisory capacity, especially future planning.

sebastian-raubach commented 3 months ago

@csoneson I noticed this section on web-applications in the JOSS guidelines and I'm not convinced this tool fulfils either of the two requirements. Having said that, just because they used web-technologies, this doesn't necessarily make it a web-application as it doesn't need to be hosted on the web somewhere, you can just run it locally. So I'm a bit unsure about how to proceed. This is further complicated by the fact that I cannot see any form of automated testing in place for this tool which is one of the other checkboxes to tick in the list.

@birm would you be able to confirm whether or not there are automated tests for Eaglescope?

birm commented 3 months ago

The automated tests exist but are currently quite basic: the smoke test workflow (https://github.com/sharmalab/eaglescope/blob/main/.github/workflows/smoke_test.yml) currently just checks whether the code builds properly, and there's also a code style check.

flekschas commented 2 months ago

Review checklist for @flekschas

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

birm commented 2 months ago

Just to update everyone, I'm still working on addressing the comments/issues from @sebastian-raubach but have not had much time lately. I haven't fallen asleep on these suggestions.

flekschas commented 2 months ago

Thanks for putting together this tool @birm. It looks really promising! However, I have several remarks:

  1. While the README.md contains instructions on how to set up the development environment, I can't seem to find many details explaining how to configure the dashboard. And according to https://docs.google.com/presentation/d/1zvXCeV-a8k4VercXsgFPTHqml7QwmDDu9snz6dhqIC4/edit#slide=id.g106c66ac588_0_3 the specification file can be quite complex. Moreover, the paper claims that one can "create an interactive dashboard based upon a configuration file and either an API or data file". How is one supposed to do the latter? Unless I'm missing something, the README.md does not speak to any of that. So at the moment I can't tick Example usage as I really don't know how to set up the tool myself.
  2. Similarly, while the repo contains a live demo, it's somewhat unclear to me what all the visualizations do and how they work together to solve some scientific questions. The video gets at that to some degree but it's (a) 3 years old and (b) only one minute long (and hence doesn't really cover much). There needs to be a bit more for me to check Functionality documentation.
  3. As far as I can tell, there are no tests of any kind.
  4. There are also no community guidelines.
  5. While the authors provide a statement of need, the paper is currently lacking a section on how it compares to other charting libraries. (Hence I cannot check State of the field.) Part of the issue here is also that I'm not clear on what the exact scope of this tool is. In the very first sentence the authors say "... and cohort selection tool designed for biomedical data exploration" and it appears as if this tool is designed to visualize cohort data. But later on cohort data is not mentioned anymore and instead the authors talk about "exploring large biomedical datasets". Unfortunately I don't believe that this tool works with all kinds of biomedical data. In particular, I doubt that the tool supports large datasets given that we're talking about a static HTML web app that uses SVG for rendering. (One cannot render millions or billions of data points with SVG as are commonly found in genomics or single-cell biology.) I'm not bringing this up to say the tool isn't useful but I think the paper needs to be revised to be more specific as to what types of biomedical data are actually supported. Having done that it should also be easy to relate the tool to all the other biomedical visualization tools that are out there. (Speaking from the downloaded wine dataset, my current assumption is that this tool can explore small tabular data. I'm happy to be convinced otherwise but the working demos all point to small datasets.)

In summary, this truly looks like a neat tool but it (primarily) needs documentation and the exact use case it supports should be worked out more in the paper.

PS: After digging around in the code I found out how one can change the config. The wine, demo, and collection-vis configs work but it seems that clinical-vis-config.json and vis-config.json are broken (I only see a loading spinner). It's odd though to require an end-user to dig into the source code to specify the default config that's being loaded.

birm commented 2 months ago

We've added community guidelines, improved the documentation, and added (slightly) better testing for eaglescope.

I think we still need to address points 2 and 5 @flekschas

csoneson commented 2 months ago

> @csoneson I noticed this section on web-applications in the JOSS guidelines and I'm not convinced this tool fulfils either of the two requirements. Having said that, just because they used web-technologies, this doesn't necessarily make it a web-application as it doesn't need to be hosted on the web somewhere, you can just run it locally. So I'm a bit unsure about how to proceed. This is further complicated by the fact that I cannot see any form of automated testing in place for this tool which is one of the other checkboxes to tick in the list.

@sebastian-raubach thanks for your comment and sorry for the delayed response, I was out of office for a few days. We have discussed it in the editor team and we do feel that this is technically in scope - however, as you (and @flekschas) have pointed out, implementing sufficient testing to be able to catch unexpected behaviour in a way that is as automated and comprehensive as possible will be essential.

@birm - I would also suggest providing direct links to tests, contribution guidelines and similar in the README, so that they are easy to find.

birm commented 2 months ago

I've added some more links/instructions to the readme! Thank you for the continued suggestions!

birm commented 2 months ago

@flekschas

> 5. don't believe that this tool works with all kinds of biomedical. In particular, I doubt that the tool supports large datasets given that we're talking about a static HTML web app that uses SVG for rendering.

I think this is fair. We're more interested in the flexibility of deployment than in optimizing for large data in the current form. Larger scale data via a series of data summary APIs has been on our roadmap for a bit, but we have not yet taken much action to make this happen. I've changed the paper writeup to focus a little more on tabular/cohort data rather than claiming "large".

csoneson commented 2 months ago

👋🏻 Just wanted to check in to see where things are at here. Let me know if you have any questions. Thanks!

sebastian-raubach commented 2 months ago

> 👋🏻 Just wanted to check in to see where things are at here. Let me know if you have any questions. Thanks!

Hi Charlotte, for me things are kind of stuck on the automated testing criterion. I can see that there has been a little work done on that front but ultimately, only the smoke test (does it even compile and run?) and one other very limited test (does it show the correct page title and one visualization element?) are included. There are no tests that look into the functionality of the application, e.g. is the input data loaded correctly? does the filtering work? are interactions with the charts working as expected? is the exported configuration correct? None of these are covered.

You mentioned in one of your earlier replies that the editorial team wanted to ensure that things like this are covered, which at this point they aren't.

There is also more that could be done on the documentation front. I am happy to see that the format and structure of the configuration files has been added as documentation, but there is no user-facing documentation when it comes to the use of the interface.

Finally, there is no "state of the field" section to speak of where the tool would be compared and evaluated against other tools in the area.

csoneson commented 2 months ago

Thanks for the summary @sebastian-raubach! I agree that these points are important.

birm commented 1 month ago

Thank you for the comments, I should have time to address some of these in the coming weeks! Thanks also for your patience!

flekschas commented 1 month ago

@editorialbot generate pdf

editorialbot commented 1 month ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

flekschas commented 1 month ago

I had another look at the repo and the paper and largely agree with @sebastian-raubach.

I cannot check "Functionality: Have the functional claims of the software been confirmed?" because there are no tests verifying the functionality. The smoke test is a great start but it really isn't anything other than a start. If there was a basic test for each chart type I'd say it's appropriately tested.

Similarly, while very basic documentation is now available, I still have to dig into the source code to understand how to configure each visualization. Likewise, community guidelines like a short "how to contribute" guide and issue/PR templates are not present. I checked "Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)." because the demo page does provide an example and there are more under "config". However, it'd be nice if there was some sort of description added to the demo page explaining what data one is looking at. (I know it's the classic wine dataset but other visitors might not.)

Finally, the paper edits are a step in the right direction but a comparison to other similar visualization software and papers for cohort visualization and dashboarding is still missing.

@birm could you take a look at each missing check box and ping us once you feel like you have addressed them? @csoneson I appreciate the progress the authors have made but there's quite a bit of work left to do.

birm commented 1 month ago

As a partial update, I've been working on this. I've added more tests in https://github.com/sharmalab/eaglescope/pull/127, but I suspect we could use quite a few more still.

I'm aware that I have a lot to cover here, and I appreciate the guidance and patience.

csoneson commented 1 month ago

Hi all - I just wanted to check in here again to make sure that everyone is on the same page, and not all awaiting updates from someone else 🙂 Specifically, @birm - do I interpret your comment above correctly that you are currently working on additional expansions of the tests and documentation in order to fully address the reviewers' concerns?

birm commented 1 month ago

Yes, I'm working on it, albeit slower than I hoped. Thank you for your continued patience with me!

csoneson commented 1 month ago

@birm - that's great, and no worries at all! Just wanted to check in to make sure I was up to date on where things are at.

csoneson commented 1 week ago

Hi @birm - just wanted to check in to see whether you have an estimate of the timeline for the changes you are working on (as I mentioned, this is to make sure that submissions are actively worked on - we also have the possibility to pause a submission if it is likely that revisions will take a few months or so). Thanks!