openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[REVIEW]: Eaglescope: an interactive visualization and cohort selection tool for biomedical data exploration. #6837

Closed by editorialbot 4 days ago

editorialbot commented 5 months ago

Submitting author: @birm (Ryan Birmingham)
Repository: https://github.com/sharmalab/eaglescope
Branch with paper.md (empty if default branch):
Version: 1.1.1
Editor: @csoneson
Reviewers: @flekschas, @sebastian-raubach
Archive: 10.5281/zenodo.14040758

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae"><img src="https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae/status.svg)](https://joss.theoj.org/papers/229f0d11e01fb7316ef9da35d9e466ae)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@flekschas & @sebastian-raubach, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @csoneson know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

πŸ“ Checklist for @sebastian-raubach

πŸ“ Checklist for @flekschas

editorialbot commented 5 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 5 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1126/scisignal.2004088 is OK
- 10.1109/VAHC.2017.8387496 is OK
- 10.1038/s41598-020-60981-9 is OK
- 10.1007/s10278-013-9622-7 is OK
- 10.1200/cci.20.00001 is OK

MISSING DOIs

- No DOI given, and none found for title: Bokeh: an interactive visualization library for mo...

INVALID DOIs

- None

editorialbot commented 5 months ago

Software report:

github.com/AlDanial/cloc v 1.90  T=1.05 s (1707.1 files/s, 238183.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JavaScript                     101           6938          22483         133168
SVG                           1567              0             24          24921
JSON                            15              3              0          23135
CSS                             44           4888            551          21120
Sass                            19            519             34           4755
LESS                            18            504             55           4636
HTML                             8             69             53            673
CSV                              1              0              0            302
Markdown                         5             33              0            151
YAML                             5             15              4             79
TeX                              1              5              0             50
Dockerfile                       1              3              0             15
Bourne Shell                     1              0              0              3
-------------------------------------------------------------------------------
SUM:                          1786          12977          23204         213008
-------------------------------------------------------------------------------

Commit count by author:

   143  Ryan Birmingham
    36  Birm
    19  Yahia Zakaria
    17  Nan Li
     9  Jasox NaN
     7  Mohamed Nasser
     3  nanli-emory
     1  ArthurMor4is
     1  Pranav
     1  dependabot[bot]
editorialbot commented 5 months ago

Paper file info:

📄 Wordcount for paper.md is 522

✅ The paper includes a Statement of need section

editorialbot commented 5 months ago

License info:

✅ License found: BSD 3-Clause "New" or "Revised" License (Valid open source OSI approved license)

editorialbot commented 5 months ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

csoneson commented 5 months ago

πŸ‘‹πŸΌ @birm, @flekschas, @sebastian-raubach - this is the review thread for the submission. All of our communications will happen here from now on.

As a reviewer, the first step is to create a checklist for your review by entering

@editorialbot generate my checklist

as the top of a new comment in this thread. These checklists contain the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. The first comment in this thread also contains links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues directly in the software repository. If you do so, please mention this thread so that a link is created (and I can keep an eye on what is happening). Please also feel free to comment and ask questions in this thread. It is often easier to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use EditorialBot (our bot) to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@csoneson) if you have any questions or concerns. Thanks!

sebastian-raubach commented 5 months ago

Review checklist for @sebastian-raubach

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

sebastian-raubach commented 5 months ago

@birm Would you be able to clarify the author contribution for me please? Looking at the contribution graphs (https://github.com/sharmalab/eaglescope/graphs/contributors) I can see major code contributions from yourself, Nan Li (with two separate accounts?) and Yahia Zakaria. I can, however, not identify Tony Pan's contribution. Additionally, there are smaller code contributions from other individuals who aren't included in the author list.

If you could shine some light on those two items, that'd be great.

@csoneson As this is my first JOSS review, could you let me know if questions of this nature are best posted on this thread or (as the initial comment suggests) on the software's main repository?

Cheers.

csoneson commented 5 months ago

Hi @sebastian-raubach - I'd say that this type of question, which is not strictly related to the functionality or implementation of the software, but rather to the JOSS submission, is better posed in this thread. But there is no strict boundary, and as long as you mention the issues posted in the software repository here, we can keep track of them.

Also, in case it's useful re: your question above, here is a link to the JOSS authorship policy.

sebastian-raubach commented 5 months ago

Added an issue asking about documentation of the tool https://github.com/sharmalab/eaglescope/issues/118

birm commented 5 months ago

> @birm Would you be able to clarify the author contribution for me please? Looking at the contribution graphs (https://github.com/sharmalab/eaglescope/graphs/contributors) I can see major code contributions from yourself, Nan Li (with two separate accounts?) and Yahia Zakaria. I can, however, not identify Tony Pan's contribution. Additionally, there are smaller code contributions from other individuals who aren't included in the author list.
>
> If you could shine some light on those two items, that'd be great.
>
> @csoneson As this is my first JOSS review, could you let me know if questions of this nature are best posted on this thread or (as the initial comment suggests) on the software's main repository?
>
> Cheers.

I don't know why Nan used two different accounts, but that's accurate. Tony Pan is involved with the Eaglescope project in an advisory capacity, especially future planning.

sebastian-raubach commented 5 months ago

@csoneson I noticed this section on web-applications in the JOSS guidelines and I'm not convinced this tool fulfils either of the two requirements. Having said that, just because they used web-technologies, this doesn't necessarily make it a web-application as it doesn't need to be hosted on the web somewhere, you can just run it locally. So I'm a bit unsure about how to proceed. This is further complicated by the fact that I cannot see any form of automated testing in place for this tool which is one of the other checkboxes to tick in the list.

@birm would you be able to confirm whether or not there are automated tests for Eaglescope?

birm commented 5 months ago

The automated tests exist but are currently quite basic (triggered via https://github.com/sharmalab/eaglescope/blob/main/.github/workflows/smoke_test.yml, which currently just checks whether the code builds properly), and there's also a code style check.

flekschas commented 5 months ago

Review checklist for @flekschas

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

birm commented 5 months ago

Just to update everyone, I'm still working on addressing the comments/issues from @sebastian-raubach but have not had much time lately. I haven't fallen asleep on these suggestions.

flekschas commented 5 months ago

Thanks for putting together this tool @birm. It looks really promising! However, I have several remarks:

  1. While the README.md contains instructions on how to set up the development environment, I can't seem to find many details explaining how to configure the dashboard. And according to https://docs.google.com/presentation/d/1zvXCeV-a8k4VercXsgFPTHqml7QwmDDu9snz6dhqIC4/edit#slide=id.g106c66ac588_0_3 the specification file can be quite complex. Moreover, the paper claims that one can "create an interactive dashboard based upon a configuration file and either an API or data file". How is one supposed to do the latter? Unless I'm missing something, the README.md does not speak to any of that. So at the moment I can't tick Example usage as I really don't know how to set up the tool myself.
  2. Similarly, while the repo contains a live demo, it's somewhat unclear to me what all the visualizations do and how they work together to solve some scientific questions. The video gets at that to some degree but it's a) 3 years old and b) only one minute long (and hence doesn't really cover much). There needs to be a bit more for me to check Functionality documentation.
  3. As far as I can tell, there are no tests of any kind.
  4. There are also no community guidelines.
  5. While the authors provide a statement of need, the paper is currently lacking a section on how it compares to other charting libraries. (Hence I cannot check State of the field.) Part of the issue here is also that I'm not clear on what the exact scope of this tool is. In the very first sentence the authors say "... and cohort selection tool designed for biomedical data exploration" and it appears as if this tool is designed to visualize cohort data. But later on cohort data is not mentioned anymore and instead the authors talk about "exploring large biomedical datasets". Unfortunately I don't believe that this tool works with all kinds of biomedical data. In particular, I doubt that the tool supports large datasets given that we're talking about a static HTML web app that uses SVG for rendering. (One cannot render millions or billions of data points with SVG, as are commonly found in genomics or single-cell biology.) I'm not bringing this up to say the tool isn't useful, but I think the paper needs to be revised to be more specific as to what types of biomedical data are actually supported. Having done that, it should also be easy to relate the tool to all the other biomedical visualization tools that are out there. (Judging from the downloaded wine dataset, my current assumption is that this tool can explore small tabular data. I'm happy to be convinced otherwise, but the working demos all point to small datasets.)

In summary, this truly looks like a neat tool but it (primarily) needs documentation, and the exact use case it supports should be worked out more in the paper.

PS: After digging around in the code I found out how one can change the config. The wine, demo, and collection-vis configs work but it seems that clinical-vis-config.json and vis-config.json are broken (I only see a loading spinner). It's odd though to require an end-user to dig into the source code to specify the default config that's being loaded.

birm commented 5 months ago

We've added community guidelines, improved the documentation, and added (slightly) better testing for eaglescope.

I think we still need to address points 2 and 5 @flekschas

csoneson commented 5 months ago

> @csoneson I noticed this section on web-applications in the JOSS guidelines and I'm not convinced this tool fulfils either of the two requirements. Having said that, just because they used web-technologies, this doesn't necessarily make it a web-application as it doesn't need to be hosted on the web somewhere, you can just run it locally. So I'm a bit unsure about how to proceed. This is further complicated by the fact that I cannot see any form of automated testing in place for this tool which is one of the other checkboxes to tick in the list.

@sebastian-raubach thanks for your comment and sorry for the delayed response, I was out of office for a few days. We have discussed it in the editor team and we do feel that this is technically in scope - however, as you (and @flekschas) have pointed out, implementing sufficient testing to be able to catch unexpected behaviour in a way that is as automated and comprehensive as possible will be essential.

@birm - I would also suggest providing direct links to tests, contribution guidelines and similar in the README, so that they are easy to find.

birm commented 5 months ago

I've added some more links/instructions to the readme! Thank you for the continued suggestions!

birm commented 4 months ago

@flekschas

> 5. don't believe that this tool works with all kinds of biomedical. In particular, I doubt that the tool supports large datasets given that we're talking about a static HTML web app that uses SVG for rendering.

I think this is fair. We're more interested in the flexibility of deployment than we are in optimizing for large data in the current form. Larger scale data via a series of data summary APIs has been on our roadmap for a bit, but we have not yet taken much action to make this happen. I've changed the paper writeup to focus a little more on the tabular/cohort use case rather than claiming "large".

csoneson commented 4 months ago

πŸ‘‹πŸ» Just wanted to check in to see where things are at here. Let me know if you have any questions. Thanks!

sebastian-raubach commented 4 months ago

πŸ‘‹πŸ» Just wanted to check in to see where things are at here. Let me know if you have any questions. Thanks!

Hi Charlotte, for me things are kind of stuck on the automated testing criterion. I can see that there has been a little work done on that front, but ultimately only the smoke test (does it even compile and run?) and one other very limited test (does it show the correct page title and one visualization element?) are included. There are no tests that look into the functionality of the application, e.g. is the input data loaded correctly? does the filtering work? are interactions with the charts working as expected? is the exported configuration correct? None of these are covered.
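
A functional test of the kind asked for here could look roughly like the sketch below. Note that `applyFilters`, the filter shape, and the mock rows are all hypothetical and invented for illustration; they do not reflect Eaglescope's actual code. The point is only that the assertion depends on the data, so a broken filter would make it fail.

```javascript
// Hypothetical filter logic: NOT Eaglescope's actual implementation.
// A filter is either { field, min, max } (numeric range)
// or { field, values } (categorical selection).
function applyFilters(rows, filters) {
  return rows.filter((row) =>
    filters.every((f) => {
      const v = row[f.field];
      if (Array.isArray(f.values)) return f.values.includes(v);
      return v >= f.min && v <= f.max;
    })
  );
}

// Small mock cohort (made-up values).
const rows = [
  { id: 1, age: 34, sex: 'F' },
  { id: 2, age: 58, sex: 'M' },
  { id: 3, age: 61, sex: 'F' },
  { id: 4, age: 47, sex: 'F' },
];

// Combine a range filter and a categorical filter, as linked charts would.
const selected = applyFilters(rows, [
  { field: 'age', min: 40, max: 65 },
  { field: 'sex', values: ['F'] },
]);

console.log(selected.map((r) => r.id)); // [ 3, 4 ]
```

In a real test suite the same comparison would sit inside a test case, with the expected ids checked against the rows the application actually passes to its charts.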

You mentioned in one of your earlier replies that the editorial team wanted to ensure that things like this are covered, which at this point they aren't.

There is also more that could be done on the documentation front. I am happy to see that the format and structure of the configuration files has been added as documentation, but there is no user-facing documentation when it comes to the use of the interface.

Finally, there is no "state of the field" section to speak of where the tool would be compared and evaluated against other tools in the area.

csoneson commented 4 months ago

Thanks for the summary @sebastian-raubach! I agree that these points are important.

birm commented 4 months ago

Thank you for the comments, I should have time to address some of these in the coming weeks! Thanks also for your patience!

flekschas commented 4 months ago

@editorialbot generate pdf

editorialbot commented 4 months ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

flekschas commented 4 months ago

I had another look at the repo and the paper and largely agree with @sebastian-raubach.

I cannot check "Functionality: Have the functional claims of the software been confirmed?" because there are no tests verifying the functionality. The smoke test is a great start but it really isn't anything other than a start. If there was a basic test for each chart type I'd say it's appropriately tested.

Similarly, while very basic documentation is now available, I still have to dig into the source code to understand how to configure each visualization. Similarly, community guidelines like a short "how to contribute guide" and issue/pr templates are not present. I checked "Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)." because the demo page does provide an example and there are more under "config". However, it'd be nice if there was some sort of description added to the demo page explaining what data one is looking at. (I know it's the classic wine dataset but other visitors might not).

Finally, the paper edits are a step in the right direction but a comparison to other similar visualization software and papers for cohort visualization and dashboarding is still missing.

@birm could you take a look at each missing check box and ping us once you feel like you have addressed them? @csoneson I appreciate the progress the authors have made but there's quite a bit of work left to do.

birm commented 4 months ago

As a partial update, I've been working on this. I've added more tests in https://github.com/sharmalab/eaglescope/pull/127, but I suspect we could use quite a few more still.

I'm aware that I have a lot to cover here, and I appreciate the guidance and patience.

csoneson commented 3 months ago

Hi all - I just wanted to check in here again to make sure that everyone is on the same page, and not all awaiting updates from someone else 🙂 Specifically, @birm - do I interpret your comment above correctly that you are currently working on additional expansions of the tests and documentation in order to fully address the reviewers' concerns?

birm commented 3 months ago

Yes, I'm working on it, albeit slower than I hoped. Thank you for your continued patience with me!

csoneson commented 3 months ago

@birm - that's great, and no worries at all! Just wanted to check in to make sure I was up to date on where things are at.

csoneson commented 2 months ago

Hi @birm - just wanted to check in to see whether you have an estimate of the timeline for the changes you are working on (as I mentioned, this is to make sure that submissions are actively worked on - we also have the possibility to pause a submission if it is likely that revisions will take a few months or so). Thanks!

csoneson commented 2 months ago

Ping @birm - could you give us an update on where things stand? Thanks!

birm commented 2 months ago

Apologies, let me take this as motivation to work on this today and tomorrow.

birm commented 2 months ago

Okay, I believe I've addressed most of the comments in some form between (tests) https://github.com/sharmalab/eaglescope/pull/127 and (paper/readme) https://github.com/sharmalab/eaglescope/pull/130. Thanks for the nudge to make this happen! Looking forward to more feedback. And once again, thanks for the patience.

csoneson commented 2 months ago

Thanks @birm - @flekschas, @sebastian-raubach, when you have a moment, could you please have a look at the revised submission and let us know whether your comments are addressed?

sebastian-raubach commented 2 months ago

I'm on holiday until the 23rd but will look at it when I get back.

csoneson commented 2 months ago

Thanks @sebastian-raubach - @flekschas, please let us know when you would have a moment to look at the revised submission!

sebastian-raubach commented 2 months ago

@editorialbot generate pdf

editorialbot commented 2 months ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

sebastian-raubach commented 2 months ago

Ok, I've had a chance to look over the proposed changes. Below are some comments, suggestions and questions.

I can see you added more tests, specifically, you added tests for each chart type. I think this is a good idea. I do, however, have a question regarding what the test actually checks. You're creating a chart with some very basic mock data, then after the chart is created, you check expect(document.getElementById(mockId)).toBeInTheDocument(); which (I assume) checks whether the given chart has been added to the HTML document. My question here is: Would this check ever fail? I.e. if the chart is created with invalid data, will .toBeInTheDocument() actually fail?

Ideally, I'd like to see more extensive testing of the functionality of each component and their connectivity via the filters, but I realise that this is a very involved task.

Manuscript line 28 should be "where the criteria are known".

Apart from these, I think the modifications made to the tool documentation and the manuscript improve the overall quality significantly.

flekschas commented 1 month ago

It'd really go a long way if the authors could lay out how they think they've addressed the concerns instead of having us dig into the source code.

From my previous review:

> I cannot check "Functionality: Have the functional claims of the software been confirmed?" because there are no tests verifying the functionality. The smoke test is a great start but it really isn't anything other than a start. If there was a basic test for each chart type I'd say it's appropriately tested.

Looks like each chart type now has an existence test associated with it. That's great! 🎉 However, as @sebastian-raubach pointed out, it doesn't really test the functionality. Unless there's an unexpected runtime error, the test will always pass, if I'm not mistaken. A very simple but effective approach to verifying that the components work as expected (and produce a chart of some type) would be to turn the HTML document into an image and compare it to a static asset. That shouldn't be hard to do with a library like https://www.npmjs.com/package/html-to-image. One could also do other functionality checks, like verifying that a pie chart that should have 3 slices indeed has three SVG path elements.
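
To illustrate the difference between the two kinds of check, here is a minimal sketch that runs without a browser. `renderPie` is a hypothetical stand-in for a chart component that returns SVG markup as a string; it is not Eaglescope's API, and the helper names are made up. The existence check passes even when nothing is drawn, while the slice count does not.

```javascript
// Hypothetical stand-in for a pie chart component: NOT Eaglescope's actual API.
// Returns SVG markup with one <path> per positive-valued category.
function renderPie(id, data) {
  const total = data.reduce((sum, d) => sum + d.value, 0);
  let angle = 0;
  const paths = data
    .filter((d) => d.value > 0)
    .map((d) => {
      const start = angle;
      angle += (d.value / total) * 2 * Math.PI;
      return `<path data-label="${d.label}" d="${arc(start, angle)}"/>`;
    });
  return `<svg id="${id}">${paths.join('')}</svg>`;
}

// Path data for a unit-radius slice between two angles (radians).
function arc(a0, a1) {
  const p = (a) => `${Math.cos(a).toFixed(3)},${Math.sin(a).toFixed(3)}`;
  const large = a1 - a0 > Math.PI ? 1 : 0;
  return `M0,0 L${p(a0)} A1,1 0 ${large} 1 ${p(a1)} Z`;
}

// Existence check: only asks whether the container is there.
const containerExists = (svg, id) => svg.includes(`id="${id}"`);
// Structural check: counts the drawn slices.
const countSlices = (svg) => (svg.match(/<path\b/g) || []).length;

const good = renderPie('pie', [
  { label: 'a', value: 1 },
  { label: 'b', value: 2 },
  { label: 'c', value: 3 },
]);
const empty = renderPie('pie', []); // e.g. data failed to load

console.log(containerExists(good, 'pie'), countSlices(good));   // true 3
console.log(containerExists(empty, 'pie'), countSlices(empty)); // true 0
```

With a DOM available, the same idea would be expressed by querying the rendered chart for its path elements and asserting on their number.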

Hence, I cannot check "Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?".

> Similarly, while very basic documentation is now available, I still have to dig into the source code to understand how to configure each visualization.

The added table is acceptable documentation. Hence, I checked "Functionality documentation".

> Community guidelines like a short "how to contribute guide" and issue/pr templates are not present

No news on this front as far as I know. Hence, I cannot check "Community guidelines".

> Finally, the paper edits are a step in the right direction but a comparison to other similar visualization software and papers for cohort visualization and dashboarding is still missing.

The authors added the missing section. Hence, I checked "State of the field".

birm commented 1 month ago

Thank you both for nudging me to make the tests better; we've done https://github.com/sharmalab/eaglescope/pull/134 which doesn't test everything, but it at least checks generally that the elements expected within the visualization are present. The exception is the "ParallelCoordinates" chart which only uses a canvas; so far I'm just checking that there is indeed a canvas which isn't a very good test. A good next iteration of these tests should check how the rendered element actually looks using image comparison, but I have not implemented this yet. Currently it is almost entirely tests like "has the right number of pie chart slices/bars" and "renders the axes" etc.

We've also fixed the typo @sebastian-raubach pointed out. Thank you for your vigilance.

We've also added some basic templates and another copy of the Contributor Covenant in https://github.com/sharmalab/.github

birm commented 1 month ago

Please let me know if anything further is needed from me at this time! @sebastian-raubach @flekschas @csoneson

csoneson commented 1 month ago

@sebastian-raubach, @flekschas - could you take a look at the additional tests added above (https://github.com/sharmalab/eaglescope/pull/134) to see what you think?

@birm: regarding the community guidelines - in addition to the code of conduct (for contributors), it would be useful to have an indication in the README to guide users who would like to report an issue or seek support.

birm commented 3 weeks ago

@csoneson -- good call. I've added a little bit to the readme under "Development" about this. https://github.com/sharmalab/eaglescope/pull/141

Thanks for the suggestion!

flekschas commented 3 weeks ago

All ✅ now.

sebastian-raubach commented 3 weeks ago

Same 👍

csoneson commented 3 weeks ago

@sebastian-raubach, @flekschas - thanks a lot for your thorough and constructive reviews!

@birm - I will also take a quick look through the submission (likely early next week) and get back to you with the next steps.