
[REVIEW]: pyheartlib: A Python package for processing electrocardiogram signals #5792

Closed: editorialbot closed this issue 8 months ago

editorialbot commented 1 year ago

Submitting author: @devnums (Sadegh Mohammadi)
Repository: https://github.com/devnums/pyheartlib
Branch with paper.md (empty if default branch): paper
Version: 1.22.0
Editor: @britta-wstnr
Reviewers: @cudmore, @LegrandNico, @Bsingstad
Archive: 10.5281/zenodo.10730468

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/4c9af4eef164d4647174e8a225a416e8"><img src="https://joss.theoj.org/papers/4c9af4eef164d4647174e8a225a416e8/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/4c9af4eef164d4647174e8a225a416e8/status.svg)](https://joss.theoj.org/papers/4c9af4eef164d4647174e8a225a416e8)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@cudmore & @LegrandNico, your review will be checklist-based. Each of you will have a separate checklist that you should update when carrying out your review. First of all, you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. If you have any questions or concerns, please let @britta-wstnr know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Checklists

📝 Checklist for @cudmore

📝 Checklist for @Bsingstad

📝 Checklist for @LegrandNico

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 1 year ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.06 s (1057.2 files/s, 102360.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          32            646           1323           2512
Markdown                         9            113              0            187
YAML                             6             39             23            153
Jupyter Notebook                 4              0            535            140
TeX                              1              6              0            122
TOML                             1              5              0             45
DOS Batch                        1              8              1             27
reStructuredText                 6             25             35             22
Bourne Shell                     1              5              2             10
make                             1              4              5             10
-------------------------------------------------------------------------------
SUM:                            62            851           1924           3228
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1109/51.932724 is OK
- 10.1161/01.CIR.101.23.e215 is OK
- 10.13026/9njx-6322 is OK
- 10.13026/zzpx-h016 is OK
- 10.3758/s13428-020-01516-y is OK

MISSING DOIs

- 10.1163/1574-9347_dnp_e612900 may be a valid DOI for title: Keras

INVALID DOIs

- None
editorialbot commented 1 year ago

Wordcount for paper.md is 487

britta-wstnr commented 1 year ago

Hello again! 👋 @LegrandNico and @cudmore (FYI @devnums)

This is the review thread for the paper. All of our higher-level communications will happen here from now on; review comments and discussion can happen in the repository of the project (details below).

📓 Please read the "Reviewer instructions & questions" in the comment from our editorialbot (above).
✅ All reviewers get their own checklist with the JOSS requirements - you generate them as per the details in the editorialbot comment. As you go over the submission, please check any items that you feel have been satisfied.
💻 The JOSS review is different from most other journals: the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention https://github.com/openjournals/joss-reviews/issues/5792 so that a link is created to this thread. That will also help me to keep track!
❓ Please also feel free to comment and ask questions on this thread.
🎯 We aim for the review process to be completed within about 4-6 weeks, but please make a start well ahead of this, as JOSS reviews are by their nature iterative, and any early feedback you may be able to provide to the author will be very helpful in meeting this schedule.

editorialbot commented 1 year ago

👉 📄 Download article proof | 📄 View article proof on GitHub 👈

britta-wstnr commented 1 year ago

@editorialbot add @Bsingstad as reviewer

editorialbot commented 1 year ago

@Bsingstad added to the reviewers list!

britta-wstnr commented 1 year ago

Hi @cudmore, @LegrandNico, @Bsingstad,

let me know if you have any questions or anything is unclear regarding the next steps! It would be great if you could just generate your checklist so I can see that everything is in order 😄

cudmore commented 1 year ago

Review checklist for @cudmore

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Bsingstad commented 1 year ago

Review checklist for @Bsingstad

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Bsingstad commented 1 year ago

Hi. What is meant by "please mention the link to LINK.."

britta-wstnr commented 1 year ago

Good catch @Bsingstad - that was to be replaced with this issue's number before posting. I updated the comment - thanks!

LegrandNico commented 1 year ago

Review checklist for @LegrandNico

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

LegrandNico commented 1 year ago

Hi @britta-wstnr , I have now completed my review and asked for changes in the 3 issues opened above.

devnums commented 1 year ago

Hi @LegrandNico, @britta-wstnr,

I have answered those issues, and have made some changes to the paper. Please let me know if there are any other issues.

Thank you🙏

britta-wstnr commented 1 year ago

@devnums and @LegrandNico: looks like things are going smoothly for you. I will of course leave it to @LegrandNico to say whether the comments are fully answered.

Looks like @Bsingstad is also making progress already - and how are you doing, @cudmore? Don't hesitate to contact me if you have any questions or are stuck in the process! 🙂

LegrandNico commented 1 year ago

Thank you @devnums for your work on this. I have commented on some of the issues. Currently, the paper is still missing a side-by-side comparison between the deep learning methods that can be used here and the more traditional peak detection algorithms found in other packages. I think that is the main question a potential user would have: why use a more complicated model if not for better performance/accuracy? But JOSS is not intended to host a methods paper that would go too deep into such a comparison, so I will let @britta-wstnr decide whether this is something that should be added to the paper or not.

britta-wstnr commented 1 year ago

Thanks for flagging this @LegrandNico - I will have a detailed look at this and let you know my decision.

Ping @Bsingstad and @cudmore, how are your reviews coming along?

britta-wstnr commented 1 year ago

@LegrandNico and @devnums, I looked into it, and it seems to me like the paper does motivate which gap this toolbox is filling as compared to other toolboxes (i.e., there are no other toolboxes allowing deep learning models to be utilized). Such a "statement of need" is necessary, but showing that a toolbox performs better in comparison to other toolboxes is not. As you already suspected, @LegrandNico, this is beyond the scope of the JOSS paper (but it can still be a good idea to show, e.g., in the documentation or a separate paper, as users indeed might ask this question). So I guess the question then is: @LegrandNico, do you agree that there is a statement of need/statement of relation towards existing toolboxes in the paper? Also curious what the other reviewers will say, @Bsingstad and @cudmore!

cudmore commented 1 year ago

Thank you for your excellent manuscript and important Python package.

I have opened a number of GitHub issues I would like addressed:

https://github.com/devnums/pyheartlib/issues/11

https://github.com/devnums/pyheartlib/issues/12

https://github.com/devnums/pyheartlib/issues/13

https://github.com/devnums/pyheartlib/issues/14

Here, I provide some critiques to improve your manuscript that are outside of the GitHub issues.

Topics mentioned in the manuscript that need to be added to the online documentation

As I read your manuscript, there are a number of details mentioned that I do not see in the online documentation. These details may be documented at the code level with docstrings but need to be broken out into user documentation.

I would like to see the following points expanded in the online documentation. If you incorporate these suggestions, I believe the Python package will be substantially more useful.

1) You mention the WFDB and NeuroKit Python packages, which you then potentially use to load data. What does this raw data look like? Please provide some plots. Once your code solves the problem, what do the results look like? Some plots comparing the original data with your new results would be great.

2) "For the heartbeat classification task, which typically requires segmented and annotated heartbeat waveforms, the software preprocesses the input data and delivers datasets comprising waveforms and features"

What does the input for the 'heartbeat classification task' look like? Can we see the input data and the segmented, annotated heartbeat waveforms? Can we see the "waveforms and features", etc.? What are 'waveforms' and 'features'? Please provide us with more info on these objects/classes/data structures.

3) "For the classification of signal excerpts, e.g., arrhythmia classification, the software is designed to store metadata about the excerpts in order to reduce memory usage significantly, especially in situations that each excerpt is being used only once during model training"

Can we see an example of the "classification of signal excerpts"? What does a 'signal excerpt' look like? What does the classification of a signal excerpt look like? Can you also document how the software is "designed to store metadata"? What metadata are you storing about excerpts? Why do you need to reduce memory usage about an excerpt? Why is it a problem that each excerpt is being used only once during model training?

4) "By adjusting the parameters appropriately, the package provides a substantial quantity of data samples for training deep learning models. Moreover, it is feasible to incorporate RR-intervals in addition to waveforms and their computed features."

You provide a lot of example code, but it is unclear to me where the 'parameters' are that a user can adjust 'appropriately'. How does one incorporate 'RR-intervals', 'waveforms', and their 'computed features'? Can we get a Jupyter notebook showing us how to do this?

5) "Another use case is when each excerpt has to be divided into smaller sub-segments each with a specific label, e.g., r-peak detection. The package delivers data samples by storing metadata about the excerpts and providing lists of labels as annotations for the excerpts."

I am following you here, but you need to guide the reader with more detailed Read the Docs documentation. Can we get an example where a user would 'divide each excerpt into smaller sub-segments - each with a specific label'? How does that relate to R-peak detection?

If most of my above comments are addressed, I think the final statement might be parsable, e.g. "The package delivers data samples by storing metadata about the excerpts and providing lists of labels as annotations for the excerpts."

Summary: Your manuscript, code, and documentation are excellent. I would like to see the above-mentioned topics addressed in the online documentation - basically, a number of more descriptive tutorials. Overall, very good work.

devnums commented 12 months ago

Thanks @cudmore for taking the time to review this. Your valuable comments and suggestions greatly improved the quality of the documentation.

I have modified the documentation to include the suggestions you offered in the GitHub issues.

Furthermore, in order to address your points, I have made the following changes to the documentation.

  1. Some plots and additional explanation regarding the input and output datasets have been added per your suggestion. Please refer to the Introduction section of the documentation.

  2. Additional details regarding the classes have been provided, along with a figure depicting waveforms and features. Please refer to this part of the introduction.

  3. I have included a figure that shows a signal excerpt. It also includes a diagram illustrating excerpt classification. Furthermore, metadata and its benefit have been explained. Please refer to this part.

  4. In the introduction, I have added the relevant parameters for the main classes, including the RhythmData and ECGSequence classes (link). Additional details have been added to the second example, which illustrates how to use the RhythmData and ECGSequence classes; a short usage sketch also follows below. Please note that I have changed the name of ArrhythmiaData to RhythmData; however, the former name will still work.

  5. Additional details have been added to example-3 and example-4 (R-peak detection), which demonstrate the use of the RpeakData class.

Additionally, a Getting Started section has been added to help users with different use cases.
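To give a quick feel for how these two classes fit together, here is a minimal sketch. The import path, constructor parameters, and method signatures below are simplified placeholders rather than the exact API; the documentation examples are authoritative.

# Minimal sketch of the rhythm-classification workflow. The import path,
# constructor parameters, and method names are placeholders for illustration;
# please follow the documentation examples for the exact API.
from pyheartlib.data_rhythm import RhythmData, ECGSequence  # path assumed

# Build a dataset of annotated signal excerpts from user-selected records.
rhythm_data = RhythmData(base_data_dir="data", win_size=3600)  # args assumed
annotated_records, samples_info = rhythm_data.save_samples(
    rec_list=["100", "101"],  # MIT-BIH record IDs chosen by the user
    file_name="train.rhythm",
)

# ECGSequence generates batches on the fly from the stored metadata,
# which is what keeps memory usage low during model training.
seq = ECGSequence(annotated_records, samples_info, batch_size=32)
x_batch, y_batch = seq[0]  # first batch of excerpts/features with labels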

Thank you again for your insightful suggestions, I hope I have implemented all of them. Please let me know if there is anything I missed or if there is something I can improve.

britta-wstnr commented 11 months ago

@devnums FYI: I have contacted @Bsingstad (a second time) via email in addition to pinging them here. I will look into how to proceed in case the reviewer does not reply soon.

britta-wstnr commented 11 months ago

@LegrandNico and @cudmore Thank you again for your thorough reviews. Not all check boxes in your review lists appear ticked off to me - can you give feedback to @devnums in case there are still things missing from your point of view after the last changes that @devnums made? Thanks! 🙏

devnums commented 11 months ago

> @devnums FYI: I have contacted @Bsingstad (a second time) via email in addition to pinging them here. I will look into how to proceed in case the reviewer does not reply soon.

Thank you @britta-wstnr 🙏

britta-wstnr commented 11 months ago

Hi @cudmore and @LegrandNico 👋 @devnums has responded to your reviews - may I ping you to have a look? Both of you seem to still have open check boxes in your review checklists; can you double check? It would be great if you could either give feedback on what points are still open or briefly confirm here that your checklist is complete.

Thanks a lot! 🙏

britta-wstnr commented 11 months ago

FYI, I just sent a quick reminder email. @LegrandNico has ticked all boxes - thank you!

Note: I will be out of office for winter break soon, so I might be slow to respond for a while. ❄️ ⛄

Bsingstad commented 10 months ago

Thank you, @devnums, for developing an interesting Python package.

I have created 3 GitHub issues I would like to see addressed:

In addition, there are some other aspects of your work I would also like to address:

  1. In this work you only use the MIT-BIH Arrhythmia Database, containing ECGs from 47 subjects, for training, validation, and testing alike. I would like to know more about how you partitioned the data. Do you split the data patient-wise, or can data from the same patient appear in the training, validation, and test sets? This should at least be stated in the documentation.
  2. In the processing steps you are using the WFDB package to import data and annotations, and I can see that you are using some familiar filtering techniques to remove baseline wander and noise. Can you explain whether your work contributes something novel on the processing side?
  3. As far as I understand, your model only allows one ECG lead at a time, which makes sense for many applications like wearables, etc. Have you done any experiments assessing which lead the model performs best on? I would assume the leads that usually have prominent peaks (like leads I, II, V4, V5, and V6), but it would be interesting to see such results anyway.
  4. It would be interesting to see a comparison between the results of your work and previous work (i.e., rule-based R-peak detection). Maybe there are some benchmark tasks you can apply your model to.

In general, I think your paper is very short, which explains why important information about the dataset, your contribution in relation to previous contributions, implications, etc. is missing. However, the code you provided is fairly well organized, and the aim of the software package, deep-learning-based R-peak detection, is very relevant and could potentially have a high impact in the field.

Bsingstad commented 10 months ago

> @devnums FYI: I have contacted @Bsingstad (a second time) via email in addition to pinging them here. I will look into how to proceed in case the reviewer does not reply soon.
>
> Thank you @britta-wstnr 🙏

Sorry for being impossible to reach for the last 1-2 months. I have now provided my review.

britta-wstnr commented 10 months ago

Hello everyone - just to say I am back in the office now. Looks like things are progressing - great! @Bsingstad, thank you very much for your review, it's appreciated! 🙏 @devnums, can you have a look at the points raised by @Bsingstad?

devnums commented 9 months ago

> Hello everyone - just to say I am back in the office now. Looks like things are progressing - great! @Bsingstad, thank you very much for your review, it's appreciated! 🙏 @devnums, can you have a look at the points raised by @Bsingstad?

Hello @britta-wstnr, I will answer as soon as possible. Thank you 🙏

britta-wstnr commented 9 months ago

Thanks, @devnums!

devnums commented 9 months ago

> Thank you, @devnums, for developing an interesting Python package. […]

Thank you @Bsingstad for taking the time to review this software package. I have answered the issues. Below, I provide answers to your points.

  1. Users can choose which records to use for training, validation, and testing; they can come from multiple raw datasets. Each record holds the data for one patient. When using the BeatData class, users can also create intra-patient datasets, so a record can be utilized in both the training and testing datasets.

  2. I have added functionality that allows users to use their own custom preprocessing methods (a rough sketch follows this list). Please see: https://pyheartlib.readthedocs.io/en/latest/getting-started.html#optional-preprocessing

  3. The software supports multiple channels. The provided example only uses Lead II; however, users can design different models and use different channels or multiple channels at once.

  4. That's a possibility for research in that specific area. Here, I have provided an example usage for this software.
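Regarding point 2, a user-defined preprocessing step could look roughly like the sketch below. The filter itself is standard SciPy; the registration line at the end is commented out because it is only illustrative - the linked documentation page describes the actual mechanism.

# Sketch of a custom preprocessing function for removing baseline wander.
import numpy as np
from scipy.signal import butter, filtfilt

def remove_baseline_wander(sig: np.ndarray, fs: int = 360) -> np.ndarray:
    """Zero-phase high-pass filter at 0.5 Hz (MIT-BIH records are sampled at 360 Hz)."""
    b, a = butter(2, 0.5 / (fs / 2), btype="highpass")
    return filtfilt(b, a, sig)

# Hypothetical registration (placeholder argument name) - see the linked
# documentation for the actual hook:
# rhythm_data = RhythmData(base_data_dir="data", processors=[remove_baseline_wander])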

I appreciate you taking the time to read the paper and the documentation and provide feedback.

devnums commented 9 months ago

I would like to thank you all for taking the time to review the paper. Since I have edited the paper considering your comments, I am regenerating the PDF. 🙏

editorialbot commented 9 months ago

👉 📄 Download article proof | 📄 View article proof on GitHub 👈

britta-wstnr commented 9 months ago

Hi @Bsingstad and @devnums - looks like you are making great progress in the review and revision cycle. @Bsingstad, could you please have a look and see if your review points were sufficiently worked on? Please don't forget to reflect this in your checklist as well. Thanks! 🙏

Bsingstad commented 9 months ago

> Thank you, @devnums, for developing an interesting Python package. […]
>
> Thank you @Bsingstad for taking the time to review this software package. I have answered the issues. […]

Thanks for your answers, and I am happy to see the improvements to your manuscript.

britta-wstnr commented 8 months ago

Looks like everyone concluded their reviews - all boxes checked. Thanks! 🙏 @cudmore @LegrandNico @Bsingstad

britta-wstnr commented 8 months ago

@editorialbot check references

editorialbot commented 8 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1109/51.932724 is OK
- 10.1161/01.CIR.101.23.e215 is OK
- 10.13026/9njx-6322 is OK
- 10.13026/zzpx-h016 is OK
- 10.3758/s13428-020-01516-y is OK
- 10.21105/joss.05411 is OK
- 10.1109/TBME.1985.325532 is OK
- 10.1109/CIC.2002.1166717 is OK

MISSING DOIs

- 10.1163/2214-8647_dnp_e612900 may be a valid DOI for title: Keras

INVALID DOIs

- None
britta-wstnr commented 8 months ago

Post-Review Checklist for Editor and Authors

Additional Author Tasks After Review is Complete

Editor Tasks Prior to Acceptance

britta-wstnr commented 8 months ago

@devnums can you please:

After that, I will check the paper as well, and then we can move on to the other points of the checklist.

devnums commented 8 months ago

@britta-wstnr I checked everything. I made a minor change to one of the references. The reference flagged with a MISSING DOI doesn't actually have a DOI, which is why I haven't included one. Thank you.

britta-wstnr commented 8 months ago

Thanks @devnums! I will go over the paper now to double-check. Then we can go on to the other points of the checklist.

britta-wstnr commented 8 months ago

@editorialbot generate pdf

editorialbot commented 8 months ago

👉 📄 Download article proof | 📄 View article proof on GitHub 👈

britta-wstnr commented 8 months ago

Hi @devnums,

I double-checked the paper, and I have two small suggestions that I will just paste here:

  • line 8: "for the diagnosis of various disease." --> "for the diagnosis of various diseases."
  • line 37: "especially in situations that each excerpt is being used" --> "especially in situations where each excerpt is being used"

Further, for completeness, I confirm that both TensorFlow and Keras (which editorialbot complains about regarding DOIs) are cited as per the packages' FAQs/requirements.

@devnums, if you agree please fix the typos above. Could you then please go on with these points from our post-review checklist:

  • Make a release of the software with the latest changes from the review and post the version number here. This is the version that will be used in the JOSS paper.
  • Archive the release on Zenodo/figshare/etc and post the DOI here.
  • Make sure that the title and author list (including ORCIDs) in the archive match those in the JOSS paper.
  • Make sure that the license listed for the archive is the same as the software license.

Thanks! 🙏

devnums commented 8 months ago

Hi @britta-wstnr,

Thank you for your suggestions. I have fixed the typos.

The version of the software is 1.22.0

I have archived the software on Zenodo. The DOI is: 10.5281/zenodo.10730468

Thanks again 🙏

britta-wstnr commented 8 months ago

@editorialbot check references

britta-wstnr commented 8 months ago

@editorialbot generate pdf