Peer Review Feedback - GitHub Issues

General

Hello, Group 17. Congratulations on your work on this heart disease predictor. Below are my comments on your project.

Data analysis review checklist

Reviewer: @lukeyf

Conflict of interest

[x] As the reviewer I confirm that I have no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the MDS code of conduct.

General checks

[x] Repository: Is the source code for this data analysis available? Is the repository well-organized and easy to navigate?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?

Comments:

The src directory contains just the four files used in the analysis pipeline. The structure is clear, and no file is nested too deeply below the project root.

Documentation

[x] Installation instructions: Is there a clearly stated list of dependencies?
[x] Example usage: Do the authors include examples of how to use the software to reproduce the data analysis?
[ ] Functionality documentation: Is the core functionality of the data analysis software documented to a satisfactory level?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support?

Comments:

Usage is properly documented; however, the instructions do not match the actual scripts in the repository (for example, the instructions refer to download_data.py, whereas src contains fetch_data.py). If your code is still under development, do not forget to update the README after each modification.
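One way to catch this kind of mismatch early is a small check that compares the script names mentioned in the README with the files actually present in src. The sketch below is only an illustration: the function names missing_scripts and check_repo are hypothetical, and it assumes the README is a file named README.md at the repository root and that scripts are referenced by their .py file names.

```python
import re
from pathlib import Path


def missing_scripts(readme_text, src_files):
    """Return script names mentioned in the README text that are absent from src_files."""
    # Match file names such as download_data.py, with or without a leading src/ path.
    mentioned = set(re.findall(r"[\w\-]+\.py\b", readme_text))
    return sorted(mentioned - set(src_files))


def check_repo(root="."):
    """Compare README.md against the .py files in src/ (assumed repository layout)."""
    root = Path(root)
    readme_text = (root / "README.md").read_text(encoding="utf-8")
    src_files = [path.name for path in (root / "src").glob("*.py")]
    return missing_scripts(readme_text, src_files)
```

Running check_repo() from the repository root would list any script that the README names but src does not contain, so it could be added as a quick manual step before tagging a release.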

Code quality

[x] Readability: Are scripts, functions, objects, etc., well named? Is it relatively easy to understand the code?
[x] Style guidelines: Does the code adhere to well known language style guides?
[x] Modularity: Is the code suitably abstracted into scripts and functions?
[x] Tests: Are there automated tests or manual steps described so that the function of the software can be verified? Are they of sufficient quality to ensure software robustness?

Comments:

Yep. Functions are well-written and well-documented. The scripts are modular with helper functions.

Reproducibility

[x] Data: Is the raw data archived somewhere? Is it accessible?
[x] Computational methods: Is all the source code required for the data analysis available?
[x] Conditions: Is there a record of the necessary conditions (software dependencies) needed to reproduce the analysis? Does there exist an easy way to obtain the computational environment needed to reproduce the analysis?
[ ] Automation: Can someone other than the authors easily reproduce the entire data analysis?

Comments:

It is clear from the source code in src which file to call. I was able to run everything up to the analysis, but when I tried to generate the report it failed with the error pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded. I am not sure whether this is specific to my machine, so if others hit a similar problem, please make a note of it.

Analysis report

[x] Authors: Does the report include a list of authors with their affiliations?
[x] What is the question: Do the authors clearly state the research question being asked?
[x] Importance: Do the authors clearly state the importance for this research question?
[x] Background: Do the authors provide sufficient background information so that readers can understand the report?
[x] Methods: Do the authors clearly describe and justify the methodology used in the data analysis? Do the authors communicate any assumptions or limitations of their methodologies?
[x] Results: Do the authors clearly communicate their findings through writing, tables and figures?
[x] Conclusions: Are the conclusions presented by the authors correct?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
[x] Writing quality: Is the writing of good quality, concise, engaging?

Comments:

The writing was coherent and concise. The EDA was not overwhelming, and the results are clear. However, I noticed that in your book.pdf one of the tables is cut off because it is too long. I suggest removing nonessential content such as the standard deviations so that the table shows only the mean test and train scores.
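As a rough illustration of the suggestion above, per-fold scores could be collapsed into a single mean per metric before the table is written, so that no standard deviation columns are produced at all. This is a minimal sketch using only the standard library; the function name mean_scores, the nested-dictionary layout, and the numbers are assumptions made for the example and are not taken from the project.

```python
from statistics import mean


def mean_scores(cv_results):
    """Reduce per-fold scores to one rounded mean per metric for each model."""
    return {
        model: {metric: round(mean(folds), 3) for metric, folds in metrics.items()}
        for model, metrics in cv_results.items()
    }


# Hypothetical per-fold scores, for illustration only (not the project's results).
example = {
    "model_a": {"train_score": [0.90, 0.92], "test_score": [0.80, 0.84]},
}

summary = mean_scores(example)
for model, metrics in summary.items():
    print(model, metrics)
```

A table built from summary has one column per metric, which should make it narrower and less likely to be cut off in the PDF.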

Estimated hours spent reviewing: 1

Review Comments:

Overall, the project is in good shape and close to completion. The scripts are very solid and the analysis is quite insightful. I raised a few points in the comments above; if you have time, please consider addressing them.

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

UBC-MDS / heart_disease_predictor

Peer Review Feedback #35
