General checks
[x] Repository: Is the source code for this data analysis available? Is the repository well organized and easy to navigate?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
Documentation
[x] Installation instructions: Is there a clearly stated list of dependencies?
[x] Example usage: Do the authors include examples of how to use the software to reproduce the data analysis?
[x] Functionality documentation: Is the core functionality of the data analysis software documented to a satisfactory level?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) contribute to the software, 2) report issues or problems with the software, and 3) seek support?
Code quality
[x] Readability: Are scripts, functions, objects, etc., well named? Is it relatively easy to understand the code?
[x] Style guidelines: Does the code adhere to well-known language style guides?
[x] Modularity: Is the code suitably abstracted into scripts and functions?
[x] Tests: Are there automated tests or manual steps described so that the function of the software can be verified? Are they of sufficient quality to ensure software robustness?
Reproducibility
[x] Data: Is the raw data archived somewhere? Is it accessible?
[x] Computational methods: Is all the source code required for the data analysis available?
[x] Conditions: Is there a record of the necessary conditions (software dependencies) needed to reproduce the analysis? Does there exist an easy way to obtain the computational environment needed to reproduce the analysis?
[x] Automation: Can someone other than the authors easily reproduce the entire data analysis?
Analysis report
[x] Authors: Does the report include a list of authors with their affiliations?
[x] What is the question: Do the authors clearly state the research question being asked?
[x] Importance: Do the authors clearly state the importance of this research question?
[x] Background: Do the authors provide sufficient background information so that readers can understand the report?
[x] Methods: Do the authors clearly describe and justify the methodology used in the data analysis? Do the authors communicate any assumptions or limitations of their methodologies?
[x] Results: Do the authors clearly communicate their findings through writing, tables and figures?
[x] Conclusions: Are the conclusions presented by the authors correct?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
[x] Writing quality: Is the writing of good quality, concise, engaging?
Review Comments
Components that are constructed well
This team has done a great job on this project! The repository is clear and well structured, making it easy to explore all the components of the project, and the instructions in the README.md file are detailed and easy to follow. Following them, I was able to clone the repository and reproduce the analysis using the make commands, and the project built on Docker without any bugs. The report clearly states the predictive question and is comprehensive and easy to understand.
Components to improve on
While exploring the project, I came across a file called check.py and had some trouble understanding what it does. I suggest adding documentation or comments to help an external reader understand the file.
There were a few places where a .ipynb_checkpoints folder was present. Although it does not matter much, I suggest either hiding this folder with a .gitignore entry or removing it, since it causes unnecessary duplication of the .ipynb report.
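A sketch of how this could be done, demonstrated in a throwaway repository so the commands are safe to try (the file and folder paths here are illustrative, not from your project): the .gitignore entry hides future checkpoint folders, and git rm --cached untracks any copies that were already committed.

```shell
# Demonstration in a throwaway repo: ignore Jupyter checkpoint folders
# and stop tracking any copies already in version control.
demo=$(mktemp -d) && cd "$demo" && git init -q .
mkdir -p notebooks/.ipynb_checkpoints
echo '{}' > notebooks/.ipynb_checkpoints/report-checkpoint.ipynb
git add -A                                   # the checkpoint file is tracked at this point

echo '.ipynb_checkpoints/' >> .gitignore     # ignore the folder anywhere in the repo
git rm -r -q --cached --ignore-unmatch '*.ipynb_checkpoints/*'   # untrack committed copies
git add .gitignore
```

After committing, Git ignores new checkpoint folders automatically while the notebooks themselves stay tracked.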
I was quite impressed by the effort this team put into using branches; it shows that the team followed the course/project guidelines thoroughly. However, I suggest deleting branches that are bug-free and have already been merged into main, since exploring every open branch would be very time-consuming for a reader.
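One possible cleanup, sketched in a throwaway repository so it is safe to try (the branch name is illustrative, not one of yours):

```shell
# Sketch: delete local branches that are already merged into main.
# Demonstrated in a throwaway repo; "finished-feature" stands in for
# any merged, bug-free branch.
demo=$(mktemp -d) && cd "$demo" && git init -q -b main .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git branch finished-feature

# List branches merged into main, drop main itself, delete the rest
git branch --merged main | grep -v ' main$' | xargs -r git branch -d
```

The -d flag only deletes branches whose work is already merged, so nothing is lost; merged remote branches can then be removed with git push origin --delete <branch>.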
A few plots were named plot1, plot2, etc., while others had descriptive titles. This is inconsistent, and I suggest following a single naming convention to help external readers understand the project better.
Overall, great job! You have adhered to the guidelines and created a very well structured project. The suggestions above are minor changes to the repository and can be fixed quickly. I also liked that you used R Markdown to render your report, since it allows the results of R code to be inserted directly into the formatted document.
Data analysis review checklist
Reviewer: Jaskaran1116
Conflict of interest
Code of Conduct
Estimated hours spent reviewing: 1 hour 15 minutes
Attribution
This was derived from the JOSE review checklist and the rOpenSci review checklist.