Open MDSFusionist opened 7 months ago
This project exemplifies a highly standardized approach to data analysis and software development, aligning with best practices in several key areas:
The project has high code quality, documents its functionality, and follows the community guidelines. These aspects are vital for maintaining and scaling the software efficiently and for ensuring its long-term viability.
The submitted code ensures data accessibility by providing the complete computational methods, together with their details and functions. These features significantly enhance the reproducibility and reliability of the research, which are cornerstones of scientific rigor.
Reporting: The reporting is comprehensive, with clear articulation of the research questions, background, and functions, coupled with high-quality writing and complete referencing. This demonstrates an exemplary standard in code communication.
In summary, this project stands out for its comprehensive documentation, code quality, reproducibility, and thorough analysis reporting.
1.5 hours
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Overall, the project is of high quality, and I'm deeply impressed by your hard work.
2 hours
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
I really enjoyed going through this project and gained some knowledge about cardiovascular disease. The overall structure of the project is well organized. Both the figures and the models have sufficient descriptions.
Installation: I encountered an error when I ran the script EDA.py: "ValueError: Saving charts in 'png' format requires the vl-convert-python or altair_saver package: see http://github.com/altair-viz/altair_saver/"
Reference: In the Data section, the citation for Tsao201 is misspelled; it should be Tsao2015.
Visualization: Figure 5 would be clearer with a legend explaining what disease values 0 and 1 mean.
Abbreviation: The report uses a lot of terminology. It would be great to use abbreviations consistently rather than alternating between them and their full names.
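For the installation issue above, one possible guard is to check for a PNG export backend before saving, so the script either fails early with a clear message or falls back to HTML. The sketch below is only an illustration: the helper names `png_backend_available` and `choose_chart_path` are hypothetical and not part of the repository, and it assumes the two packages named in the error message are importable as `vl_convert` and `altair_saver`.

```python
import importlib.util

# Import names assumed for the packages named in the error message:
# vl-convert-python -> vl_convert, altair_saver -> altair_saver.
PNG_BACKENDS = ("vl_convert", "altair_saver")


def png_backend_available() -> bool:
    """Return True if at least one PNG export backend can be imported."""
    return any(
        importlib.util.find_spec(name) is not None for name in PNG_BACKENDS
    )


def choose_chart_path(stem: str) -> str:
    """Pick an output file name: PNG when a backend exists, else HTML.

    HTML export is used as the fallback here because it does not need
    an extra rendering package.
    """
    suffix = ".png" if png_backend_available() else ".html"
    return stem + suffix


if __name__ == "__main__":
    # "figure1" is a placeholder stem, not a file from the project.
    print(choose_chart_path("figure1"))
```

Alternatively, listing `vl-convert-python` in the environment file would let the PNG export in EDA.py work without any code change.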
Submitting authors: @sandygross Sandy Gross @MDSFusionist Doris Wang @hema2022ubc He Ma @joeywwwu Joey Wu
Repository: https://github.com/UBC-MDS/CardioPredict
Report link: https://ubc-mds.github.io/CardioPredict/heart_analysis_report.html
Abstract/executive summary: Cardiovascular disease (CVD) remains a leading cause of mortality globally, necessitating the development of accurate predictive tools for early detection and intervention. This study utilizes a practice dataset from the renowned Framingham Heart Study (FHS) comprising clinical, demographic, and behavioral variables from patients at risk of CVD. The study employed a methodological approach centered on hyperparameter optimization of the k-Nearest Neighbors (kNN) algorithm, supplemented by an oversampling technique to address class imbalances and improve model sensitivity. Despite modest levels of accuracy (0.623) and recall (0.552), our model underscores the significance of cholesterol levels and smoking habits as substantial contributors to cardiovascular disease risk, alongside established factors such as age and systolic blood pressure. These insights pave the way for future investigations into the complex interplay of causal factors, intending to refine the predictive accuracy and clinical utility of the model.
Editor: @MDSFusionist Doris Wang
Reviewers: @carrieyanyi Yan Carris, @Rachel0619 Rachel LI, @shawnhu444 Shawn Hu, @sungg888 Ruocong Sun