ttimbers opened 2 years ago
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
I found the README incredibly well done! I was able to follow it easily, and it worked seamlessly on my machine. They even provided instructions on how to resolve potential conflicts with Jupyter Lab's ports. The whole project is easily reproducible, and it contains everything a person would need to understand the project, including the original dataset (in the data/raw path).
There are several problems with rendering. For example, the references are not printed; I think this is because they didn't specify which references they used, so nothing is currently being printed out. Also, in their breast_cancer_prediction.md file I could see they were trying to hide the code in the rendered file, but it isn't working.
Regarding the graphs, the box plots in the analysis don't have colour labels on them, which makes them confusing to interpret. The box plots also display oddly: in many of them the benign portion looks more like a scatter plot.
I found the style generally easy to follow, but there are several things that could be improved. For example, not all subheadings in the report are formatted correctly, and there are a few typos here and there.
Overall, they clearly put a lot of work into the project; I'm especially impressed by the reliable workflow they created.
This was derived from the JOSE review checklist and the ROpenSci review checklist.
1.) Style guidelines: I believe there is room for improvement in the style of the script files; to be more specific, commenting parts of the code to explain briefly what each chunk is doing would make it easier to understand and follow.
2.) For the data, I couldn't find the source of the raw data; in the Makefile the analysis just reads the CSV from the directory, which might not be fully transparent to some readers.
3.) In the analysis file I couldn't find the authors, so I couldn't check that box.
4.) Overall, well done; I really loved the analysis and the methodology used! I loved the use of pipelines, which makes some of the code simpler and avoids redundancy.
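To illustrate the chunk-commenting suggestion in point 1, here is a minimal sketch; the data, column names, and logic are invented for illustration and do not come from the team's actual scripts:

```python
import csv
import io

# Hypothetical stand-in for reading data/raw/*.csv; the real scripts
# read from disk, and the columns below are invented for this sketch.
raw = io.StringIO(
    "diagnosis,radius_mean\n"
    "M,17.99\n"
    "B,12.34\n"
    "M,19.02\n"
)

# --- Load the raw observations ---------------------------------------
# Each row is one tumor record; `diagnosis` is "M" (malignant) or "B" (benign).
rows = list(csv.DictReader(raw))

# --- Split the records by class before summarizing -------------------
# Keeping the two groups separate makes the later comparison explicit.
by_class = {"M": [], "B": []}
for row in rows:
    by_class[row["diagnosis"]].append(float(row["radius_mean"]))

# --- Report a per-class mean so the next reader sees the intent ------
for label, values in by_class.items():
    print(f"{label}: mean radius = {sum(values) / len(values):.2f}")
```

A one-line comment above each chunk like this lets a reader skim the script's intent without parsing every statement.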
- Overall, great job! Your report was interesting to read and easy to follow.
- The Docker instructions were easy to follow and worked well, making the project reproducible, which is a very important aspect!
The README was very well written. I was able to follow along without any hassle when running the analysis. I would, however, like to see a "Requirements" section, in particular noting the disk space needed for the Docker image, since I had to abort my docker pull midway when I ran out of room.
Within the Jupyter environment, I noticed that my index.html file didn't render properly (it just loaded the raw HTML) when I opened it in a new browser tab, and when I opened the file within Jupyter instead, I was prompted to log in. However, clicking the breast_cancer_prediction.html file rendered the actual Jupyter Book within Jupyter; I think this might be the file you were actually referring to. Either way, I would consider adding a couple of lines to your README.md to address this pain point.
The analysis is well separated and organized properly. The overall flow of the analysis was easy to follow and well documented. I especially liked how you bolded key information, like the performance of the models, and used text highlighting to differentiate function handles from the rest of the text.
In your source code I noticed that the functions are not documented. Although a descriptive naming scheme was chosen, I think that, especially for auditability, each function should include a docstring at the beginning defining what it does. ref: https://www.codingem.com/python-how-to-document-functions/#:~:text=A%20Python%20docstring%20is%20a,do%20in%20your%20projects%20too.
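As a sketch of what this could look like, here is a hypothetical helper (the function name and numpydoc-style layout are my choices, not the team's code) with the kind of docstring I mean:

```python
def standardize(values):
    """Scale a sequence of numbers to mean 0 and standard deviation 1.

    Parameters
    ----------
    values : list of float
        Raw feature measurements.

    Returns
    -------
    list of float
        The standardized measurements; an all-constant input
        returns a list of zeros instead of dividing by zero.
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the input.
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]
```

Even a one-line summary plus the return contract would let `help()` and documentation tools surface what each function does.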
Submitting authors: @edile47 @clichyclin @nhantien @ClaudioETC
Repository: https://github.com/DSCI-310/DSCI-310-Group-5
Abstract/executive summary: The project seeks to solve the prediction problem of distinguishing benign and malignant tumors, which arises from the question "Is there a way to efficiently classify whether a tumor is malignant or benign with high accuracy, given a set of different features observed from the tumor in its development stage?" This problem was addressed with a predictive model. Our initial hypothesis was that it is possible, but with a high error rate due to variation in tumor features. After performing EDA, including summary statistics, data cleaning, and visualization, we were able to spot clear distinctions between benign and malignant tumors in some features. We then tested multiple classification models and arrived at a K-Nearest-Neighbor model with tuned hyperparameters and very good accuracy, recall, precision, and F1 score.
Editor: @ttimbers
Reviewer: @TimothyZG @hmartin11 @poddarswakhar @nkoda