Open lichunubc opened 12 months ago
The link to the report is not a PDF or HTML file. You provided a link to the ipynb file, which can be viewed, but I suggest making a GitHub page and linking the HTML file as the final report.
You have the LICENSE.md file but it does not include the Creative Commons license. (My group does not have this either, and it was not made clear how to do this, but I think it is just a matter of adding some more text to your LICENSE.md file.)
I tried cloning your repo and following the steps in your README to run the analysis but was unable to run docker compose up because there was no compose.yml file in the root directory. Maybe you misplaced it?
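One way to check for this before following the README steps is to look for the file names that Docker Compose searches for by default in the working directory. The following is a minimal Python sketch and is not part of the repository; the helper name `find_compose_file` is hypothetical.

```python
from pathlib import Path
from typing import Optional

# File names Docker Compose looks for by default in the working directory.
COMPOSE_NAMES = (
    "compose.yaml",
    "compose.yml",
    "docker-compose.yaml",
    "docker-compose.yml",
)


def find_compose_file(repo_root: str) -> Optional[Path]:
    """Return the first default-named Compose file in repo_root, or None."""
    root = Path(repo_root)
    for name in COMPOSE_NAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_compose_file(".")
    if found is None:
        print("No Compose file here; `docker compose up` needs one, or an explicit -f path.")
    else:
        print(f"Found {found}")
```

If the file lives in a subdirectory, either moving it to the root or documenting the `-f` path in the README would let reviewers run the analysis as described.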
The figures shown in the report are nice and clear. I like the EDAs that you did along with the colour scheme used to match red and white wine. However, a thing to note is that for the histograms it might be better to unstack the histograms as we are comparing values between the two groups and not the sum of the values. Another thing to note is that the figure to show the coefficients of the model would be clearer if you sorted the coefficients in either ascending or descending order. A final thing on the figures is that the figure labels do not seem to be rendering correctly (both in the ipynb preview and the local HTML file). Maybe you could look into that.
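The two plotting suggestions above can be sketched without any plotting library. The minimal Python example below counts each wine colour separately on shared bin edges (so the bars can be drawn side by side or overlaid rather than stacked) and sorts coefficients before they are plotted. The function names `group_counts` and `sort_coefficients` and all sample numbers are made up for illustration; none of them come from the report.

```python
from typing import Dict, List, Sequence, Tuple


def group_counts(
    values_by_group: Dict[str, Sequence[float]], n_bins: int
) -> Tuple[List[float], Dict[str, List[int]]]:
    """Count values per group on shared, equal-width bins."""
    all_values = [v for vals in values_by_group.values() for v in vals]
    lo, hi = min(all_values), max(all_values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts: Dict[str, List[int]] = {}
    for group, vals in values_by_group.items():
        bins = [0] * n_bins
        for v in vals:
            # The overall maximum falls on the last edge; keep it in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            bins[idx] += 1
        counts[group] = bins
    return edges, counts


def sort_coefficients(
    coefs: Dict[str, float], descending: bool = False
) -> List[Tuple[str, float]]:
    """Return (feature, coefficient) pairs ordered by coefficient value."""
    return sorted(coefs.items(), key=lambda kv: kv[1], reverse=descending)


if __name__ == "__main__":
    # Illustrative values only.
    sample = {"red": [0.0, 1.0, 2.0, 3.0], "white": [2.0, 3.0, 4.0, 4.0]}
    edges, counts = group_counts(sample, n_bins=2)
    print(edges, counts)
    print(sort_coefficients({"a": 0.5, "b": -1.0, "c": 2.0}))
```

Because both groups share the same bin edges, the per-group counts can be compared bin by bin, which is the comparison a stacked histogram obscures.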
Nit: the naming is inconsistent (some have helper_... in their name and some do not).
Overall, I learned a great deal from your analysis and your test scores are really good (compared to mine and a lot of other groups). I think you did a great job determining which features were important in determining red or white wine.
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Your project showcases a commendable example of well-organized classification, offering valuable insights. I'd like to highlight the positive aspects while suggesting areas for improvement.
Positive points:
Thorough Exploratory Data Analysis (EDA): The depth and comprehensiveness of the EDA significantly contribute to the project's quality.
High Test Score: Achieving a high test score demonstrates the efficacy of the classification approach.
Clear and Concrete Conclusion: The project effectively concludes with precise classification results.
Areas for improvement:
Link reference.bib to the notebook: Associating the reference.bib file with the notebook will improve documentation and citation practices.
Create an HTML file of the project: Generating an HTML file will facilitate easy sharing and viewing of the project.
Consider re-sorting the plots: Reordering the plots can enhance readability and comprehension, offering a more structured visualization.
Host the 'build' folder (Jupyter Book): Utilizing the build folder by hosting it would make the project more accessible and navigable.
Good: The question and report are easy to follow and the analysis is clear and concise.
The goal of the analysis and method used is easy to understand and interpret.
I like how the document has a section for adding new dependencies in case individuals trying to run the project need it.
Room for improvement: The HTML file did not show some of the plots on my computer.
I found the explanation for the tests to be vague in both the readme and the actual test scripts.
The container did not work when I tried to build it.
... helper_, and test_ or tests_, and others having a different name convention.
... docker-compose.yml file.
Submitting authors: <jinyz8888> <jcairn02> <chrisgqy> <lichunubc>
Repository: https://github.com/UBC-MDS/2023-DSCI522-Group22
Report link: https://github.com/UBC-MDS/2023-DSCI522-Group22/blob/main/report/wine_color_classification_report.ipynb
Disclaimer: The final report for our group project is currently presented as a Jupyter Notebook file and not as a published Jupyter Book HTML file. The HTML, which can be rendered locally, can be found under /report/_build/html. We are actively working to publish the rendered HTML version on the appropriate platform and consider this task a top priority for our group.
We sincerely apologize for any inconvenience this may cause and appreciate your understanding and patience as we finalize the publication process.
Abstract/executive summary: Our analysis aimed to develop a predictive model to distinguish between red and white wines based on various physicochemical properties. This study employed logistic regression, a model renowned for its balance between predictive power and interpretability. The regression result suggested that residual sugar and total sulfur dioxide had high positive coefficients, indicating a strong association with white wine, whereas density showed the most substantial negative impact, followed by alcohol and volatile acidity, suggesting these are key indicators of red wine.
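To make the sign interpretation concrete: in a binary logistic regression with white wine coded as the positive class (an assumption here, consistent with how the abstract reads positive coefficients), the predicted probability is the sigmoid of a weighted sum of the features. The sketch below uses made-up coefficients on standardized features, not the fitted values from the report, and the function name `predict_white_probability` is hypothetical. It shows that raising a feature with a positive coefficient increases P(white), while raising one with a negative coefficient decreases it.

```python
import math
from typing import Dict


def predict_white_probability(
    features: Dict[str, float], coefs: Dict[str, float], intercept: float = 0.0
) -> float:
    """P(white) = sigmoid(intercept + sum_j coef_j * x_j)."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients on standardized features (illustrative only).
coefs = {"residual sugar": 1.5, "density": -2.0}

baseline = predict_white_probability({"residual sugar": 0.0, "density": 0.0}, coefs)
more_sugar = predict_white_probability({"residual sugar": 1.0, "density": 0.0}, coefs)
more_dense = predict_white_probability({"residual sugar": 0.0, "density": 1.0}, coefs)

print(f"baseline={baseline:.3f} more_sugar={more_sugar:.3f} more_dense={more_dense:.3f}")
```

Coefficient magnitudes are only comparable across features when the features are on a common scale, which is why the sketch assumes standardized inputs.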
Editor: @lichunubc
Reviewer: @farrandi, @MoNorouzi23, Arturo Rey Hagga, Paolo De Lagrave-Codina