UBC-MDS / DSCI_522_group09_Wine_Quality_Predictor

A machine learning pipeline for classification and prediction of wine quality based on relevant features
https://ubc-mds.github.io/DSCI_522_group09_Wine_Quality_Predictor/index.html
MIT License

Milestone 1 Review #90

Closed by mohamad-amin 2 years ago

mohamad-amin commented 2 years ago
  1. Project proposal: reasoning. Although the detailed motivation is appreciated, your proposal has a few major problems, which I'll go through here:

    • How are you going to answer your question?
    • How does your method involve data science and using the related tools?
    • Are you planning on doing visualization? If so, how? What's your question there, and how do you want to address your concerns?
    • How would you evaluate your response to the question you mentioned?
  2. Exploratory data analysis in a literate code document: VIZ. Your visualizations are nice! The only suggestion I can think of is that you could also track the relationships between different features. You could also check how the interaction of two or more features affects your target.
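One way to look at both suggestions in item 2 is sketched below. This is a minimal illustration on synthetic data, not the wine dataset: the names `feature_a`, `feature_b`, and `target` are hypothetical, and the target is deliberately built from the product of the two features so that the interaction is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for two features and a target (assumption: not the wine data).
feature_a = rng.normal(10.0, 1.0, n)
feature_b = rng.normal(0.5, 0.1, n)
# The target depends on the product of the two features, i.e. an interaction.
target = feature_a * feature_b + rng.normal(0.0, 0.1, n)

columns = np.column_stack([feature_a, feature_b, feature_a * feature_b])
names = ["feature_a", "feature_b", "feature_a*feature_b"]

# Pairwise Pearson correlations between the features (columns are variables).
feature_corr = np.corrcoef(columns, rowvar=False)

# Correlation of each feature, and of the interaction term, with the target.
target_corr = {
    name: float(np.corrcoef(columns[:, i], target)[0, 1])
    for i, name in enumerate(names)
}

for name, value in target_corr.items():
    print(f"{name}: {value:.3f}")
```

In an EDA notebook, `feature_corr` would typically be drawn as a heatmap, and the interaction would be checked by plotting the target against one feature while coloring or faceting by the other.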

gfairbro commented 2 years ago

Hi @mohamad-amin, thanks for the feedback. Could you specify what we are missing below when it comes to your first two points? I feel like we address them here:

In order to answer the research question stated above, we plan to construct multiple classification machine learning models to predict the quality of wines, and to report the model with the highest score on the metric of interest. Before building any model, we plan to partition the data into a training split and a test split (70% for training and 30% for test). The next step will be to analyze the results to explore the accuracy of our model as well as the correlation between the features and the resulting quality rating.

ML model candidates include: kNN, SVM, logistic regression, decision tree. These candidates may vary, but the intention is to evaluate four.
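The plan quoted above (a 70/30 split, then a comparison of four candidate classifiers) could be sketched roughly as follows. This assumes scikit-learn, which the thread does not name, and uses synthetic data from `make_classification` in place of the wine data; the scaler, the accuracy metric, and the 5-fold setting are illustrative choices, not the group's confirmed setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the wine features and quality labels (assumption).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=0
)

# 70% training / 30% test, stratified so both splits keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The four candidate model families listed in the proposal.
candidates = {
    "kNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

# Compare candidates by cross-validated accuracy on the training split only;
# the test split stays untouched until a final model has been chosen.
cv_scores = {}
for name, model in candidates.items():
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X_train, y_train, cv=5, scoring="accuracy")
    cv_scores[name] = scores.mean()

best_name = max(cv_scores, key=cv_scores.get)
print(best_name, round(cv_scores[best_name], 3))
```

Selecting the model on cross-validation scores rather than on the test split keeps the final test score an honest estimate of performance on unseen data.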

and here:

After building a pipeline with the preprocessor and an estimator, we are going to carry out hyperparameter optimization through cross-validation. We expect to have one graph showing the metric score (e.g. accuracy) vs. hyperparameter values for each estimator we attempt. As a stretch goal, we also plan to evaluate which features contribute most to quality by using feature coefficients, which we will present as a visualization.
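A minimal sketch of the tuning step described above, again assuming scikit-learn and synthetic data. The step names `preprocessor` and `estimator`, the choice of logistic regression, and the grid of `C` values are hypothetical; the group's actual preprocessor and hyperparameters are not given in this thread.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training split (assumption: not the wine data).
X_train, y_train = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=0
)

# Pipeline of a preprocessor followed by an estimator.
pipe = Pipeline(
    [
        ("preprocessor", StandardScaler()),
        ("estimator", LogisticRegression(max_iter=1000)),
    ]
)

# Hyperparameter grid; "estimator__C" addresses C on the "estimator" step.
param_grid = {"estimator__C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validated grid search scored by accuracy.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Mean CV accuracy per hyperparameter value: the data behind a
# "metric score vs. hyperparameter value" graph.
c_values = [params["estimator__C"] for params in search.cv_results_["params"]]
mean_scores = search.cv_results_["mean_test_score"].tolist()
score_by_c = dict(zip(c_values, mean_scores))

# Coefficients of the refit best model, one per (scaled) feature, which could
# feed the stretch-goal visualization of feature contributions.
coefficients = search.best_estimator_.named_steps["estimator"].coef_[0]

print(search.best_params_, score_by_c)
```

Because the features are standardized inside the pipeline, the coefficient magnitudes are on a comparable scale, which is what makes them usable as a rough indicator of feature contribution for a linear model.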

mohamad-amin commented 2 years ago

Sorry, aren't the paragraphs that you mentioned from your report? The report is for Milestone 2, and the information that I asked for is supposed to be in your proposal. Am I missing anything here?

gfairbro commented 2 years ago

As I understood it, this feedback was for Milestone 1: https://github.com/UBC-MDS/DSCI_522_group09_Wine_Quality_Predictor/tree/0.0.1

Is this feedback for the Milestone 2 release, then? I agree that we need to get that info back into the About section for Milestone 3, but it was there in Milestone 1.

gfairbro commented 2 years ago

@mohamad-amin Thanks for the feedback; we have incorporated your suggestions.