Repository: https://github.com/UBC-MDS/portugal_white_wine_quality_predictor_py.git
Report link: report/portugal_white_wine_quality_predictor_report.ipynb
Abstract/executive summary:
We attempted to build a model that predicts the quality rating (on a scale of 0-10) of Portuguese white wine from the physicochemical properties of the wine, using polynomial regression with ridge regularization and hyperparameters tuned by randomized search. The model was trained on the Portugal white wine data set, which has 4898 observations. We conclude that the model does not perform well on either the training data or the unseen test data: the test score is around 0.32, with an average train score of 0.35 and an average test score of 0.27, and both the MSE (mean squared error) and the root MSE are high.
The data set used in this project consists of white vinho verde wine samples from the north of Portugal and was created by P. Cortez, A. Cerdeira, Fernando Almeida, Telmo Matos, and J. Reis (2009). It was sourced from the UC Irvine Machine Learning Repository. The data set records the physicochemical properties of each wine together with its quality rating, which we use to build the quality prediction model.
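For readers who want a concrete picture of the approach described in the abstract, here is a minimal sketch of polynomial regression with ridge regularization tuned by randomized search, written with scikit-learn. It is not the code from the repository. The synthetic data, the 11-feature shape, the search ranges, the number of iterations and the random seeds are all assumptions made for illustration. The scores it computes are R² (scikit-learn's default for regressors), which may or may not be the score quoted in the report.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data (NOT the wine data set): 11 numeric features and a
# noisy target clipped to the 0-10 rating scale. Purely an assumption.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 11))
noise = rng.normal(scale=0.7, size=500)
y = np.clip(6 + 0.5 * X[:, 0] - 0.3 * X[:, 1] ** 2 + noise, 0, 10)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)

# Scale the features, expand them into polynomial terms, then fit ridge.
pipe = make_pipeline(StandardScaler(), PolynomialFeatures(), Ridge())

# Hypothetical search space: polynomial degree and ridge penalty strength.
param_dist = {
    "polynomialfeatures__degree": [1, 2, 3],
    "ridge__alpha": loguniform(1e-3, 1e3),
}

search = RandomizedSearchCV(
    pipe,
    param_dist,
    n_iter=10,
    cv=5,
    return_train_score=True,
    random_state=123,
)
search.fit(X_train, y_train)

# Cross-validation averages for the best candidate, then held-out test metrics.
best = search.best_index_
mean_train_r2 = search.cv_results_["mean_train_score"][best]
mean_valid_r2 = search.cv_results_["mean_test_score"][best]
test_r2 = search.score(X_test, y_test)
test_mse = mean_squared_error(y_test, search.predict(X_test))
test_rmse = np.sqrt(test_mse)

print("best params:", search.best_params_)
print("mean CV train R^2:", mean_train_r2)
print("mean CV validation R^2:", mean_valid_r2)
print("test R^2:", test_r2, "MSE:", test_mse, "RMSE:", test_rmse)
```

Wrapping the scaler and the polynomial expansion in one pipeline means each cross-validation fold refits them on its own training split, so the averaged train and validation scores are not contaminated by the held-out fold.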
Editor: @Nicole-Tu97
Reviewer: jokittipong sho-i9
[x] I agree to abide by MDS's Code of Conduct during the review process and in maintaining my package should it be accepted.
Submitting authors: jokittipong Nicole-Tu97 sho-i9